rarblack/masterthesis

GRAPH-BASED VISUALIZATION & ANALYSIS OF AZERBAIJANI WEB


A Thesis
Presented to the Graduate Program of Computer Science and Data Analytics of the School of Information Technology and Engineering
ADA University

In Partial Fulfillment
of the Requirements for the Degree
Master of Science in Computer Science and Data Analytics
ADA University

ABSTRACT

A thorough understanding of the local Azerbaijani web space is necessary to analyze information flow patterns and influences in the local network, and to assess Azerbaijan's dependency on external sources in the event of cyber-attacks or national emergencies. Building this understanding, and making local data collection more efficient, requires a web crawler followed by graph-based analysis. The goal of this research is to construct a large graph of the Azerbaijani web and to analyze its linkages and most influential nodes. To that end, the study develops a catalog of local websites, creates a web crawler that visits each web page and its outgoing links, constructs a graph-based visualization, and applies a ranking algorithm to compute influence scores. A multiprocessing program written in Golang crawls the database of local web pages supplied by the Ministry of Communication & Information Technologies. The program consists of a master, multiple concurrent workers, and a Postgres database. In the constructed graph, nodes represent web pages and edges represent the links between them. A page ranking algorithm is implemented to measure the importance of nodes. The observations show that the graph is not very strongly connected, and that governmental web pages are the most linked-to, owing to redirections to various services.

Keywords: web crawling, graph theory, big data, page ranking, multiprocessing
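
To illustrate the ranking step mentioned in the abstract, here is a minimal sketch of a PageRank-style power iteration over a small link graph. It is an assumption that the thesis uses this formulation; the function name `PageRank`, the damping value 0.85, the iteration count, and the `*.example` page names are all hypothetical and chosen only for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// PageRank runs a fixed number of power-iteration steps over a directed
// link graph. links maps each page to the pages it links to.
// This is an illustrative sketch, not the thesis implementation.
func PageRank(links map[string][]string, damping float64, iterations int) map[string]float64 {
	// Collect every node, including pages that appear only as link targets.
	nodes := map[string]struct{}{}
	for src, outs := range links {
		nodes[src] = struct{}{}
		for _, dst := range outs {
			nodes[dst] = struct{}{}
		}
	}
	n := float64(len(nodes))
	if n == 0 {
		return map[string]float64{}
	}

	// Start from a uniform distribution.
	rank := make(map[string]float64, len(nodes))
	for node := range nodes {
		rank[node] = 1 / n
	}

	for i := 0; i < iterations; i++ {
		// Rank held by dangling pages (no outgoing links) is shared evenly.
		dangling := 0.0
		for node := range nodes {
			if len(links[node]) == 0 {
				dangling += rank[node]
			}
		}
		base := (1-damping)/n + damping*dangling/n

		next := make(map[string]float64, len(nodes))
		for node := range nodes {
			next[node] = base
		}
		// Each page passes an equal share of its rank along every outgoing link.
		for src, outs := range links {
			if len(outs) == 0 {
				continue
			}
			share := damping * rank[src] / float64(len(outs))
			for _, dst := range outs {
				next[dst] += share
			}
		}
		rank = next
	}
	return rank
}

func main() {
	// Hypothetical four-page graph: three pages link to a central portal.
	links := map[string][]string{
		"portal.example": {"news.example"},
		"news.example":   {"portal.example"},
		"shop.example":   {"portal.example"},
		"blog.example":   {"portal.example", "news.example"},
	}
	ranks := PageRank(links, 0.85, 100)

	// Print pages from most to least influential.
	pages := make([]string, 0, len(ranks))
	for p := range ranks {
		pages = append(pages, p)
	}
	sort.Slice(pages, func(i, j int) bool { return ranks[pages[i]] > ranks[pages[j]] })
	for _, p := range pages {
		fmt.Printf("%-16s %.4f\n", p, ranks[p])
	}
}
```

In this toy graph the page with the most incoming links ends up ranked highest, which mirrors the abstract's observation about heavily linked governmental pages. A crawler at the scale described would more likely read edges from the Postgres database than from an in-memory map.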

Link to the Research Paper:
