kishorevasan/wikitalk-network-analysis

Wiki-Talk Network Analysis

Title: Crawling Wikipedia Graph

Collaborators: Maria Mitkina (Foster Business School) and Chase Gottlich (Institute of Health Metrics & Evaluation)

Abstract:

Mining large graphs reveals structural information, and temporal versions of the same networks reveal how they evolve. However, running novel algorithms on these large graphs can be computationally expensive, so we need sampling methods that yield an unbiased sample representative of the underlying network. In this work, we evaluate different random walks by crawling a large online editing network: Wikipedia.

Findings:

  • Clustering in the graph is associated with high growth on the platform.
  • Simple Random Walk is ineffective when sampling graphs with a heavy-tailed degree distribution.
  • Re-Weighted Random Walk outperforms the other methods evaluated for graph sampling.
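
To illustrate the last two findings, here is a minimal sketch of why a simple random walk is biased on heavy-tailed graphs and how re-weighting can correct it. It is not the code from this repository. It assumes that "Re-Weighted Random Walk" means the common approach of weighting each visited node by the inverse of its degree, and it uses a small synthetic preferential-attachment graph as a stand-in for the Wiki-Talk data. All function names are hypothetical.

```python
import random


def preferential_attachment_graph(n, m, seed=0):
    """Synthetic heavy-tailed graph (an assumption, not the Wiki-Talk data).

    Each new node links to m distinct existing nodes, chosen with
    probability proportional to their current degree.
    """
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    targets = []  # every node appears once per incident edge
    # Seed the graph with a clique on m + 1 nodes.
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            adj[i].add(j)
            adj[j].add(i)
            targets += [i, j]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(targets))
        for t in chosen:
            adj[new].add(t)
            adj[t].add(new)
            targets += [new, t]
    # Sorted neighbor lists keep the walk reproducible for a given seed.
    return {u: sorted(vs) for u, vs in adj.items()}


def simple_random_walk(adj, start, steps, burn_in, rng):
    """Move to a uniformly random neighbor at each step; keep post-burn-in visits."""
    node = start
    sample = []
    for step in range(burn_in + steps):
        node = rng.choice(adj[node])
        if step >= burn_in:
            sample.append(node)
    return sample


def naive_mean_degree(adj, sample):
    """Unweighted average over visited nodes; over-counts high-degree nodes."""
    return sum(len(adj[v]) for v in sample) / len(sample)


def reweighted_mean_degree(adj, sample):
    """Weight each visit by 1/degree to undo the walk's degree bias.

    For the degree itself, sum(w * d) / sum(w) with w = 1/d
    simplifies to len(sample) / sum(1/d).
    """
    return len(sample) / sum(1.0 / len(adj[v]) for v in sample)


adj = preferential_attachment_graph(n=2000, m=3, seed=1)
true_mean = sum(len(vs) for vs in adj.values()) / len(adj)

rng = random.Random(2)
sample = simple_random_walk(adj, start=0, steps=20000, burn_in=1000, rng=rng)
naive = naive_mean_degree(adj, sample)
reweighted = reweighted_mean_degree(adj, sample)

print(f"true mean degree:    {true_mean:.3f}")
print(f"naive walk estimate: {naive:.3f}")
print(f"re-weighted estimate: {reweighted:.3f}")
```

The walk visits a node with probability proportional to its degree, so the naive estimate is pulled toward the hubs. Dividing that bias out with 1/degree weights is one standard correction, and other node properties can be estimated with the same weights.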

About

CSSS 567 - Statistical Analysis of Social Networks final project repo

Languages

  • Jupyter Notebook 96.3%
  • R 3.7%