Wiki-Talk Network Analysis

Title: Crawling Wikipedia Graph

Collaborators: Maria Mitkina (Foster Business School) and Chase Gottlich (Institute for Health Metrics and Evaluation)

Abstract:

Mining large graphs reveals structural information, and mining temporal versions of the same networks reveals how they evolve. However, running novel algorithms on these large graphs can be computationally expensive. We need methods that provide an unbiased sample that is representative of the underlying large network. In this work, we evaluate different random walks by crawling a large online editing network: Wikipedia.

Findings:

Clustering of the graph is associated with high growth in the platform.
Simple Random Walk is ineffective when sampling graphs with heavy-tailed distributions.
Re-Weighted Random Walk outperforms the other methods evaluated for graph sampling.
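
The contrast between the last two findings can be illustrated with a small sketch. It is written in Python for brevity and is not taken from the repository's R code. It assumes that "re-weighted" refers to the common correction in which each visited node is weighted by the inverse of its degree, because a simple random walk visits nodes in proportion to their degree. The toy preferential-attachment graph and the function names (build_toy_graph, simple_random_walk, naive_mean_degree, reweighted_mean_degree) are hypothetical and stand in for the Wiki-Talk data and the project's own utilities.

```python
import random
from collections import defaultdict


def build_toy_graph(n_nodes=2000, seed=0):
    """Build an undirected preferential-attachment graph (hypothetical toy data).

    Each new node attaches to two existing nodes chosen in proportion to
    their degree, which yields a heavy-tailed degree distribution.
    """
    rng = random.Random(seed)
    adj = defaultdict(set)
    endpoints = []  # every node appears once per incident edge
    for u, v in [(0, 1), (1, 2), (0, 2)]:  # seed triangle
        adj[u].add(v)
        adj[v].add(u)
        endpoints.extend([u, v])
    for new in range(3, n_nodes):
        targets = set()
        while len(targets) < 2:
            targets.add(rng.choice(endpoints))
        for t in sorted(targets):
            adj[new].add(t)
            adj[t].add(new)
            endpoints.extend([new, t])
    # Sorted neighbor lists keep the walk reproducible for a fixed seed.
    return {node: sorted(nbrs) for node, nbrs in adj.items()}


def simple_random_walk(adj, start, steps, rng):
    """Return the nodes visited by a walk that moves to a uniform random neighbor."""
    node = start
    visited = []
    for _ in range(steps):
        node = rng.choice(adj[node])
        visited.append(node)
    return visited


def naive_mean_degree(adj, visited):
    """Plain average over visited nodes; biased toward high-degree nodes."""
    return sum(len(adj[v]) for v in visited) / len(visited)


def reweighted_mean_degree(adj, visited):
    """Weight each visit by 1/degree (assumed re-weighting scheme).

    For the mean degree this reduces to the harmonic mean of the
    visited degrees: n / sum(1 / d_i).
    """
    return len(visited) / sum(1.0 / len(adj[v]) for v in visited)


graph = build_toy_graph()
walk = simple_random_walk(graph, start=0, steps=50_000, rng=random.Random(42))

true_mean = sum(len(nbrs) for nbrs in graph.values()) / len(graph)
naive = naive_mean_degree(graph, walk)
reweighted = reweighted_mean_degree(graph, walk)

print(f"true mean degree:      {true_mean:.2f}")
print(f"simple walk (naive):   {naive:.2f}")
print(f"re-weighted estimate:  {reweighted:.2f}")
```

On a graph like this, the naive average should overshoot the true mean degree because hubs are visited far more often than low-degree nodes, while the re-weighted estimate should land much closer. The same weighting idea applies to other node statistics, not only degree.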

Folders and files:

archive
final-project-crawling
utils
.DS_Store
.Rhistory
LICENSE
README.md
final_csss_proj.pdf
summary_stats.R
superusers.R
use_aggregator_example.R
use_statgrapher_example.R
