Hypermapper

A hyperlink network analysis tool I built for my research on the digital footprints of community mappers.

UI of the visualization editor

Features

  • Per-site configuration files written in JavaScript, CoffeeScript, or Literate CoffeeScript
  • Ability to freeze and unfreeze in-progress crawls
  • A graph database to cache pre-scraped pages
  • A live-updating visualization editor for post-processing and other site-specific tweaks written in custom JavaScript
  • Force-directed graph visualizations powered by Cytoscape.js
  • Custom blacklists, whitelists, siteDepth, and more
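
To give a feel for what a per-site configuration might look like, here is a minimal sketch in plain JavaScript. It is not Hypermapper's actual schema: apart from `siteDepth`, every field name (`startUrl`, `whitelist`, `blacklist`) and the `shouldFollow` helper are hypothetical, chosen only to illustrate how blacklists, whitelists, and a depth limit could combine into one crawl decision.

```javascript
// Hypothetical per-site configuration. Only `siteDepth` is named in the
// feature list above; the other field names are assumptions.
const siteConfig = {
  startUrl: "https://example.org/",
  siteDepth: 2, // assumed meaning: maximum link depth to follow within the site
  whitelist: [/^https:\/\/example\.org\//], // only URLs matching one of these are followed
  blacklist: [/\/login/, /\.pdf$/], // URLs matching any of these are always skipped
};

// Hypothetical helper: decide whether a link found at `depth` should be followed.
function shouldFollow(url, depth, config) {
  if (depth > config.siteDepth) return false; // too deep
  if (config.blacklist.some((re) => re.test(url))) return false; // explicitly excluded
  return config.whitelist.some((re) => re.test(url)); // must be explicitly allowed
}
```

In this sketch the blacklist takes priority over the whitelist, so a page such as `https://example.org/login` is skipped even though it sits on a whitelisted domain. A real configuration may resolve that conflict differently.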

History

I became interested in hyperlink network analysis in 2019 while studying under Gavin Shatkin at the School of Public Policy and Urban Affairs at Northeastern University. What started as an attempt to understand the digital networks of organizations that study physical social networks (like the Asian Coalition on Housing Rights) quickly spiraled into a year-long side project evaluating the shortcomings of existing hyperlink network analysis software and eventually creating my own.

In brief, I found that variations in the layout of individual websites had a dramatic effect on the generated network graph, yet the network analysis tools available at the time almost entirely hid these variations. For example, a tool would pick up some websites' footers full of links but filter out the footers of other sites. My solution was a custom crawler with extensive configuration options and a post-crawl editor for site-specific adjustments. This gave me more confidence that I was in fact "mapping" these broad network graphs and not just uncovering differences in site layout and structure.
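
The footer problem can be illustrated with a small sketch of a post-crawl adjustment. Everything here is hypothetical and is not taken from Hypermapper itself: the edge records, the `region` field recording where on the page a link was found, and the `applySiteRules` helper are assumptions made for illustration only.

```javascript
// Hypothetical crawl output: each edge notes the page region the link came from.
const edges = [
  { source: "a.org", target: "b.org", region: "body" },
  { source: "a.org", target: "c.org", region: "footer" },
  { source: "b.org", target: "a.org", region: "body" },
  { source: "b.org", target: "c.org", region: "footer" },
];

// Hypothetical per-site rules: page regions whose links should be dropped.
const excludeRegions = { "a.org": ["footer"], "b.org": ["footer"] };

// Keep an edge unless its source site has a rule excluding its region.
function applySiteRules(edgeList, rules) {
  return edgeList.filter((e) => !(rules[e.source] || []).includes(e.region));
}

const cleaned = applySiteRules(edges, excludeRegions);
```

If the footer rule were applied to only one of the two sites, `c.org` would still receive a link from the other, and the resulting graph would reflect that inconsistency in handling more than any real difference between the sites. Making such rules explicit per site is one way to keep the treatment uniform.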