Analysing Freedom Hosting II data with Marija

Written by remco_verhoef | Published 2017/02/06
Tech Story Tags: elasticsearch | data-visualization | freedom-hosting


You have probably heard about the hack of Freedom Hosting II. Anonymous compromised its servers and, after initially selling the data for 0.1 bitcoin, published the database for free shortly afterwards.

The Verge published this about the hack:

Visitors to more than 10,000 Tor-based websites were met with an alarming announcement this morning: “Hello, Freedom Hosting II, you have been hacked.” A group affiliating itself with Anonymous had compromised servers at Freedom Hosting II, a popular service for hosting websites accessible only through Tor. Roughly six hours after the initial announcement, all the sites hosted by the service are still offline.

Read more at: http://www.theverge.com/2017/2/3/14497992/freedom-hosting-ii-hacked-anonymous-dark-web-tor

The database contains all kinds of WordPress websites, phpBB boards and other forums. There is a lot of ugly data in the databases, but because the full database is available anyway, we’ve put it online for the purpose of research and analysis. Be careful while researching, because the database is uncensored and unfiltered.

Using the tool DB2ES (https://github.com/dutchcoders/db2es) we’ve exported all MySQL databases to Elasticsearch, which lets us query and search all the data quickly. Now we can use Marija (also open source and available at https://github.com/dutchcoders/marija) to visualise the data. We’ve made the datasource available in our demo environment for your convenience; see http://demo.marija.io/.

Marija is still a little rough to use, so here is a short introduction. Enable the freedomhosting2 datasource by clicking on the eye on the right. First you need to refresh the fields using the refresh icon. All field names are prefixed with their table name. Now you can add some fields. For example, to see which users are on multiple hidden WordPress sites using different accounts or different aliases, add the following fields:

  • schema
  • wp_users.user_email
  • wp_users.display_name
  • phpbb_users.user_email
  • mybb_users.email

The schema field links every user to the originating database. The wp_users, phpbb_users and mybb_users fields link the different databases together on email address (and, through wp_users.display_name, on alias).
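To make the linking idea concrete, here is a minimal Python sketch of the principle: a value that shows up under more than one schema becomes a connection between those databases. The sample documents, the schema names (site_a, site_b, site_c) and the helper shared_values are made up for illustration; this is not Marija's actual implementation, only an approximation of what the graph expresses.

```python
from collections import defaultdict

# Hypothetical sample documents. Real ones come from the Elasticsearch export,
# where every field name is prefixed with its table name and `schema` names
# the originating database.
docs = [
    {"schema": "site_a", "wp_users.user_email": "alice@example.org",
     "wp_users.display_name": "alice"},
    {"schema": "site_b", "wp_users.user_email": "alice@example.org",
     "wp_users.display_name": "al1ce"},
    {"schema": "site_b", "phpbb_users.user_email": "bob@example.org"},
    {"schema": "site_c", "mybb_users.email": "carol@example.org"},
]

# The fields selected in the walkthrough (besides `schema` itself).
LINK_FIELDS = [
    "wp_users.user_email",
    "wp_users.display_name",
    "phpbb_users.user_email",
    "mybb_users.email",
]


def shared_values(documents, fields):
    """Return the field values that occur in more than one schema."""
    seen = defaultdict(set)
    for doc in documents:
        for field in fields:
            value = doc.get(field)
            if value:
                seen[value].add(doc["schema"])
    # Keep only values that connect at least two different databases.
    return {value: schemas for value, schemas in seen.items() if len(schemas) > 1}


links = shared_values(docs, LINK_FIELDS)
for value, schemas in links.items():
    print(value, "links", sorted(schemas))
```

In this made-up sample only the shared email address connects two databases; the two display names differ, so they stay attached to a single site each.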

When you run the query “_exists_:wp_users.user_email”, you’ll get all records that contain a wp_users.user_email field. Now you’ll see how many users some of the WordPress sites have and how they are related. You’ll see two nodes connecting two different sites, namely “[email protected]” and “herpderp420”. Both are known on two different sites: one with different aliases but the same email, the other with different emails but the same alias.
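For readers who prefer to query Elasticsearch directly, the same search can be written as a query_string query. The sketch below only builds the request body; how Marija itself sends the query is not described here, and the helper name exists_query and the size of 100 are arbitrary choices for illustration.

```python
import json


def exists_query(field, size=100):
    """Build a search body matching documents in which `field` is present."""
    return {
        "size": size,  # arbitrary page size, chosen for illustration
        "query": {
            "query_string": {
                # Same text as the query used in the walkthrough.
                "query": f"_exists_:{field}",
            }
        },
    }


body = exists_query("wp_users.user_email")
print(json.dumps(body, indent=2))
```

The resulting JSON could be posted to the _search endpoint of whichever index holds the exported data.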

When you open the table pane on the right, you can view the contents in depth. For your safety all HTML tags (such as images and scripts) are filtered out. Clicking on a plus sign adds that field as a column.

Visualisation of the wp_users

Because we are using Elasticsearch as a datasource, you can use any Elasticsearch query you can think of, such as:

  • a specific database: schema:anonysfmoe4pgsth
  • plain keywords
  • the existence of specific fields: _exists_:wp_users.user_email
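These query types can also be combined with the boolean operators of the query string syntax. The following Python sketch composes such a query as plain text; the helper names (field_is, has_field, all_of) are hypothetical and exist only for this example.

```python
def field_is(field, value):
    """Clause matching an exact phrase in a field, e.g. schema:"name"."""
    # Escape backslashes and double quotes so the phrase stays well-formed.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def has_field(field):
    """Clause matching documents in which the field is present."""
    return f"_exists_:{field}"


def all_of(*clauses):
    """Require every clause to match by joining them with AND."""
    return " AND ".join(f"({clause})" for clause in clauses)


# WordPress users of a single database.
query = all_of(
    field_is("schema", "anonysfmoe4pgsth"),
    has_field("wp_users.user_email"),
)
print(query)
```

The printed string can be used wherever a single query is accepted, which narrows the results to one site at a time.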

We’ve published the complete list of all table names (and field prefixes) on GitHub.


Published by HackerNoon on 2017/02/06