TL;DR

I built a web-site where you can explore the Stack Overflow questions referenced in source code on GitHub. Check it out at http://sociting.biz

Motivation

I am a big fan of Google Cloud Platform, and I especially love its data warehouse, BigQuery. In the summer of 2016 GitHub and Google made open-source data available to everyone in BigQuery. Here are the mind-boggling numbers:

“This 3TB+ dataset comprises the largest released source of GitHub activity to date. It contains a full snapshot of the content of more than 2.8 million open source GitHub repositories including more than 145 million unique commits, over 2 billion different file paths, and the contents of the latest revision for 163 million files, all of which are searchable with regular expressions.”

Since then I have never tired of exploring this data, revealing interesting patterns or extreme samples, and publishing articles about my findings.

In December 2016 the Google team woke up my “researcher within” again: they added Stack Overflow’s history of questions and answers to the collection of public datasets on BigQuery. In practice this means that the most popular programming chat in the world can now be analyzed with the power of Google Cloud Platform. For example, one can run sentiment analysis on the Stack Overflow data and find out that Python developers post the lowest percentage of negative comments overall!

What excites me the most, though, is the ability to join the Stack Overflow data with other publicly available data sets. For example, one can try to find out whether the weather affects the probability of a Stack Overflow question being answered by using the NOAA dataset (I am actually going to conduct this research soon).

In the introduction to the Stack Overflow data availability, Felipe Hoffa provided a sample that joins GitHub and Stack Overflow data to find the most referenced Stack Overflow questions in GitHub code, specifically in JavaScript.
It gripped my attention because I noticed a couple of limitations:

* The query searches only for the stackoverflow.com/questions/([0-9]+)/ pattern in the source code. However, there are alternative forms of referencing questions: it could be the short form stackoverflow.com/q/([0-9]+)/, or it could be a direct reference to one of the answers, like stackoverflow.com/answers/([0-9]+)/
* The query deals only with JavaScript sources, but there are plenty of other programming languages.

So, I set out to build a catalog of the Stack Overflow questions referenced in GitHub sources for popular programming languages.

Getting the data

Step 1

Find the lines of code in GitHub sources that reference Stack Overflow questions or answers. The contents_top_repos_top_langs table, which keeps the contents for the top languages from the top repos, was kindly provided by Felipe Hoffa.

The result was saved in a new table called so_ref_top_repos_top_langs.

Step 2

Join the result with the Stack Overflow data. The query should handle both the question ids and the answer ids extracted from the source code.

The result contained roughly 31K records, and the next question was how to visualize them.

Building the web-site

First, I moved the resulting data to a SQLite database, creating a separate table for each programming language. Then I built the [web-site](http://sociting.biz), which lets you navigate the data by switching between languages and jump to the GitHub source code to check how the information from the questions and answers was applied in specific scenarios. I also took this opportunity to play with ASP.NET Core and implement the web-site on my MacBook Pro, without Windows being involved. The resulting application uses the cross-platform ASP.NET Core MVC Web API on the back end and React + Redux on the front end.

The source code is fully available in the GitHub repo.
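As a rough illustration of the matching described in Step 1, here is a minimal Python sketch that recognizes all three URL forms listed in the limitations and tells question references from answer references. It is not the BigQuery query used for the site: the names `SoReference` and `find_so_references` are hypothetical, the sample ids are made up, and the sketch assumes the trailing slash after the id is optional.

```python
import re
from typing import Iterator, NamedTuple


class SoReference(NamedTuple):
    """A Stack Overflow reference found in a line of source code (hypothetical type)."""
    kind: str     # "question" or "answer"
    post_id: int  # numeric id captured from the URL


# One pattern covering the three URL forms named in the article:
#   stackoverflow.com/questions/<id>, stackoverflow.com/q/<id>, stackoverflow.com/answers/<id>
# Assumption: the trailing slash after the id is not required here.
_SO_REF = re.compile(r"stackoverflow\.com/(questions|q|answers)/([0-9]+)")

# Map the URL path segment to the kind of post it points at.
_KIND_BY_SEGMENT = {"questions": "question", "q": "question", "answers": "answer"}


def find_so_references(line: str) -> Iterator[SoReference]:
    """Yield every Stack Overflow reference that appears in one line of code."""
    for match in _SO_REF.finditer(line):
        segment, post_id = match.groups()
        yield SoReference(_KIND_BY_SEGMENT[segment], int(post_id))


if __name__ == "__main__":
    # Made-up sample lines; the ids are placeholders, not real posts.
    samples = [
        "// see http://stackoverflow.com/questions/111/some-title",
        "# workaround from stackoverflow.com/q/222",
        "/* https://stackoverflow.com/answers/333/ */",
    ]
    for sample in samples:
        for ref in find_so_references(sample):
            print(ref.kind, ref.post_id)
```

Keeping the kind next to the id matters for Step 2: an answer id has to be resolved to its parent question before the reference can be counted against that question.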