A journey from the origins of computing and data analytics to what we now call the "Modern Data Stack". What comes next?
People Mentioned
A journey from the origins of computing and data analytics to what we now call the "Modern Data Stack". What comes next?
The Origins of Computing and Data Analytics
The origins of computing and data analytics began in the mid-1950s and started taking shape with the introduction of SQL in 1970:
1954: Natural Language Processing (NLP) - “Georgetown-IBM experiment”, machine translation of Russian to English.
1960: Punch Cards
1970: Structured Query Language (SQL)
1970s: Interactive Financial Planning Systems - Create a language to “allow executives to build models without intermediaries”
1972: C, LUNAR - One of the earliest applications of modern computing, a natural language information retrieval system, helped geologists access, compare and evaluate chemical-analysis data on moon rock and soil composition
1975: Express - The first Online Analytical Processing (OLAP) system, intended to analyse business data from different points of view
1979: VisiCalc - The first spreadsheet computer program
1980s: Group Decision Support Systems - “Computerized Collaborative Work System”
The “Modern Data Stack”
The "Modern Data Stack" is a set of technologies and tools used to collect, store, process, analyze, and visualize data in a well-integrated cloud-based platform. Although QlikView was pre-cloud, it is the earliest example of what most would recognize as an analytics dashboard used by modern platforms like Tableau and PowerBI:
Paper, Query Languages, Spreadsheets, Dashboards, Search, what next?
Some of the most innovative analytics applications, at least in terms of user experience, convert human language to some computational output:
Text-to-SQL: A tale as old as time, LUNAR was first developed in the 70s to help geologists access, compare, and evaluate chemical analysis data using natural language. Salesforce WikiSQL introduced the first extensive compendium of data built for the text-to-SQL use case but only contained simple SQL queries. The Yale Spider dataset introduced a benchmark for more complex queries, and most recently, BIRD introduced real-world “dirty” queries and efficiency scores to create a proper benchmark for text-to-SQL applications.