A couple of months ago I was at the AWS Summit and had a conversation with the Looker team. We talked a lot about data analytics, existing solutions, and approaches, and before I left, their representative gave me a book that I want to review today. The book is called “Winning with data”, and it’s great because it gives you a clear path to start working with data you own but probably don’t know how to use properly.

That was another reason why I enjoyed this book so much: it came right on time, because at Scentbird we’ve reached the stage where we have a lot of actionable data, and it’s important not to screw it up from the beginning. And that’s one of the problems the authors highlight at the beginning. To be precise, there are four significant problems with data:
- Data breadlines for the data-poor: a lot of departments suffer because it’s hard to access the data;
- Data obscurity: there is no easy way to understand what kind of data is right for you;
- Data fragmentation: big and medium organizations tend to create a lot of separate data sets that are not aligned;
- Data brawls: different teams treat the same data in a variety of ways.

These problems are the same across different organizations, which is why we can find similar solutions all over the internet:
- Google with Sawzall and Dremel
- Facebook with HiPal
- AWS Redshift and many others.

But you don’t have to be at that scale to become data-driven. There are some examples from SMBs:
- The RealReal uses reports on a daily basis to understand what’s going on with their operations and revenue and how they should align their marketing and merchandise plans to improve them.
- ThredUp, named one of America’s Most Promising Companies, has very complex operations that require processing and cataloging thousands of items every day, so a data-driven approach helps its teams meet their KPIs, understand customer engagement, and see general trends.

And there are more examples from Zendesk, Warby Parker, HubSpot, and DonorsChoose.

So what does it take to become a data-driven company? Usually, there are some basic steps:
- Ask the Engineer: every time your business team wants something, they ask one of the developers
- Access Raw Data: your dev team creates a simple solution that allows exporting raw data (for example, in CSV)
- Bring Your Own BI (BYOBI): an in-house solution for solving data problems (it sounds like a good solution, but it still requires the technical and business teams to work in close contact)
- Data Fabric: next-level BI that gives the business team more or less convenient access to data (like Looker, RJMetrics, Chartio, etc.)

But even the best tool is just a tool, and you should be very careful to avoid data biases. The most common one is “survivorship bias.”

During World War II, the statistician Abraham Wald took survivorship bias into his calculations when considering how to minimize bomber losses to enemy fire. Researchers from the Center for Naval Analyses had conducted a study of the damage done to aircraft that had returned from missions and had recommended that armor be added to the areas that showed the most damage. Wald noted that the study only considered the aircraft that had survived their missions; the bombers that had been shot down were not present for the damage assessment. The holes in the returning aircraft, then, represented areas where a bomber could take damage and still return home safely. Wald proposed that the Navy instead reinforce the areas where the returning aircraft were unscathed, since those were the areas that, if hit, would cause the plane to be lost. (Wikipedia)

There is an interesting case of fighting “survivorship bias” at Facebook.

One approach to avoiding this is to train your whole team in the same way, as Zendesk does:
- SQL: people learn the most basic language for asking data questions
- Data architecture: where to find all the different data sets
- Data dictionary: a review of key metrics and their definitions
- Case studies: accounts of previous problems and how they were solved
- Basic statistical concepts
- Storytelling with data: how to construct an argument with data and visualization
- Actionability: determining whether this data analysis will result in a tangible change in the way the company operates

When these steps are done, it finally comes to the most interesting part: asking the right questions.

https://logianalytics.com/definitiveguidetoembedded/the-future-of-embedded-analytics/

And that’s what we can find between steps 2 and 3 of the Gartner data sophistication journey. This missing step is called Exploratory Data Analysis, described by John W. Tukey almost 40 years ago. One of the most important things about exploratory data analysis is that it’s easy to tell how good your question is: just answer “What decisions would that analysis inform?” So, no matter what type of analysis you perform, a proper action should always follow it.

And last but not least, you should know how to present your data so that it’s easy to understand, not boring, and actionable. I plan to cover this topic in the next book review, about data visualization.

“Winning with data” is a great, interesting book that I recommend you read.
P.S.: This book has a lot of links to useful information, and I want to share some of them:
- Data Warehousing and Analytics Infrastructure at Facebook
- With Amazon Redshift SSD, querying a TB of data took less than 10 seconds
- How Businesses Make Decisions and How They Could do it Better
- How Leading Organizations Are Adopting a Data-Driven Culture
- Why Google has 200m reasons to put engineers over designers
- Why Intuit Founder Scott Cook Wants You To Stop Listening To Your Boss
- Making data mean more through storytelling
- Resonate: Present Visual Stories that Transform Audiences