A couple of months ago I was at the AWS Summit and had a conversation with the Looker team. We talked a lot about data analytics, existing solutions, and approaches, and before I left their representative gave me a book that I want to review today. The book is called ‘Winning with Data’, and it’s great because it gives you a clear path for starting to work with data you own but probably don’t know how to use properly.
That was another reason why I enjoyed this book so much: it came right on time, because at Scentbird we’ve reached the stage where we have a lot of actionable data, and it’s important not to screw it up from the beginning. And that’s one of the problems the authors highlight at the start. To be precise, there are four significant problems with data:
These problems were the same across different organizations, which is why we can find similar solutions all over the internet:
- Google with Sawzall and Dremel
- Facebook with HiPal
- AWS Redshift

and many others.
But you don’t have to be at that scale to become data-driven. There are some examples from SMBs:
- The RealReal uses daily reports to understand what’s going on with their operations and revenue and how they should align their marketing and merchandise plans to improve them.
- ThredUp, named one of America’s Most Promising Companies, has very complex operations that require processing and cataloging thousands of items every day, so a data-driven approach helps their teams meet their KPIs, understand customer engagement, and see general trends.

And there are more examples from Zendesk, Warby Parker, HubSpot, and DonorsChoose.
So what does it take to become a data-driven company? Usually, there are some basic steps:
But even the best tool is just a tool, and you should be very careful to avoid data biases. The most common one is “survivorship bias.”
During World War II, the statistician Abraham Wald took survivorship bias into his calculations when considering how to minimize bomber losses to enemy fire. Researchers from the Center for Naval Analyses had conducted a study of the damage done to aircraft that had returned from missions and had recommended that armor be added to the areas that showed the most damage. Wald noted that the study only considered the aircraft that had survived their missions — the bombers that had been shot down were not present for the damage assessment. The holes in the returning aircraft, then, represented areas where a bomber could take damage and still return home safely. Wald proposed that the Navy instead reinforce the areas where the returning aircraft were unscathed, since those were the areas that, if hit, would cause the plane to be lost. (Wikipedia)
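To make Wald’s point a bit more concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration: the section names, the loss probabilities, and the `simulate_missions` helper are hypothetical and are not historical data or anything taken from the book. Each simulated plane takes one hit in a random section, and only the planes that return get counted in the “observed” column.

```python
import random
from collections import Counter

# Hypothetical numbers for illustration only; not historical data.
SECTIONS = ["engine", "cockpit", "fuselage", "wings"]
LOSS_PROBABILITY = {"engine": 0.8, "cockpit": 0.6, "fuselage": 0.1, "wings": 0.1}


def simulate_missions(n_planes, seed=0):
    """Return (hits on all planes, hits on returning planes) per section."""
    rng = random.Random(seed)
    all_hits = Counter()
    survivor_hits = Counter()
    for _ in range(n_planes):
        # Each plane takes exactly one hit, placed uniformly at random.
        section = rng.choice(SECTIONS)
        all_hits[section] += 1
        # The plane returns only if the hit was not fatal.
        if rng.random() >= LOSS_PROBABILITY[section]:
            survivor_hits[section] += 1  # only returning planes get inspected
    return all_hits, survivor_hits


all_hits, survivor_hits = simulate_missions(10_000)
for section in SECTIONS:
    print(f"{section:9s} all={all_hits[section]:5d} observed={survivor_hits[section]:5d}")
```

Under these assumptions the hits are spread evenly across all planes, yet the returning planes show far fewer engine hits than wing hits. Looking only at the survivors would suggest armoring the wings, which is exactly the mistake Wald warned against.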
There is an interesting case of fighting “survivorship bias” at Facebook.
One approach to avoiding this is to train your whole team in the same way, as is done at Zendesk:
When these steps are done, it finally comes to the most interesting part — asking the right questions.
https://logianalytics.com/definitiveguidetoembedded/the-future-of-embedded-analytics/
And that’s what we can find between steps 2 and 3 in the Gartner data sophistication journey. This missing step is called Exploratory Data Analysis, described by John W. Tukey almost 40 years ago. One of the most important things about exploratory data analysis is that it’s easy to tell how good your question is: just answer “What decisions would that analysis inform?” So, no matter what type of analysis you perform, a proper action should always follow it.
And last but not least — you should know how to present your data, so it’s easy to understand, it’s not boring, and it’s actionable. I plan to cover this topic in the next book review about data visualization.
“Winning with Data” is a great, interesting book that I recommend reading.
P.S.: This book has a lot of links to useful information, some of which I want to share: