Image courtesy of Annie Spratt on Pixabay.
My 8-year-old nephew once asked me, “What do you do for a living?” I told him that I was a data scientist. He then asked me, “How many beakers do you have?” Besides thinking that kids say the darndest things, I quickly realized this child had absolutely no clue about what I did. Why would he?
That’s why I wrote The ABCs of Data Science: to expose the next generation to Data Science, because it will be a critical discipline for the future workforce. But also to tell a few jokes and make fun of people like myself.
So, if I had to tell someone what I do, how would I explain data science concepts to a friend not in the field? I would probably use statistics concepts like “average” or “median” to get some of the ideas across. Now, how about explaining it to my nephew? Tough, right?
The ABCs of Data Science — Letter P
It took me months (even years if you count my first rough notes) to write the ABCs of Data Science, a book filled with fun illustrations and simple definitions of Data Science concepts that young kids can understand relatively easily but that are also very relevant for adults.
Going into it, our task seemed simple: 26 letters in the alphabet, 26 definitions, and some fun illustrations to go along with them. Over time, writing the book became an iterative process (yes, I for Iteration also made it in): I wrote definitions, my friend reviewed them, I rewrote them, and so the cycle went.
This was the story of the summer until we brought in some more expert Data Scientist friends, who cemented the definitions and gave us more ideas for illustrations. Anyway, let’s get to what’s inside!
The All-Star Office Team of the ABCs of Data Science
One of my favorite parts of writing the book was creating the characters that take you through the book’s journey and really bring it to life.
Data Science Dolphin (nothing less than Wonder Woman; I am a DC Comics guy) was just waiting for her solo movie after appearing in the ABCs of Product Management (written by my editors).
We really wanted to create some new characters to assist DS Dolphin on the journey, so we created Analyst Armadillo and Z-Score Zebra to take the reader through the Data Science definitions.
Let’s look at a definition: G for Gaussian Distribution!
A Gaussian Distribution (also called a Normal Distribution) is the name for a bell-shaped curve that describes many different types of data sets. Once a Data Scientist recognizes the shape, they can make better predictions about the rest of the data!
When I got to writing G (which I skipped and then came back to), I had no words left, quite literally, and was dumbfounded by how hard it was to put this into simple words. How do we explain what a distribution is, given that D is for Data (not Distribution) in the book? Do we stress the fact that the Gaussian is a distribution, or do we explain how important the Gaussian Distribution is in all of Statistics and Data Science?
The one thing I learned writing these definitions (imagining a 9- or 10-year-old kid reading the book) is that you just have to lay it all out in simple language. I think the definition we ended up with gets the point across: it “describes many different types of data sets”, it is a Distribution, and “they can make better predictions about the rest of the data”.
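When data follows a Normal Distribution, it is usually easier for data scientists to understand certain properties of the data set, such as how much of the data sits close to the average and how unusual a given value is.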
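For the grown-ups who want to see this in action, here is a minimal Python sketch of one such property: in bell-shaped data, roughly 68% of values land within one standard deviation of the average, about 95% within two, and about 99.7% within three. The kids' heights below are simulated, not real data, and the numbers 140 and 7 as well as the helper name `share_within` are made up for illustration.

```python
import random
from statistics import NormalDist, mean, stdev


def share_within(values, k):
    """Fraction of values lying within k standard deviations of the mean."""
    mu = mean(values)
    sigma = stdev(values)
    inside = [v for v in values if abs(v - mu) <= k * sigma]
    return len(inside) / len(values)


# Simulated (hypothetical) heights in cm: bell-shaped around 140 with spread 7.
random.seed(42)
heights = [random.gauss(140, 7) for _ in range(10_000)]

# Theoretical shares come from the standard normal curve.
standard_normal = NormalDist()

for k in (1, 2, 3):
    observed = share_within(heights, k)
    expected = standard_normal.cdf(k) - standard_normal.cdf(-k)
    print(f"within {k} sd: observed {observed:.3f}, theory {expected:.3f}")
```

The observed shares should sit close to the theoretical ones, which is exactly what the book means by making better predictions once you recognize the shape.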
The book is now available on Amazon! I would also highly recommend checking out our website for an inside look into the characters and some of the fun definitions! We even did a live reading on Facebook, which will hopefully give you more background on why we wrote the book.
So why did we write the book, and why a kids’ book on Data Science? Data Science, as we know, is one of the most sought-after fields, in tech and beyond, and for great reason:
Data Science encompasses Machine Learning, AI, Mathematics, Statistics, Programming, etc., all fields that will continue to transform companies and shape our future.
For me, it is important that we start early and get kids curious as early as elementary school. Ultimately, we hope that the book might motivate your kid, cousin, niece, or nephew to pick up math or programming. With luck, it will inspire at least one of them.
About me: I am a Data Scientist and Product Manager at Komodo Health, a healthcare SaaS company. I have used analytics and machine learning at several companies to assist clients in clinical research and drug development, and I now manage and build data-backed products that provide healthcare solutions.
Feel free to reach out on LinkedIn or Instagram.
Previously published behind a paywall here.