Data has been around us forever, but ever since Harvard Business Review declared ‘Data Scientist: The Sexiest Job of the 21st Century’, demand for the new role of Data Scientist has soared, and HR departments across industries have been handed the tough task of recruiting ‘Data Scientists’, who are almost as elusive as ‘The Martians’: the never-seen.
As the saying goes, ‘Make hay while the sun shines’: software engineers and engineering graduates started *pivoting* their careers to become data scientists so that their pay could increase 2x or 3x (and it did). A new industry of ‘Bootcamps to become Data Scientist’ and ‘Paid Courses to become Data Scientist’ started popping up all around us, leaving aspirants perplexed, and sometimes so frustrated that many of them dropped off in the early stages of the funnel.
Image Courtesy: http://blog.edx.org/the-importance-of-data-science-in-the-21st-century
Hence, as a data science practitioner (supposedly), I decided to sketch out a minimalistic learning path to becoming a data scientist (one that also seems to have a better success rate, in my experience).
Here’s the proposed Learning Path:
Introduction to R (MSFT Course with Datacamp)
Introduction to Python for Data Science
Time to start with Data Analysis, the most painstaking part of Data Science, though the excellent packages/modules available in both R and Python make it easier for anyone. At this stage, familiarity with RStudio/Jupyter Notebooks is a plus.
R: dplyr, tidyr, stringr, reshape2 (mostly Tidyverse packages)
Python: NumPy, pandas
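To give a feel for what this stage looks like in practice, here is a minimal sketch on the Python side using pandas. The `orders` table, its column names and its values are all made up for illustration; the point is only the typical clean, group, summarise flow.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one row per order (values invented for illustration)
orders = pd.DataFrame({
    "city": ["Pune", "Pune", "Delhi", "Delhi", "Delhi"],
    "amount": [120.0, 80.0, 200.0, np.nan, 100.0],
})

# Step 1: drop the rows where the amount is missing
clean = orders.dropna(subset=["amount"])

# Step 2: summarise per city (total and average order amount),
# then sort so the city with the highest total comes first
summary = (
    clean.groupby("city", as_index=False)
         .agg(total=("amount", "sum"), mean=("amount", "mean"))
         .sort_values("total", ascending=False)
         .reset_index(drop=True)
)

print(summary)
```

R users would write roughly the same pipeline with dplyr verbs such as `filter`, `group_by`, `summarise` and `arrange`.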
Life is boring without tangible results to feel good about, so it’s high time for Data Visualization.
R: ggplot2 (the unbeaten king); for interactive visualization: rbokeh
Python: matplotlib; for interactive visualization: bokeh
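As a small taste of that tangible result, the sketch below draws a bar chart with matplotlib. The monthly sign-up numbers and the output file name `signups.png` are invented for the example, and the `Agg` backend is chosen only so the script also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: renders to a file, no window needed
import matplotlib.pyplot as plt

# Made-up numbers, purely for illustration
months = ["Jan", "Feb", "Mar", "Apr"]
signups = [12, 18, 9, 24]

# One figure with a single set of axes
fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(months, signups)

# Label everything: an unlabelled chart is rarely useful to anyone else
ax.set_xlabel("Month")
ax.set_ylabel("Sign-ups")
ax.set_title("Sign-ups per month (toy data)")

fig.tight_layout()
fig.savefig("signups.png")  # hypothetical output file name
```

In a Jupyter Notebook you would normally skip the backend line and the `savefig` call, since the chart is shown inline.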
Machine Learning begins, but with Statistics first, to get the basics right.
An Introduction to Statistical Learning
OpenIntro Statistics
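Simple linear regression is usually among the first models you meet when learning statistics for machine learning, so here is a hedged sketch of it using only NumPy. The data points are invented, and a real project would more likely reach for a dedicated library; the aim is just to show that the basics are a few lines of linear algebra.

```python
import numpy as np

# Invented data that roughly follows y = 2x
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Design matrix: a column of ones (intercept) next to the predictor
X = np.column_stack([np.ones_like(x), x])

# Ordinary least squares: find the coefficients that minimise ||X @ coef - y||^2
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = coef

# R^2: share of the variance in y explained by the fitted line
pred = X @ coef
ss_res = np.sum((y - pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

print(f"intercept={intercept:.3f}, slope={slope:.3f}, R^2={r2:.4f}")
```

Working through a fit like this by hand once makes the regression output of any higher-level package much easier to read.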
That’s it. If you’ve reached this point, you’ve become a Data Scientist (entry-level, though), and from here you can start your journey into Advanced Machine Learning (Bagging/Boosting/Ensemble techniques, Feature Engineering, Dimensionality Reduction) and on into the world of Deep Learning (Artificial Neural Networks, Convolutional Neural Nets and more).
While this is all about learning, learning works better when it’s tightly coupled with a practice-implementation-feedback cycle, which could be a combination of new repos on your GitHub plus blog posts at each stage, plus a capstone project or a Kaggle competition/analytics hackathon.
Finally, a book worth mentioning (one that doesn’t contain any code) is Data Science for Business, a brilliant must-read for anyone who wants to get into the domain of Data Science.
Image Courtesy: Goodreads
More Resources: