Data has been around us forever, but ever since the day Harvard Business Review announced that ‘Data Scientist is The Sexiest Job of the 21st Century’, the demand for a new job role, Data Scientist, has surged, and HR departments across industries have been handed the tough task of recruiting ‘Data Scientists’, who are almost as hard to find as ‘The Martians’, the never-seens.

As the saying goes, ‘Make hay while the sun shines’: software engineers and engineering graduates started *pivoting* their careers to become Data Scientists so that their pay could increase 2x or 3x (and it did). Hence a new industry of ‘Bootcamps to become Data Scientist’ and ‘Paid to become Data Scientist’ started popping up all around us, leaving aspirants perplexed, and sometimes so frustrated that many of them dropped off in the early stage of the funnel.

Courses to become a data scientist. Image Courtesy: http://blog.edx.org/the-importance-of-data-science-in-the-21st-century

Hence, as a data science practitioner (supposedly), I decided to sketch out a minimalistic learning path to become a data scientist (one that also seems to have a better success rate, in my experience).

Here’s the proposed Learning Path:

1. Pick a Language, R or Python: R works very well for non-techies and Python works well for techies.

2. Understand the basics of the language that’s been picked: Data Types, Loops, Conditions, Functions.
   Introduction to R
   Introduction to Python for Data Science (MSFT Course with Datacamp)

3. Time to start with Data Analysis, the most painstaking process of Data Science, though the availability of excellent packages/modules in both R and Python makes it easier for anyone. At this stage, familiarity with RStudio / Jupyter Notebooks is appreciable.
   R: dplyr, tidyr, stringr, reshape2 (mostly Tidyverse packages)
   Python: Numpy, Pandas

4. Life is always boring without tangible results to feel good about, so it’s high time for Data Visualization.
   R: ggplot2 (the unbeaten king), rbokeh for interactive visualization
   Python: matplotlib, bokeh for interactive visualization

5. Machine Learning begins, but with Statistics to get the basics right.
   Introduction to Statistical Learning
   OpenIntro Statistics

6. Machine Learning rises, with the most frequently used Machine Learning (Supervised/Unsupervised) techniques or algorithms. Learn to build models. This is the place where Python comes in handy, because Python has only one central module, scikit-learn, while R has many for model building; nevertheless it’s easy in both.
   Linear Regression
   Logistic Regression
   Decision Trees
   KNN (K-Nearest Neighbors)
   K-Means Clustering
   Market Basket Analysis (Association Rule Mining)
   Naïve Bayes

That’s it. If you’ve successfully reached this part, you’ve become a Data Scientist (entry-level, though), and from here you can start your journey into Advanced Machine Learning (Bagging/Boosting/Ensemble techniques, Feature Engineering, Dimensionality Reduction) and reach the deep world of Deep Learning (Artificial Neural Networks, Convolutional Neural Nets and more).

While this is all about learning, learning works better when it’s tightly coupled with a practice-implementation-feedback cycle, which could be either a combination of new repos in your github + blogposts at each stage + a capstone project, or a kaggle competition/analytics hackathon.

Finally, a book worth mentioning (that doesn’t contain code) is Data Science for Business, a brilliant must-read for anyone who wants to get into this domain of Data Science.

Image Courtesy: Goodreads

More Resources:
- R for Data Science
- Datacamp Free & Paid Courses
- Data Science Specialization by JHU in R
- Introduction to Data Analysis in Python
- Data Science iPython Notebooks
- awesome-R
- Scikit-learn Video Series
- Hacker Noon: Data Science
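
To make the ‘basics of the language’ and KNN items of the learning path a little more concrete, here is a minimal sketch in plain Python (standard library only). It is not taken from any of the courses listed; the function names `euclidean` and `knn_predict` and the toy height/weight data are made up for illustration. It simply shows data types, a loop, a condition and functions working together on a tiny K-Nearest Neighbors classifier.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two equal-length numeric points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest points."""
    if k < 1 or k > len(train_points):
        raise ValueError("k must be between 1 and the number of training points")
    # Pair each distance with its label, then sort so the nearest come first.
    distances = sorted(
        (euclidean(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # most_common(1) returns [(label, count)] for the most frequent label.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: (height_cm, weight_kg) -> made-up size label.
points = [(150, 50), (155, 55), (160, 58), (175, 80), (180, 85), (185, 90)]
labels = ["small", "small", "small", "large", "large", "large"]

prediction = knn_predict(points, labels, query=(158, 57), k=3)
print(prediction)  # prints "small"
```

In practice you would rarely hand-roll this: scikit-learn offers the same idea as `KNeighborsClassifier` in `sklearn.neighbors`, but writing it once by hand is a reasonable way to check that the language basics have sunk in.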