Data science, often called the sexiest job of the 21st century, has become a dream job for many of us. But for some, it looks like a challenging maze, and they don’t know where to start. If you are one of them, keep reading.

In this post, I’ll discuss how you can start your data science journey from scratch. I’ll explain the following steps in detail:

1. Learn the basics of programming with Python
2. Learn basic Statistics and Mathematics
3. Learn Python for Data Analysis
4. Learn Machine Learning
5. Practice with projects

Learn the basics of programming with Python

If you come from an IT background, you are probably familiar with programming in Python, in which case you can skip this step. But if you have not yet been exposed to the fun of coding, you should start by learning Python. It is one of the easiest programming languages to learn and is widely used for software development as well as data analytics.

To begin with, you can search for free online tutorials that will help you understand the basics of Python. Here are a few places where you can learn Python on your own in a short period of time. Try them out and choose for yourself.

- learnpython.org
- Google’s Python Class (Video Tutorials)
- Estudy free Python course (With online editor to code)
- Codecademy

The list is not exhaustive, and you can find many more resources on the web that can help you start learning the basics of Python. There are also many YouTube channels with Python tutorials for beginners.

Once you are familiar with the syntax and other basics of programming, you can continue to the intermediate and advanced levels of Python. To be good at data science, I recommend that you complete at least the intermediate level, so that you are familiar with data structures and working with files in Python.

Let’s move on to the next step.

Learn basic Statistics and Mathematics

Data science is the skill of analyzing data and drawing useful, actionable insights from it.
For that, you need a working knowledge of basic statistics and mathematics. I’m not asking you to become a great statistician, but you should know enough to understand important things like the distribution of data and how algorithms work. With that in mind, let’s see what you need to learn.

First, go back through your high school statistics to refresh the fundamentals. For that, I recommend Khan Academy’s High School Stats series (optional if you are already thorough and comfortable with the material).

After brushing up on your high school concepts, you can start reading either of the following books:

- An Introduction to Statistical Learning (with R) (highly recommended)
- Think Stats (with Python)

PDF versions of both books are available online, and you can also purchase physical copies if you prefer. After reading one of these books, you will also be familiar with the fundamentals of data analysis, which will help you in the next step.

Note: Although I have asked you to learn Python to start your journey in data science, along the way you will come across several other tools, such as R, that are also used for statistical computation and data analysis. My general advice is to keep an open mind about whatever you cross paths with. The underlying logic is generally the same when you perform a task in two different languages; only the syntax and framework vary.

With that said, let’s move on to our first attempt at data analysis.

Learn Python for Data Analysis

This is where it gets interesting. Now that you know the basics of Python programming and the required statistics, it’s time to finally get your hands dirty.

If you want to learn without paying anything, create an account on Udacity and sign up for their free course, Intro to Data Analysis. This course will introduce you to useful Python libraries such as Pandas and NumPy, which are needed for data analysis.
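
As a small taste of what these two libraries look like in practice, here is a minimal sketch. The city names, prices, and unit counts are made up for illustration, and the variable names are my own, so treat it as a toy example rather than a template.

```python
import numpy as np
import pandas as pd

# Hypothetical prices, invented for this example.
prices = np.array([10.0, 20.0, 30.0])

# NumPy applies the multiplication to every element at once (no explicit loop).
doubled_prices = prices * 2

# A small, invented table of sales records as a Pandas DataFrame.
sales = pd.DataFrame({
    "city": ["Pune", "Delhi", "Pune", "Delhi"],
    "units": [3, 5, 2, 4],
})

# Group the rows by city and add up the units sold in each group.
units_per_city = sales.groupby("city")["units"].sum()

print(doubled_prices)
print(units_per_city)
```

Operating on the whole array in one expression, instead of looping over it, is the style of code these libraries encourage.
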
You can learn at your own pace and easily finish the Intro to Data Analysis course in a few weeks.

There are many other courses on Udacity for you to explore. Udacity also offers Nanodegree programs, for which you generally have to pay. If you are comfortable paying to learn, there are many other good platforms such as Coursera, Dataquest, and DataCamp.

By the end of this step, you should be familiar with important Python libraries such as Pandas and NumPy, and with data structures like Series, arrays, and DataFrames. You should also be able to perform tasks like data wrangling, drawing conclusions, vectorized operations, grouping data, and combining data from multiple files.

Although you are now nearly ready for the next step, there is still one thing left to learn before moving on, the final key to bridging the gap between analytics and machine learning: data visualization.

Data visualization is an important part of data analytics because it helps you draw conclusions and see patterns in the data, so it is imperative to learn how to visualize data. In my view, the best and simplest way to do so is to go through Kaggle’s Data Visualization course. After this, you will be familiar with an important Python library: Seaborn.

Note: Kaggle is a popular website among data scientists all over the world. It regularly runs contests that challenge participants’ skills and also provides free interactive courses to help budding data enthusiasts like you.

Great! You have come more than halfway to learning data science. Let’s move on to the next step, which is machine learning.

Learn Machine Learning

Machine learning, as the name suggests, is the process by which a machine (computer) learns on its own. It is the study of computer algorithms that improve automatically through experience. You build models mostly using predefined algorithms, chosen according to the kind of data and business problem you are facing.
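
To make the idea concrete, here is a minimal sketch of building a model with a predefined algorithm. It assumes the scikit-learn library, which this post does not name, and uses made-up numbers that follow y = 2x + 1, so treat it as an illustration rather than a recipe.

```python
from sklearn.linear_model import LinearRegression

# Made-up training data (an assumption for illustration): y = 2x + 1.
X_train = [[1], [2], [3], [4]]
y_train = [3, 5, 7, 9]

# Linear regression is one predefined algorithm; fit() is where it learns
# the relationship between the inputs and the targets.
model = LinearRegression()
model.fit(X_train, y_train)

# Ask the trained model about an input it has not seen before.
prediction = model.predict([[5]])[0]
print(f"Predicted value for x = 5: {prediction:.1f}")
```

Because the toy data follows y = 2x + 1 exactly, the prediction for x = 5 should come out very close to 11. Real data is never this tidy, which is why choosing and evaluating the algorithm matters.
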
These models train themselves on the given data and are then used to draw conclusions about new data.

The simplest way to go about learning machine learning is to go through the following courses on Kaggle in the given order:

1. Intro to Machine Learning
2. Intermediate Machine Learning (to improve your models)
3. Feature Engineering

Although there are many other ways to learn machine learning, I have mentioned the easiest one that you don’t have to pay for. If money is not a constraint for you, you can explore various courses on DataCamp, Coursera, Udacity, and other related platforms.

By the end of this step, you will understand the difference between Supervised Machine Learning and Unsupervised Machine Learning. You will also know various important techniques and algorithms such as Regression, Classification, Decision Trees, and Random Forest.

Awesome! You just cracked the maze and joined the data science club. Now all you have to do is get better and climb up the ladder.

Practice with projects

If you are still reading this post, you really have what it takes to become a successful data scientist. Once you have acquired all this knowledge, you must retain and enhance it by practicing as much as you can. To do so, find projects to work on and business problems to solve.

One of the best ways to stay in practice is to participate in Kaggle contests. Kaggle gives you the problem to be solved and the data required to work on it. If it’s a contest, you can submit your results and get a rank on the leaderboard based on your score.

You can also work on personal projects to build a portfolio of your own. You can explore datasets from the following sources:

- Kaggle Datasets
- UCI Machine Learning Repository
- Amazon Datasets
- Google’s Datasets Search Engine

To practice, I recommend that you download and install Anaconda on your local machine. It is a great toolkit for doing your data science projects.
You will find Jupyter Notebook among the tools in Anaconda; it is a great way to build Python projects and showcase them in your portfolio.

I hope the guidelines in this post help you achieve your goal of learning data science. There’s a lot to learn and even more to explore in this field. Stay tuned.

Previously published at https://towardsdatascience.com/data-science-from-scratch-4343d63c1c66
