Congratulations! Data Science is one of the hottest, hardest, most challenging, and most rewarding careers, and it is full of top-notch minds. Your journey is bound to be full of fun, challenges, enlightenment, and achievements (big or small). New papers are published daily or even hourly. New techniques and experiments are developed regularly. New ways of thinking become the new norm. And what seemed magical before is proven feasible.
But getting into Data Science is not easy. Far from it. The learning curve is brutal. There is so much to learn: Linear Algebra, Calculus, Statistics, Python, SQL, Machine Learning, Algorithms, Optimization, Data Wrangling, Data Visualization, Software Engineering, DevOps, … The list goes on and on.
Some people have a background in math or statistics, which will definitely help. Yet you still need a solid foundation in software engineering to be efficient and successful in your career. But this is not a problem, you say. After all, we live in an era of booming online education. There are plenty of courses, paid and free, to choose from. True, but this is precisely where the problem is. The biggest challenge for self-education these days is not a lack of educational resources, but the difficulty of finding the best or most relevant ones.
What is CS50? It is the introductory course on computer science taught at Harvard University by Professor David J. Malan. It is the largest class at Harvard, with 800 students, 102 staff, and a professional production team. It is offered both on campus and online. I’ve only taken the online version, but it’s already THE best computer science course I have come across, period. Let me tell you why:
The CS50 staff seems to know precisely what you do and do not know before each lecture (they have zero expert blindness), so the lectures never rely on anything you are not yet familiar with. The course smoothly guides you through the key concepts of computer science and makes them seem obvious. It raises questions from time to time and later answers them with a more in-depth explanation of the concepts. You’ll have plenty of ‘a-ha’ moments, and it almost feels like watching a suspense movie.
The course covers most of the critical computer science elements: C, Python, Data Structures, Algorithms, Software Engineering, Resource Management, Web Development, etc. It goes deep enough for you to understand all the essential concepts while also showing you where to look if you want to dig deeper.
CS50 has many ways to teach and keep you engaged. You’ll play a game to understand different sorting algorithms, receive a rubber duck to experience the famous rubber duck debugging, watch experiments with an ‘array of lights 🚥’ to learn data structures, and even eat a delicious breakfast 🍞 while exploring the idea of pseudo-code. (One of my favorites is where David J. Malan uses a Yellow Pages phone book to explain binary search, tears the book in half, and throws one half away. A defining moment in CS50 indeed.)
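The phone book demonstration boils down to binary search: open the sorted book in the middle, throw away the half that cannot contain the name, and repeat. Here is a minimal Python sketch of that idea. The function name and the short list of names standing in for the phone book are made up for illustration and are not taken from the course materials.

```python
def binary_search(names, target):
    """Return the index of target in the sorted list names, or -1 if absent."""
    low, high = 0, len(names) - 1
    while low <= high:
        mid = (low + high) // 2      # open the "book" in the middle
        if names[mid] == target:
            return mid               # found the name
        if names[mid] < target:
            low = mid + 1            # target is later: discard the left half
        else:
            high = mid - 1           # target is earlier: discard the right half
    return -1                        # name is not in the book


# Hypothetical, alphabetically sorted "phone book"
phone_book = ["Alice", "Bob", "Carol", "David", "Eve", "Frank", "Grace"]
print(binary_search(phone_book, "Eve"))  # prints 4
```

Because each step halves the remaining pages, even a book with a thousand entries needs only about ten comparisons, which is why the list must be sorted first.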
The learning experience is so much fun that time flies by without you noticing. Some of the problem sets are quite challenging, yet not impossible, and you’ll feel so proud of yourself once you crack them. You’ll probably fall in love with the joy of problem-solving. If you get stuck, there is an online community on almost every social network platform (Twitter, Reddit, Stack Exchange, Facebook, etc.) where you can get help.
There are puzzle days, office hours, CS50 Fairs, and the final project ‘All-nighter’ hackathon (free breakfast at IHOP if you stay up all night): lots of activities designed to get you familiar with the ‘developer culture’ and better prepare you for the software engineering world.
How great can a computer science course be if its staff doesn’t use software tools they developed themselves? Over the years, CS50’s staff has developed a series of tools to help students write code, submit homework, check their code quality and syntax, tidy up code style, and even generate color-coded code documentation in PDF form! These are all neat and useful ‘training wheels’, as David J. Malan puts it, and will help you get up to speed.
So essentially everything taught in the course is useful to you in some way, and the foundation it helps you build will go a long way.
Once you finish the course, you’ll be more knowledgeable and confident as you continue your Data Science journey. Here are a few possible directions to take from there:
CS50’s Web Programming with Python and JavaScript
Taught by the talented TF Brian Yu, it covers the most relevant and modern web programming tools, such as CSS, JavaScript, React, and Flask/Django.
Jeremy Howard’s Fast.ai course to Start a ‘Top-down’ Approach for ML
Fast.ai is fantastic and unique. It enables you to build state-of-the-art deep learning models within the first lesson, in fewer than ten lines of code. Then it digs deeper and deeper into the how and why. The only prerequisite is one year of coding experience, which CS50 will already have given you a head start on.
Andrew Ng’s Machine Learning Course at Coursera
Another great Machine Learning course, but in a ‘Bottom-up’ style. It smoothly explains the math fundamentals first and gradually builds up the knowledge needed to piece together complicated machine learning models from scratch. I have an article that explains the difference between Andrew Ng’s and Jeremy Howard’s approaches to machine learning education and recommends a potentially efficient way to learn.
Corey Schafer’s YouTube channel, Python and OOP Tutorials
As good as it is, CS50 only covers the general, basic concepts of Python. You’ll need more in-depth knowledge to code efficiently in your data science projects. For this, I recommend Corey Schafer’s YouTube channel. He is one of the best Python educators I have come across and explains complicated ideas in a crystal-clear way. Not one second of his videos is wasted. The content is concise, to the point, and highly condensed. He has playlists for basic Python, SQL, Matplotlib, Git, and Object-Oriented Programming.
Learning Data Science is never a breeze, and I hope this article helps a little in alleviating the pain and makes your journey a bit more efficient and fun. If you know other great courses and resources, please feel free to leave a response so others can see them too. Thanks!
Found this article useful? Follow me (Michael Li) on Medium or you can find me on Twitter @lymenlee or my blog site wayofnumbers.com. You could also check out my most popular articles below! Please 👏 this article to share it!
Two Sides of the Same Coin: Jeremy Howard’s fast.ai vs Andrew Ng’s deeplearning.ai
I finished Andrew Ng’s Machine Learning Course and I Felt Great!