Learning every facet of data science takes time. We have written pieces on different resources before, but this time we wanted to focus on courses, including course-like video series on YouTube.
There are so many options that it helps to have a list of classes worth taking.
We are going to start with the free data science options so you can decide whether or not you want to start investing more in courses.
Tip: Coursera can make it seem like the only option is to purchase the course, but there is an audit button at the very bottom. Now, if you appreciate Coursera, by all means purchase their specialization. I am still uncertain how I feel about it, but I do love taking Coursera courses.
Select the audit course option to not pay for the course
This course introduces you to sampling and exploring data, as well as basic probability theory and Bayes’ rule. You will examine various types of sampling methods, and discuss how such methods can impact the scope of inference. A variety of exploratory data analysis techniques will be covered, including numeric summary statistics and basic data visualization. You will be guided through installing and using R and RStudio (free statistical software), and will use this software for lab exercises and a final project. The concepts and techniques in this course will serve as building blocks for the inference and modeling courses in the Specialization.
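To give a flavor of the Bayes' rule material, here is a minimal sketch in Python (the course itself uses R). The function name `posterior` and the numbers are our own illustrative assumptions, not taken from the course: a condition with 1% prevalence and a test with 95% sensitivity and 90% specificity.

```python
def posterior(prior, sensitivity, specificity):
    """Bayes' rule: P(condition | positive test).

    prior        P(condition)
    sensitivity  P(positive | condition)
    specificity  P(negative | no condition)
    """
    true_pos = sensitivity * prior
    false_pos = (1.0 - specificity) * (1.0 - prior)
    return true_pos / (true_pos + false_pos)


# Hypothetical numbers chosen only for illustration.
p = posterior(prior=0.01, sensitivity=0.95, specificity=0.90)
print(f"P(condition | positive test) = {p:.3f}")
```

With these made-up inputs the posterior comes out just under 9%, which is the classic surprise: even a fairly accurate test produces mostly false positives when the condition is rare.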
In this Specialization, you will learn to analyze and visualize data in R and create reproducible data analysis reports, demonstrate a conceptual understanding of the unified nature of statistical inference, perform frequentist and Bayesian statistical inference and modeling to understand natural phenomena and make data-based decisions, communicate statistical results correctly, effectively, and in context without relying on statistical jargon, critique data-based claims and evaluate data-based decisions, and wrangle and visualize data with R packages for data analysis.
In this course you will learn how to program in R and how to use R for effective data analysis. You will learn how to install and configure software necessary for a statistical programming environment and describe generic programming language concepts as they are implemented in a high-level statistical language. The course covers practical issues in statistical computing which includes programming in R, reading data into R, accessing R packages, writing R functions, debugging, profiling R code, and organizing and commenting R code. Topics in statistical data analysis will provide working examples.
This course aims to teach everyone the basics of programming computers using Python. We cover the basics of how one constructs a program from a series of simple instructions in Python. The course has no pre-requisites and avoids all but the simplest mathematics. Anyone with moderate computer experience should be able to master the materials in this course.
This Specialization builds on the success of the Python for Everybody course and will introduce fundamental programming concepts including data structures, networked application program interfaces, and databases, using the Python programming language. In the Capstone Project, you’ll use the technologies learned throughout the Specialization to design and create your own applications for data retrieval, processing, and visualization.
This course will introduce the core data structures of the Python programming language. We will move past the basics of procedural programming and explore how we can use the Python built-in data structures such as lists, dictionaries, and tuples to perform increasingly complex data analysis. This course will cover Chapters 6–10 of the textbook “Python for Everybody”. This course covers Python 3.
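As a rough idea of what working with those built-in structures looks like, here is a small sketch of our own (it is not taken from the course or the textbook). It counts words using a list, a dictionary, and tuples; the sample sentence is made up.

```python
# A list of words, split from a made-up sentence.
text = "the quick brown fox jumps over the lazy dog the end"
words = text.split()

# A dictionary mapping each word to how many times it appears.
counts = {}
for word in words:
    counts[word] = counts.get(word, 0) + 1

# counts.items() yields (word, count) tuples; sort by count (descending),
# then alphabetically to break ties.
pairs = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

# Tuple unpacking pulls the most common word and its count apart.
top_word, top_count = pairs[0]
print(top_word, top_count)
```

The same three structures cover a surprising amount of day-to-day data work before you ever reach for a library like Pandas.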
This Harvard Certification program will teach you key data science essentials, including R and machine learning using real-world case studies to kick start your data science career. Spread across 9 courses, this immersive program is among the best rated online masters programs available on leading e-learning platform edX. The courses that make up this program include R Basics, Visualization, Probability, Inference and Modeling, Productivity Tools, Wrangling, Linear Regression, Machine Learning followed up with a Capstone project to test and try all that you learn in the course.
This course is described as a boot camp but without the 18–30k price tag. Now, this in no way replaces a boot camp. However, it is a very good intro for anyone who already has a CS or technical background who just needs to get up to speed quickly on data science concepts.
We would recommend this course for companies looking to help their own internal employees transition into new positions. You can quickly upskill engineers or scientists into more rounded, data-proficient specialists. It is also much cheaper than paying a consultant to come in and teach your team (we know because we have taught courses before, and our price tag tends to be closer to 100 per person per day). This course is worth about 2 weeks of classes.
Yes, in person is usually better and more comprehensive as it allows for questions and more specific examples to be outlined. However, this is a great chance for people who are self-learners and who might just need a jumpstart.
This series of 5 courses will help you strengthen your foundation of data science, statistics and machine learning. You will learn to analyze big data and understand how to make data-driven predictions through statistical inference and probabilistic modeling to extract meaningful data for decision making. The journey begins with the very basics of probability and statistics before moving on to data analysis techniques and machine learning algorithms. It is advisable to have college-level calculus, mathematical reasoning, and Python programming proficiency to make the most of this certification. You may apply to a variety of job roles after the completion of this certification, including data scientist, data analyst and system analyst, to name a few.
Andrew Ng, former head of Google Brain and Baidu AI Group, created this course along with other professors from Stanford University. It is one of the most sought-after machine learning courses and certifications available online. You will learn about supervised learning and unsupervised learning, among other key areas, and the course includes multiple case studies and applications to help you learn how to apply algorithms to build smart robots. This is one of the best data science courses you can opt for.
This professional program by Microsoft consists of 9 courses plus a final project, which makes it a 10-course program, and takes about 16–32 hours per course. You can also choose individual courses if you want. You will learn about using Microsoft Excel to explore data, using Transact-SQL to query a relational database, creating data models using Excel or Power BI, applying statistical methods to data, using R or Python to explore and transform data, and following a data science methodology. The program is broken into 4 major units that together contain the 10 courses, and it ends with a project to help you apply everything you learned.
This course is comprehensive and discusses both Python and R. It isn't just focused on scikit-learn but on machine learning in general. In addition, the creator of this course is the owner of SuperDataScience.com, a great site with a podcast, lessons and more. So if you don't want to pay for the course, you can always listen to the podcasts for free!
Python, of course, is not the only language for data science. Another popular language is R. (These aren't the only 2 languages either; there are others people like to use... except Matlab. We don't talk about Matlab.)
REVIEW: Machine learning a-z is a great introduction to ml. A big tour through a lot of algorithms making the student more familiar with scikit-learn and few other packages…. Ml-az is a right course for a beginner to get the motivation to dive deep in ml.
Frank Kane has another great course on this topic where he will cover more than the book mentioned above. He will also discuss Ensemble Learning and bias trade-offs. Plus, if you are a visual learner, this will probably benefit you more. There’s also an entire section on machine learning with Apache Spark, which lets you scale up these techniques to “big data” analyzed on a computing cluster.
The previous Kirill Eremenko course we mentioned was more comprehensive and theoretical. It didn’t go into the process of data science. This course instead goes through several tools that are used by data scientists and BI engineers. This course is mixed with a little BI, but practically, a lot of “data science” roles require BI skills. So depending on the company and role you are looking for, this is a great fit.
REVIEW: It is an excellent course for people who are super excited about data science. But I would say this course should include r or python exercises as well. Data visualization, data preparation and data communication parts are awesome but data modeling section is a bit weak in the sense that it does not have an impact on real-world problems but this part of the course needs improvement. However, the course engaging and you learn a lot from it after completing it.
OK, let’s say you are really interested in deep learning and not data science in general. Well then, this course is a great overall choice.
As it discusses, Artificial intelligence is growing exponentially. There is no doubt about that. Self-driving cars are clocking up millions of miles, IBM Watson is diagnosing patients better than armies of doctors and Google Deepmind’s AlphaGo beat the World champion at Go — a game where intuition plays a key role.
But the further AI advances, the more complex the problems it needs to solve become. Only Deep Learning can solve such complex problems, and that’s why it’s at the heart of Artificial Intelligence.
Inside this class we will work on Real-World datasets, to solve Real-World business problems. (Definitely not the boring iris or digit classification datasets that we see in every course). In this course we will solve six real-world challenges:
Not everyone interested in data science wants to be a researcher. Some people want to be more like developers or engineers. This is arguably a very different type of data scientist: some people just prefer building, automating and developing. This is a very valuable trait to have as a data scientist because you can play an important role in your group by automating a lot of the work that needs to get done.
In order to do that, you will need to have a solid understanding of a scripting language.
This course is designed for both absolute beginners and people with some programming experience who want to learn Python, which is one of the most in-demand skills among employers in the IT industry. The key point that makes this course unique is that it is fast yet detailed. This course provides sufficient detail for you to design and develop your own Python solution. Unlike many other Python courses, this course is concise and you can complete it over a weekend.
Something like Python, but not the kind we referred to earlier, which was focused on libraries like Pandas and scikit-learn. Instead, we are referring to operational Python. This usually requires a general understanding of things like hash maps, loops, and file system management. This is where the course below comes in handy. It will provide a good basis for your programming skill set.
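To make "operational Python" a little more concrete, here is a minimal sketch of our own that combines all three ideas: it loops over a folder, uses a dictionary (Python's hash map) to group files by extension, and touches the file system through the standard `pathlib` and `tempfile` modules. The function name `group_files_by_extension` and the demo file names are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from pathlib import Path
import tempfile


def group_files_by_extension(root):
    """Map each file extension under root to a sorted list of file names."""
    groups = defaultdict(list)  # hash map: extension -> list of names
    for path in Path(root).rglob("*"):  # walk the folder recursively
        if path.is_file():
            ext = path.suffix.lower() or "(none)"
            groups[ext].append(path.name)
    return {ext: sorted(names) for ext, names in groups.items()}


if __name__ == "__main__":
    # Build a throwaway folder with made-up files so the demo is self-contained.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ["a.csv", "b.csv", "notes.txt", "README"]:
            (Path(tmp) / name).write_text("demo")
        for ext, names in sorted(group_files_by_extension(tmp).items()):
            print(ext, names)
```

Small scripts like this are the bread and butter of automation work: swap the grouping step for a rename, a load into a database, or a cleanup rule and you have a useful tool.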
Let’s say you don’t want to pay for a course. That is fine. Again, these courses will never replace experience; they merely provide a good basis. Some people seem to need to pay for things in order to feel motivated to finish the course, but YouTube has lots of great options. So we wanted to lay out a few free options.
Now, we do want to provide a caveat, one that has been brought up by the ex-Google tech lead Patrick on his YouTube channel. Don’t spend all your time starting new tutorials. Do one or two, get the basics down and then start looking for projects. While you are doing those projects, you should learn how to use resources like API documentation and problem-specific videos to improve your skills.
Otherwise, you will never really progress in your skill set. You will essentially just be spending your time relearning the ABCs over and over again and never learning about words, sentences, paragraphs, essays, etc.
We will reference a Udemy course that covers similar topics later. However, if you are looking for a set of videos that teaches object-oriented programming as well as data structures and algorithms, then consider checking out Corey Schafer's channel. He has a fairly complete set of videos that covers pretty much everything you could learn from the first few pages of the Python library documentation online. This is great for learning how to build automated
Another great channel on YouTube that will not just help you with tutorials but will also provide some more problem-specific videos (vs. general tutorials) is CS Dojo. This channel covers everything from interviewing to coding to design. It is a great channel if you really want to learn programming and want to get a job. Again, not all data scientists want to know Python and programming, but some do. There is value in knowing it if that is the style of data scientist you want to be.
This is an in-depth hands-on tutorial introducing the viewer to Data Science with R programming. The video provides end-to-end data science training, including data exploration, data wrangling, data analysis, data visualization, feature engineering, and machine learning. All source code from the videos is available on GitHub.
The great thing is that Dave Langer actually has more full tutorials on the Data Science Dojo channel.
In this webinar Dave Langer will provide an introduction to data visualization with the ggplot2 package. The focus of the webinar will be using ggplot2 to analyze your data visually with a specific focus on discovering the underlying signals/patterns of your business.
Attendees will learn how to:
Now, some people might be interested in learning more about Kaggle and how to be successful creating models that can compete on the platform. The good thing is that Data Science Dojo has your back again. The video below is part one of a larger tutorial that does a great job walking through an example of a real problem.
This course has Hours of professional Tableau Video training, unique datasets designed with years of industry experience in mind, engaging exercises that are both fun and also give you a taste for Analytics of the REAL WORLD.
In this course you will learn:
We were first introduced to D3 by one of the co-creators of the library in our college classes. We had taken a bioinformatics course at UW and Jeffrey Heer came and demoed how it could be used.
D3.js is a powerful JavaScript library used to create data visualizations easily. In this course I’ll teach you how to harness the power of D3 to create a variety of different data-driven visualizations such as bar charts, pie charts, line graphs, bubble packs and tree diagrams.
We’ll learn about D3 select, changing SVG attributes & styles, scales, axes, transitions, hierarchical data and much more…
Now, this course is created more for data analysts, and it doesn’t focus on Python. Instead, it uses Excel. Many people feel like Excel is not sufficient, but Excel and R were the original versions of Jupyter Notebooks. They allowed analysts and statisticians to display their findings. That is why we still feel it is very useful for some data scientists to look into.
Is statistics a driving force in the industry you want to enter? Do you want to work as a Marketing Analyst, a Business Intelligence Analyst, a Data Analyst, or a Data Scientist?
Well then, you’ve come to the right place!
Statistics for Data Science and Business Analysis is here for you with TEMPLATES in Excel included!
This is where you start. And it is the perfect beginning!
In no time, you will acquire the fundamental skills that will enable you to understand complicated statistical analysis directly applicable to real-life situations.
Now, there are plenty of other great courses you could take as a data scientist, data engineer or data analyst, but we just wanted to cover the ones we have taken. We have skipped over Hadoop for now, and we are working on a post that will eventually cover all our favorite resources on the topic. Just know that Frank Kane's Hadoop courses on Udemy are great and a good option.
Please let us know about your favorite courses in the comments and subscribe if you want to see more content like this.
In addition, if you would like to read more about data science, data engineering, business, etc., please check out the articles below.
Are You Interested In Learning About Data Science Or Tech?
Learning Data Science: Our Favorite Data Science Books
What Is Data Science Really As Told By An Ex-FAANG Data Scientist
How Algorithms Can Become Unethical and Biased
How To Load Multiple Files With SQL
How To Develop Robust Algorithms
Dynamically Bulk Inserting CSV Data Into A SQL Server