20 Best Machine Learning Resources for Data Scientists

Limarc Ambalina (@limarc2000)

Limarc is a Tokyo-based writer of all things pop culture, travel, and tech.

Whether you’re a beginner looking for introductory articles or an intermediate practitioner looking for datasets or papers about new AI models, this list of machine learning resources has something for everyone interested in or working in data science. In this article, we will introduce guides, papers, tools, and datasets for both computer vision and natural language processing.

Machine Learning Resources for Computer Vision

Data scientists working in computer vision are developing machines that can see the world and process visual data much as the human mind does. Without the developments and breakthroughs in computer vision, self-driving cars, facial recognition, and virtual reality headsets wouldn’t be possible today.
Below are just a few of the best computer vision articles, papers, tools, and datasets for beginners, intermediates, and experts in the field. 

Computer Vision Articles

1. Computer Vision by Andrew Ng — 11 Lessons Learned - In this article, Ryan Shrott goes over 11 interesting and important insights from Andrew Ng’s popular course on computer vision. 
2. How to do Everything in Computer Vision - An intermediate introduction to image classification, object detection, segmentation, pose estimation, and more. 
3. What is Image Annotation? - A short guide to five common types of image annotation and their use cases. 
4. How to Use Data Augmentation to Increase your Image Datasets - From the founder of AI Summer, this article is a guide on how to get more training data using data augmentation.
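
To make the idea of data augmentation from item 4 concrete, here is a minimal sketch in plain Python. It is not taken from the linked article: the function names and the tiny nested-list "image" are assumptions made for illustration, and real projects would typically rely on an image library instead.

```python
# A toy grayscale "image" stored as a list of pixel rows (hypothetical data).
image = [
    [1, 2, 3],
    [4, 5, 6],
]


def flip_horizontal(img):
    """Mirror every row left to right."""
    return [row[::-1] for row in img]


def rotate_90_clockwise(img):
    """Rotate the image by 90 degrees clockwise."""
    return [list(column) for column in zip(*img[::-1])]


def augment(img):
    """Return the original image plus two transformed copies."""
    return [img, flip_horizontal(img), rotate_90_clockwise(img)]


augmented = augment(image)
print(len(augmented))  # one labeled image has become three training samples
```

Each transformed copy keeps the label of the original image, which is how augmentation grows a training set without new annotation work.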

Academic Computer Vision Papers

5. Kornia - From researchers at the Czech Technical University in Prague and OpenCV, this paper introduces Kornia, an open source computer vision library for PyTorch.
6. Mask R-CNN - In this paper, researchers at Facebook AI present Mask R-CNN, a framework for object instance segmentation.
7. Intro to CNN Keras - While not published in an official academic journal, the Intro to CNN Keras is one of the most popular notebooks on Kaggle. The notebook details a step-by-step guide on how to train a convolutional neural network for digit recognition. 

Computer Vision Tools

8. CVAT - CVAT stands for Computer Vision Annotation Tool, and is an online platform for labelling images and videos. 
9. VGG Image Annotator - An open source image annotation tool that supports bounding boxes, polygons, circles, ellipses, keypoints, and polylines.

Datasets for Computer Vision

10. Open Images Dataset - From Google, the Open Images Dataset is one of the largest publicly available image datasets in the world. It includes millions of images with accompanying annotations.
11. COCO - The COCO dataset includes over 333,000 images, around 183,000 of which are labeled. Within the images, 1.5 million objects have been annotated.
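
For readers who have not worked with COCO-style annotations before, the sketch below shows one way to count annotated objects per category and to convert a bounding box to corner coordinates. The tiny in-memory dictionary is hypothetical sample data, not part of the real dataset, and the helper names are assumptions. It follows the COCO convention of top-level "images", "annotations", and "categories" lists, with each "bbox" stored as [x, y, width, height].

```python
from collections import Counter

# Hypothetical, hand-written sample in the COCO annotation layout.
coco_sample = {
    "images": [{"id": 1, "file_name": "street.jpg", "width": 640, "height": 480}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [10, 20, 100, 50]},
        {"id": 11, "image_id": 1, "category_id": 1, "bbox": [200, 40, 80, 160]},
        {"id": 12, "image_id": 1, "category_id": 3, "bbox": [300, 250, 120, 90]},
    ],
    "categories": [{"id": 1, "name": "person"}, {"id": 3, "name": "car"}],
}


def count_objects_per_category(dataset):
    """Count annotations per category name."""
    names = {cat["id"]: cat["name"] for cat in dataset["categories"]}
    return Counter(names[ann["category_id"]] for ann in dataset["annotations"])


def bbox_to_corners(bbox):
    """Convert [x, y, width, height] to (x_min, y_min, x_max, y_max)."""
    x, y, width, height = bbox
    return (x, y, x + width, y + height)


counts = count_objects_per_category(coco_sample)
print(counts["person"], counts["car"])
print(bbox_to_corners(coco_sample["annotations"][0]["bbox"]))
```

With a real annotation file, the same functions could be applied to the dictionary returned by `json.load`.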

Machine Learning Resources for NLP

Natural language processing (NLP) is the field of machine learning that seeks to give computers the ability to understand written and spoken languages. It is thanks to developments in NLP that we have virtual assistants, smart home devices, voice search engines, and other amazing technologies. 
Below are just a few of the best NLP articles, papers, tools, and datasets.

NLP Articles

12. Your Guide to NLP - A beginner’s guide to understanding the basic concepts of natural language processing, use cases, and essential NLP terms.
13. A Practitioner's Guide to Natural Language Processing - An in-depth guide to approaching NLP projects from data collection and data annotation to standard NLP workflows and the future of the field. 

Academic NLP Papers

14. Natural Language Processing (almost) from Scratch - From researchers at Google and NEC Labs, this paper introduces a unified neural network architecture that can be applied to a variety of NLP tasks. 
15. Huggingface's Transformers: State-of-the-art Natural Language Processing - From NLP startup Hugging Face, this paper introduces Transformers, a library of pretrained models for NLP built around transfer learning.

Tools for Natural Language Processing

16. 5 Heroic Tools for Natural Language Processing - A list of 5 open-source tools and libraries for various NLP tasks.
17. Text Annotation Tools - A list of 10 leading tools and services for text data annotation, sentiment classification, and more.
18. Named Entity Recognition Tools - A list of the best tools for named entity recognition and entity linking. 

NLP Datasets

19. Great Open Datasets for Your First NLP Project - A list of 10 datasets for sentiment analysis, question & answer analysis, speech recognition, and more.
20. Text and Audio Datasets for Natural Language Processing - A list of 25 datasets for text classification, spam detection, audio transcription, and more.

We hope one of the machine learning resources on this list helped you learn something new or contributed to your machine learning projects. Interesting new ML papers and open-source tools are constantly being released. Please follow me on Hackernoon for further updates.