5 Million Face Images for Facial Recognition Model Training

by Limarc Ambalina, November 10th, 2020

Too Long; Didn't Read

The free image datasets for face recognition on this list contain over 5,000,000 face images and video frames. From GIFs and still images taken from YouTube videos to thermal imaging and 3D images, each dataset is different and suited to different projects and algorithms. From real and fake face images to annotated face images, the datasets on this list vary in size and scope. We’ve compiled a list of the best free image datasets. Each dataset is strictly for non-commercial research purposes only.

This article on face recognition datasets is one of my best-performing articles, originally written for Lionbridge AI. I'm happy to share it with the Hacker Noon community!

From mobile phone security and surveillance cameras to augmented reality and photography, the facial recognition branch of computer vision has a variety of useful applications. Depending on your specific project, you may require face images in different lighting conditions, faces that express different emotions, or annotated face images. From video frames annotated with facial keypoints to real and fake face image pairs, the datasets on this list vary in size and scope.

Where can I find Free Image Datasets for Facial Recognition Models?

We’ve compiled a list of the best free image datasets for face recognition, which together total over 5,000,000 face images and video frames. Ranging from GIFs and still images taken from YouTube videos to thermal imaging and 3D images, each dataset is different and suited to different projects and algorithms.

1. CelebA Dataset

For non-commercial research purposes only, this dataset from MMLAB contains over 200,000 celebrity images.

2. Face Detection in Images with Bounding Boxes

A simple, yet useful dataset, Face Detection in Images contains just over 500 images with approximately 1,100 faces already tagged with bounding boxes.
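As a rough illustration of what "tagged with bounding boxes" means in practice, here is a minimal Python sketch that turns two normalized corner points into a pixel-space box. The on-disk format of this dataset is not described here, so the normalized-corner input, the `BoundingBox` class, and the `to_pixel_box` helper are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned face box in pixel coordinates (hypothetical layout)."""
    x_min: int
    y_min: int
    x_max: int
    y_max: int

    @property
    def area(self) -> int:
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)


def to_pixel_box(x0, y0, x1, y1, image_width, image_height):
    """Convert two corners given as fractions of the image size (0.0 to 1.0)
    into a pixel box. Assumption: annotations are stored as normalized corners."""
    left, right = sorted((x0, x1))   # accept the corners in either order
    top, bottom = sorted((y0, y1))
    return BoundingBox(
        round(left * image_width),
        round(top * image_height),
        round(right * image_width),
        round(bottom * image_height),
    )


# Example: one face annotation on a 640 x 480 image
face = to_pixel_box(0.25, 0.5, 0.75, 1.0, image_width=640, image_height=480)
print(face, face.area)
```

If the annotations turn out to be stored in pixels already, the conversion step can simply be dropped and the `BoundingBox` record used directly.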

via dataturks.com/projects/devika.mishra/face_detection

3. Face Images with Marked Landmark Points

This dataset includes over 7,000 facial images with keypoints annotated on every image. The number of keypoints on each image varies, with the max number of keypoints being 15 on a single image. The keypoints data is included in a separate CSV file.
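Because the number of keypoints varies per image, a loader has to tolerate empty cells in the CSV. The sketch below shows one way to do that with Python's standard `csv` module. The column names (`left_eye_x`, `nose_y`, and so on) and the sample rows are hypothetical, invented for the example rather than taken from the dataset.

```python
import csv
import io

# Hypothetical sample: one "<name>_x" and "<name>_y" column per keypoint,
# with empty cells where a keypoint was not annotated.
SAMPLE_CSV = """left_eye_x,left_eye_y,right_eye_x,right_eye_y,nose_x,nose_y
66.0,39.0,30.2,36.4,44.4,57.1
64.3,35.0,,,48.2,55.7
"""


def load_keypoints(csv_text):
    """Return one dict per image mapping keypoint name -> (x, y).
    Keypoints with a missing x or y value are skipped."""
    images = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        points = {}
        for column, x_value in record.items():
            if not column.endswith("_x") or x_value == "":
                continue
            name = column[:-2]                      # strip the "_x" suffix
            y_value = record.get(name + "_y", "")
            if y_value == "":
                continue
            points[name] = (float(x_value), float(y_value))
        images.append(points)
    return images


keypoints = load_keypoints(SAMPLE_CSV)
for index, points in enumerate(keypoints):
    print(index, len(points), sorted(points))
```

Dropping incomplete keypoints per image, rather than dropping the whole image, keeps as much training data as possible; whether that is appropriate depends on the model being trained.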

4. Flickr Faces

With images taken from Flickr, this dataset has 210,000 images. The total image count is made up of 70,000 original images from Flickr, 70,000 images cropped at 1024 x 1024 pixels, and 70,000 cropped at 128 x 128 pixels.

5. Google Facial Expression Comparison

From Google AI comes the Google Facial Expression Comparison dataset which includes 156,000 facial images. The images come in triplets, with two images out of each triplet annotated as the “most similar” in the triplet in terms of facial expression. In true Google fashion, these images were meticulously annotated and each triplet was worked on by at least six separate human annotators.
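With six or more annotators per triplet, the individual judgments need to be combined into a single label. How the dataset itself does this is not stated here, so the following Python sketch simply assumes a majority vote. The `most_similar_pair` function and the vote encoding (a pair of positions 0, 1, or 2 within the triplet) are hypothetical.

```python
from collections import Counter


def most_similar_pair(votes):
    """Combine annotator votes for one triplet.

    Each vote is a pair of image positions (0, 1, or 2) that the annotator
    judged most similar in expression. Returns the winning pair and the
    fraction of annotators who chose it. Assumption: simple majority vote;
    ties go to whichever pair was seen first.
    """
    normalized = [tuple(sorted(vote)) for vote in votes]  # (1, 0) == (0, 1)
    pair, count = Counter(normalized).most_common(1)[0]
    return pair, count / len(normalized)


# Example: six annotators label one triplet
votes = [(0, 1), (1, 0), (0, 1), (1, 2), (0, 1), (0, 2)]
pair, agreement = most_similar_pair(votes)
print(pair, round(agreement, 2))
```

The agreement fraction is worth keeping alongside the label, since triplets with low agreement can be down-weighted or filtered out during training.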

6. Labeled Faces in the Wild

Created by researchers at the University of Massachusetts, this dataset was originally made to study unconstrained face recognition. It totals over 13,000 images of over 5,700 people. The dataset also includes helpful metadata in CSV format.

7. Real and Fake Face Detection

This dataset was made to train facial recognition models to distinguish real face images from generated face images. The dataset includes over 1,000 real face images and over 900 fake face images, with recognition difficulty ranging from easy to mid to hard.

8. Simpsons Faces

With images taken from seasons 25 to 28 of the popular American cartoon series, this dataset includes over 9,800 cropped faces of Simpsons characters.

9. Tufts Face Database

With over 100,000 images, the Tufts Face Database includes a huge collection of facial images divided into nine categories. The categories include computerized sketches, thermal, thermal cropped, three dimensional, Lytro, 2D RGB around, 2D RGB emotion, night vision, and video.

10. UMDFaces

By far the largest dataset on this list, the UMDFaces dataset has over 367,000 face annotations across over 8,200 different subjects in still images. Apart from those images, the dataset also includes over 3.7 million video frames of over 3,100 subjects, all annotated with facial keypoints. This dataset is strictly for non-commercial research purposes.

via umdfaces.io

11. UTKFace

The UTKFace dataset includes faces from a wide age range. The people in these images range from less than a year old to over 100 years old. The dataset includes over 20,000 face images with age, gender, and ethnicity annotations.
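The age, gender, and ethnicity annotations can be read into a small record before training. The Python sketch below assumes the labels are encoded in each filename as `[age]_[gender]_[ethnicity]_[timestamp].jpg`, with integer codes for gender and ethnicity; check the dataset's own documentation before relying on that layout. The `FaceLabels` type, the `parse_labels` helper, and the example filenames are hypothetical.

```python
from typing import NamedTuple, Optional


class FaceLabels(NamedTuple):
    age: int
    gender: int      # integer code, meaning defined by the dataset
    ethnicity: int   # integer code, meaning defined by the dataset


def parse_labels(filename: str) -> Optional[FaceLabels]:
    """Parse labels from a name like '25_1_2_20170116174525125.jpg'.
    Returns None when the name does not match the assumed layout."""
    parts = filename.split("_")
    if len(parts) < 4:               # a label field is missing
        return None
    try:
        age, gender, ethnicity = (int(part) for part in parts[:3])
    except ValueError:               # a label field is not an integer
        return None
    return FaceLabels(age, gender, ethnicity)


# Example with made-up filenames; the second one lacks a field
for name in ["25_1_2_20170116174525125.jpg", "39_1_20170116174525125.jpg"]:
    print(name, parse_labels(name))
```

Returning `None` for malformed names, instead of raising, makes it easy to skip and count bad files in a single pass over the image folder.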

12. Wider Face

This dataset contains over 10,000 images that include multiple people or just a single person. The images are divided into numerous settings such as meetings, traffic, parades, and more.

13. Yale Face Database

The Yale Face Database is a dataset containing 165 GIF images of 15 different subjects in a variety of lighting conditions. The subjects in the images display different emotions and expressions.

14. YouTube Faces with Facial Keypoints

This dataset is composed of public YouTube videos of celebrities, totaling 155,560 still frames. The videos have been cropped around the faces of the celebrities, and every frame of every video has been annotated with facial keypoints.

Also published on: https://lionbridge.ai/datasets/5-million-faces-top-15-free-image-datasets-for-facial-recognition/