Hello, Machine Learning community!
We are proud to announce the Supervisely Person Dataset. It's publicly available and free for academic purposes.
For AI to be free we need not just Open Source, but also a strong Open Data movement.
- Andrew Ng
We absolutely agree with him. And let us extend this idea. There is a lot of research on Deep Neural Networks for the semantic segmentation task. But in most cases, data is much harder and more expensive to collect than it is to develop and apply the algorithms that run on it.
That is why we also need specially designed platforms that cover the entire ML workflow, from developing training datasets to training and deploying neural networks.
A few examples from the "Supervisely Person Dataset"
We believe that our work will help developers, researchers and businesses and be perceived not only as yet another public dataset but also as a set of innovative approaches and instruments for creating large training datasets faster.
Next, we are going to cover all aspects of how we built this dataset from scratch. Before we continue, let me show you some interesting facts:
Supervisely is a Machine Learning platform that includes data science smarts. It allows data scientists to focus on real innovations and leave routine work to others (yes, training well-known NN architectures is routine work too).
Person segmentation is a critical task in analysing humans in images for many real-world applications: action recognition, self-driving cars, video surveillance, mobile applications and much more.
We at DeepSystems did internal research in this field and realized that there is a lack of data for this task. You may ask: what about public datasets like COCO, Pascal, Mapillary and others? To answer this question, I'd better show you a few examples:
A few examples of human annotation from the COCO dataset
The quality of human segmentation in most public datasets did not satisfy our requirements, so we had to create our own dataset with high-quality annotations. I will show you how we did it below.
Upload public datasets to the system: Pascal VOC, Mapillary. Our "Import" module supports most public datasets and converts them to a unified JSON-based format called the Supervisely format :)
Then we execute a DTL ("Data Transformation Language") query that does several things: merge datasets -> skip images without person objects -> crop each person from images -> filter them by width and height -> split to train/test sets.
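To make the steps of that query concrete, here is a rough sketch of the same pipeline in plain Python. This is not DTL syntax and not our actual query: the `Obj` and `Sample` types, the `build_person_crops` function, the size thresholds and the test fraction are all hypothetical and exist only for illustration.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class Obj:
    label: str
    box: Box


@dataclass
class Sample:
    name: str
    objects: List[Obj]


def build_person_crops(datasets, min_w, min_h, test_fraction=0.1, seed=0):
    """Hypothetical stand-in for the DTL query: returns (train, test) crop lists."""
    # 1. Merge all datasets into one list of samples.
    merged = [sample for dataset in datasets for sample in dataset]
    # 2. Skip images that contain no person objects.
    with_person = [s for s in merged if any(o.label == "person" for o in s.objects)]
    # 3. "Crop" each person: emit one (image name, box) record per person.
    crops = [(s.name, o.box) for s in with_person for o in s.objects if o.label == "person"]
    # 4. Filter crops by width and height (thresholds are assumed values).
    kept = [(n, b) for n, b in crops if b[2] - b[0] >= min_w and b[3] - b[1] >= min_h]
    # 5. Shuffle deterministically and split into train/test sets.
    rng = random.Random(seed)
    rng.shuffle(kept)
    n_test = int(len(kept) * test_fraction)
    return kept[n_test:], kept[:n_test]
```

A call such as `build_person_crops([pascal, mapillary], min_w=64, min_h=64)` would return two lists of person crops. A real pipeline would cut pixels out of the images as well; the sketch only tracks the bounding boxes.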
It seems like there is a lot of publicly available data, but as we mentioned earlier, there are some hidden problems: low-quality annotations, low resolution and so on.
Thus, we construct our first training dataset.
We will train a slightly customized UNet-like architecture.
Unet_v2 architecture
loss = BinaryCrossEntropy + (1 - dice).
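For readers who want to see the formula spelled out, here is a minimal pure-Python sketch over flattened masks. It assumes a "soft" Dice coefficient computed on predicted probabilities and a small smoothing constant `eps`; both are our assumptions for illustration, and the function name `bce_dice_loss` is hypothetical. A real training setup would use the tensor operations of a deep learning framework instead.

```python
import math
from typing import Sequence


def bce_dice_loss(probs: Sequence[float], targets: Sequence[float], eps: float = 1e-7) -> float:
    """BinaryCrossEntropy + (1 - dice) for one flattened mask (illustrative only)."""
    # Clamp probabilities so that log() never receives 0.
    clipped = [min(max(p, eps), 1.0 - eps) for p in probs]
    # Mean binary cross-entropy over all pixels.
    bce = -sum(
        t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
        for p, t in zip(clipped, targets)
    ) / len(clipped)
    # Soft Dice coefficient: 2 * |P * T| / (|P| + |T|), smoothed by eps.
    intersection = sum(p * t for p, t in zip(probs, targets))
    dice = (2.0 * intersection + eps) / (sum(probs) + sum(targets) + eps)
    return bce + (1.0 - dice)
```

The cross-entropy term penalizes every pixel independently, while the Dice term measures overlap of the whole mask, so it is less sensitive to the imbalance between person and background pixels.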
This network is fast to train, pretty accurate, and easy to implement and customize, which allows us to experiment a lot. Supervisely can be distributed across multiple nodes in a cluster.
Thus we can train several NNs simultaneously. All NNs on our platform also support multi-GPU training. Each training experiment with an input resolution of 256*256 took no more than 15 minutes.
We didn't have a collection of unlabeled images, so we decided to download one from the Web. We implemented a service (GitHub) that downloads data from the great stock photo site Pexels (thank you guys for your really cool work).
So, we downloaded around 15k images with tags related to our task, uploaded them to Supervisely and resized them via a DTL query because their resolution was very high.
The architecture we used does not support instance segmentation. We deliberately didn't use Mask-RCNN, because its segmentation quality near object edges is low.
That's why we decided on a two-step scheme: apply Faster-RCNN (based on NasNet) to detect all persons in an image, and then apply the segmentation network to each person bounding box to segment the dominant object. This approach allows us both to simulate instance segmentation and to segment object edges accurately.
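The two-step scheme can be sketched roughly as follows. The `detect` and `segment` callables are hypothetical stand-ins for the Faster-RCNN detector and the UNet-like segmentation network, images are represented as nested lists to keep the sketch dependency-free, and `segment` is assumed to return a mask of the same size as the crop it receives.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]          # rows of pixel values (stand-in for a real image)
Mask = List[List[int]]           # rows of 0/1 values
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def segment_instances(
    image: Image,
    detect: Callable[[Image], List[Box]],
    segment: Callable[[Image], Mask],
) -> List[Mask]:
    """Return one full-size binary mask per detected person."""
    height, width = len(image), len(image[0])
    masks = []
    for left, top, right, bottom in detect(image):
        # Step 1: crop the detected bounding box out of the image.
        crop = [row[left:right] for row in image[top:bottom]]
        # Step 2: segment the dominant object inside the crop.
        crop_mask = segment(crop)
        # Paste the crop mask back at the box position in an empty full-size mask.
        full = [[0] * width for _ in range(height)]
        for dy, mask_row in enumerate(crop_mask):
            for dx, value in enumerate(mask_row):
                full[top + dy][left + dx] = value
        masks.append(full)
    return masks
```

Because every detection gets its own mask, overlapping people stay separate instances even though the segmentation network itself only knows "person" versus "background".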
3-min video of applying model and manual correction of segmentation
We experimented with different resolutions: the higher the resolution we pass to the NN, the better the result it produces. We didn't care about the total inference time, because Supervisely supports inference that is distributed across multiple machines. For the task of automatic pre-annotation it is more than enough.
All inference results appear in the dashboard in real time. Our operators preview all results and label images with one of a few tags: bad prediction, prediction to correct, good prediction. This process is fast because they need only a few keyboard shortcuts for "next image" and "assign tag to image".
How we tag images: left is a bad prediction, middle is a prediction that needs light manual correction, right is a good prediction.
Images tagged as "bad prediction" are skipped. Further work continues with the images we need to correct.
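In code, this triage step amounts to routing images by tag. The sketch below is only an illustration: the function `route_by_tag` is hypothetical, and treating "good prediction" images as accepted without further edits is our assumption about what happens to them.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple

BAD = "bad prediction"
TO_CORRECT = "prediction to correct"
GOOD = "good prediction"


def route_by_tag(tagged: Iterable[Tuple[str, str]]) -> Tuple[List[str], List[str]]:
    """Split (image name, tag) pairs into (to_correct, accepted) lists."""
    queues = defaultdict(list)
    for name, tag in tagged:
        if tag == BAD:
            continue  # bad predictions are skipped entirely
        if tag not in (TO_CORRECT, GOOD):
            raise ValueError(f"unknown tag: {tag!r}")
        queues[tag].append(name)
    # Images to correct go to manual editing; good ones are assumed accepted as-is.
    return queues[TO_CORRECT], queues[GOOD]
```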
How to correct Neural Network predictions
Manual correction requires significantly less time than annotation from scratch.
That's all.
As you can see, this approach is applicable to many computer vision tasks, even if you need to annotate several object classes in images.
This dataset helps us improve our AI-powered annotation tool by customizing it to segment humans. In our latest release we added the ability to train the NN for this tool inside the system. Here is a comparison of the class-agnostic tool and its customized version. It is available, and you can try it on your data.
Sign up for Supervisely and go to the "Import" tab -> "Datasets library". Click the "Supervisely Person" dataset and enter a name for the new project. Then click the "three dots" button -> "Download as json" -> "Start" button. That's all. The download may take about 15 minutes (~7 GB).
How to download
It was very interesting to watch people without any ML background go through all these steps. We as Deep Learning specialists saved a lot of time, and our annotation team became more productive in terms of annotation speed and quality.
We hope that the Supervisely platform will help every deep learning team make AI products faster and more easily.
Let me list the most valuable Supervisely features we used in this work:
Feel free to ask any questions! Thank you!
If you found this article interesting, then let's help others too. More people will see it if you give it some 👏.