The Carvana Image Masking Challenge hosted on Kaggle has attracted a lot of attention from the Deep Learning community. Currently, the contest has more than 600 teams registered. The task is to build a model that segments the car out of the scene background.

Original and target images

Conceptually, the task seems to be well defined and simple, especially in comparison with, say, full recognition of a road scene for safe self-driving. Indeed, 396 teams have achieved a score above 0.99. From here on, the fight will be over the third or fourth decimal place of the final score.

In general, the Kaggle community is extremely creative, and very non-trivial solutions are born as a result of tough competition. For instance, take a look at the winner solution of the Taxi Prediction Challenge.

However, when it comes to semantic segmentation problems, non-trivial approaches are difficult to utilize. Ensembles of U-Net-like architectures trained at different resolutions will prevail in most top-scoring solutions. In situations where very similar approaches compete with each other, chance plays a huge role.

And the following question arises: “Is there any other way to get a competitive advantage?” We think that the answer is yes, especially if we look at the task from a different perspective: attack the data rather than the model.

In this post we will describe the way we managed to generate Synthetic Carvana Images (plus Ground Truth) that are very similar to the real ones, that is, the training data provided by the challenge organizers. What’s even more important is that our synthetic training set is freely available and everyone may use it to obtain a higher score in the challenge.

Check out those two cars above. One of them is real, and one is synthetically generated using GTA V. Which is which?

Let’s play GTA V for science!

Modern computer games have an interesting connection with Deep Learning. They are engaging and, more importantly, look realistic. Take GTA V, for instance.
Rockstar North has, over the years, put a huge amount of effort into making the gameplay as close to reality as possible. So, potentially, one may consider the game an infinite training set with all possible and impossible road scene configurations.

Here we narrow the general approach mentioned above. We use GTA V to obtain car images and segmentation masks under different camera views. The idea is not new; see, for instance, the Playing for Data dataset from 2016.

Unfortunately, there is no straightforward way to do it (no such API is available for GTA V). But conceptually it’s possible, so a bit of reverse engineering can help us. We will not focus much on the reverse engineering procedure itself (if someone is interested, let us know in the comments); rather, we will describe the process in general (and show a lot of cool pictures).

Some intermediate magic in action

After we’ve successfully injected our DLL into the GTA process, we programmatically place every vehicle available in GTA into the garage. Well, not every one: after some filtering we kept only 154 models that make sense for the Carvana challenge, because an airship does not. Then we rotate the model in 10° steps under several different camera angles. Finally, we change the car color: we chose black and white.

Okay, now we can take nice screenshots like the one above, but there is no ground truth available. That’s bad. Luckily, we can hook into DirectX API calls and manipulate the objects in the scene. After a few broken keyboards we found a way to highlight the car:

As you can see, there are no windows. That’s because windows are totally separate objects in GTA V. So, we also highlight only the windows:

Now that’s something! We actually got both a ground truth mask and a car image. But we also need to extract the model, place it on a Carvana scene and make the final result as close to reality as possible. Because of that, we also want to extract the car shadow from GTA:

As you can see, we’ve failed to make the floor exactly white and plain.
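Stepping back to the capture procedure described above, here is a minimal Python sketch of its two data-side ideas. It is not our DLL code, only an illustration under stated assumptions. It enumerates the capture sweep (154 models, 10° rotation steps, two colors) and merges the two highlight passes (car body and windows) into one mask with a pixel-wise OR. The camera preset names, the function names and the tiny 0/1 example masks are hypothetical.

```python
from itertools import product

NUM_MODELS = 154             # vehicle models kept after filtering
ROTATION_STEP_DEG = 10       # each model is rotated in 10-degree steps
COLORS = ("black", "white")  # the two car colors we chose

# Hypothetical camera presets: "several" angles were used, names are made up.
CAMERAS = ("low", "eye_level", "high")


def capture_plan(num_models=NUM_MODELS, cameras=CAMERAS):
    """Yield one (model_index, yaw_degrees, camera, color) tuple per screenshot."""
    yaws = range(0, 360, ROTATION_STEP_DEG)
    yield from product(range(num_models), yaws, cameras, COLORS)


def combine_masks(body_mask, window_mask):
    """Pixel-wise OR of two binary masks given as equally sized lists of rows."""
    if len(body_mask) != len(window_mask):
        raise ValueError("masks must have the same number of rows")
    combined = []
    for body_row, window_row in zip(body_mask, window_mask):
        if len(body_row) != len(window_row):
            raise ValueError("mask rows must have the same length")
        combined.append([int(bool(b) or bool(w)) for b, w in zip(body_row, window_row)])
    return combined


plan = list(capture_plan())
print("screenshots per highlight pass:", len(plan))

# Toy example: the body pass has a hole where the windows are.
body = [[0, 1, 1, 0],
        [1, 0, 0, 1]]
windows = [[0, 0, 0, 0],
           [0, 1, 1, 0]]
full_mask = combine_masks(body, windows)
print(full_mask)
```

With three camera presets the sweep contains 154 × 36 × 3 × 2 screenshots; the real number depends on how many camera angles are actually used.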
But don’t worry: Photoshop is here to help us!

Photoshop for Deep Learning

What kind of people use Photoshop for machine learning? Well, we do. Actually, Photoshop has a lot to offer. But most people don’t know it’s possible to use good ol’ JavaScript to automate every action. That’s what we did.

We start with a screenshot from the game:

First, the easy one: we combine the car and windows ground truth to obtain the final mask:

Now we can cut out the car from the screenshot and place it on the empty stage we made before:

As you can see, the car is too dark. That’s because it was shot in a darker place. Luckily, Photoshop has Auto Tone and Auto Color:

Much better! But the car is floating in the air. That’s because there is no shadow. It is possible to generate shadows in Photoshop, but it’s hard because we need to keep the model rotation angle in mind. So we take the shadow directly from GTA. We load the screenshot with the white (kinda) floor and make some manipulations:

But there are still no windows! Let’s fix that by generating windows using some gradients:

And finally, we enlarge the car to fit the scene:

All those manipulations are done programmatically using Photoshop JS scripting and pre-recorded actions. If you think this is an interesting topic for a tutorial, please leave your opinion in the comments.

How to get Synthetic Carvana Dataset

We have made this dataset publicly available in our training data platform Supervise.ly. Check out this post on Medium if you want to know more about it. Follow these simple steps to get the data:

1. Signup

Create an account on Supervise.ly. It’s free and takes just a minute.

2. Choose dataset from library

Open Import → Datasets library and click on the “CarvanaGTA5” dataset. Enter a project name (for example, “Carvana”), then click Next and Upload. After the import task completes you will see your new dataset on the Projects page.

Datasets library

You can check out the images in the Annotation tool by clicking on the dataset, or look at the statistics.

Annotation tool

3. Export data

Now you can download the dataset to your computer by using the Export tool. Export is a powerful feature of Supervise.ly that uses JSON configurations to perform filtering, resizing, augmentation, train-validation splitting and combining multiple datasets into one, and then saves your results in the formats of popular frameworks, ready for training.

Go to the Export page and paste the following config in the editor:

[
  {
    "action": "data",
    "src": ["<Your project name>/*"],
    "dst": "$sample",
    "settings": {"classes_mapping": "default"}
  },
  {
    "action": "tag",
    "src": ["$sample"],
    "dst": "$sample2",
    "settings": {"tag": "train"}
  },
  {
    "action": "background",
    "src": ["$sample2"],
    "dst": "$sample3",
    "settings": {"class": "bg"}
  },
  {
    "action": "segmentation",
    "src": ["$sample3"],
    "dst": "Carvana",
    "settings": {
      "gt_machine_color": {"car": [255, 255, 255], "bg": [0, 0, 0]},
      "tag2part": {"train": "train"},
      "txt_generation": {"prefix": "."}
    }
  }
]

Here we define an array of sequential transformations of the data: we tag every image as “train”, pass it to the background layer to generate the bg class and finally use the segmentation layer to make ground truth images. You can read more about Export in the documentation.

Now click the Start Exporting button and enter some name (optional).

Supervise.ly will prepare your archive, and after some time a Download button will appear in Tasks:

Done! If you have some time, check out our tutorials on Supervise.ly, like Number plate detection: it has a lot to offer.

Instead of a conclusion, a couple of general words

We live in an era of unprecedented democratization of Deep Learning technologies: the academic community and business openly publish research and frameworks for building neural networks. However, when it comes to training data the situation is very different. In terms of data availability, industry giants (Google, Facebook, Amazon) have a huge advantage over other companies.
The following chart from Andrew Ng is very illustrative:

Or in words: the quality of intellectual products based on Deep Learning is determined by the amount of available training data. Increasing the availability of training data is the main priority of our company. We approach the problem from two sides:

1. Development of tools for manual annotation. Recently we opened free access to the test version of Supervise.ly.

2. Research in the generation of synthetic and semi-synthetic scenes, and experiments with tools that automate the annotation process.

The Synthetic Carvana dataset we released today is part of our research. Please let us know whether our synthetic dataset helps you achieve higher scores.
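Returning briefly to the export configuration from step 3: it is a chain in which each layer reads the variable written by the previous one. The Python sketch below holds the same config as plain data and checks that every "$variable" source was produced by an earlier layer. The `check_chain` helper is hypothetical and is not part of Supervise.ly; the project name "Carvana" simply follows the example name suggested in step 2.

```python
import json

# The export config from step 3, with "Carvana" as the assumed project name.
EXPORT_CONFIG = [
    {"action": "data", "src": ["Carvana/*"], "dst": "$sample",
     "settings": {"classes_mapping": "default"}},
    {"action": "tag", "src": ["$sample"], "dst": "$sample2",
     "settings": {"tag": "train"}},
    {"action": "background", "src": ["$sample2"], "dst": "$sample3",
     "settings": {"class": "bg"}},
    {"action": "segmentation", "src": ["$sample3"], "dst": "Carvana",
     "settings": {
         "gt_machine_color": {"car": [255, 255, 255], "bg": [0, 0, 0]},
         "tag2part": {"train": "train"},
         "txt_generation": {"prefix": "."},
     }},
]


def check_chain(layers):
    """Hypothetical sanity check, not a Supervise.ly API.

    Returns the action names in order if every "$variable" source was written
    by an earlier layer, and raises ValueError otherwise.
    """
    produced = set()
    for index, layer in enumerate(layers):
        for src in layer["src"]:
            if src.startswith("$") and src not in produced:
                raise ValueError(
                    f"layer {index} ({layer['action']}) reads undefined {src}"
                )
        produced.add(layer["dst"])
    return [layer["action"] for layer in layers]


actions = check_chain(EXPORT_CONFIG)
print(" -> ".join(actions))

# Text that can be pasted into the Export editor.
config_text = json.dumps(EXPORT_CONFIG, indent=2)
print(config_text)
```

Keeping the config as data makes it easy to change the project name or the mask colors before pasting the JSON into the editor.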

Hacking GTA V for Carvana Kaggle Challenge
