Marine plastic pollution has been at the forefront of environmental issues for the past decade. Not only is plastic in the ocean capable of killing marine life through strangulation or starvation, but it has also been implicated in warming the ocean by trapping CO2. In recent years, there have been numerous attempts at cleaning up the plastic circling our oceans, such as those by the non-profit The Ocean Cleanup. The problem with many of these cleanup processes is that they require human labor and are not cost-efficient. A lot of research has gone into automating this process by using Computer Vision and Deep Learning to detect marine debris, so that ROVs and AUVs can be used for cleanup.
Note: This work is theoretical, and we are working on publishing our paper. Once it has been peer-reviewed and published in a journal, I will update this article. So, watch this space for the release of our academic paper!
Reading Resources: Our paper on using CV and DL to detect epipelagic plastic, the University of Minnesota's paper on a general-purpose marine debris detection algorithm, and The Ocean Cleanup's paper on floating plastic detection in rivers.
The major issue with this approach is the availability of datasets for training the computer vision models. The JAMSTEC-JEDI dataset is a good collection of marine debris on the seafloor off the coast of Japan. Beyond this dataset, however, suitable data is scarce. For this reason, I've turned to Generative Adversarial Networks, DCGAN in particular, to curate synthetic datasets that could theoretically become a close replica of real datasets over time. (I write "theoretically" because there are many variables involved, such as the quality of the real images available, progress in GAN research, training, etc.)
GANs, or Generative Adversarial Networks, were proposed by Ian Goodfellow et al. in 2014. A GAN consists of two components: a Generator and a Discriminator. An oversimplification of the process is as follows: the Generator's role is to generate new data, and the Discriminator's role is to distinguish the generated data from the actual data. In the ideal scenario, the Discriminator is unable to distinguish the generated data from the real data, which yields ideal synthetic data points.
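For readers who want the formal version, Goodfellow et al. express this adversarial game as a minimax objective over a value function V(D, G):

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]$$

Here D(x) is the Discriminator's estimate of the probability that a sample x is real, and G(z) is a sample generated from noise z. D tries to maximize V while G tries to minimize it, which is exactly the tug-of-war described above.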
DCGAN is a direct extension of the GAN architecture described above, except that it uses deep convolutional layers in the discriminator and generator. It was first described by Radford et al. in the paper Unsupervised Representation Learning With Deep Convolutional Generative Adversarial Networks. The discriminator is made up of strided convolutional layers, while the generator is made up of transposed convolutional layers.
IMG - GAN architecture
In this method, we are going to apply the DCGAN architecture to the DeepTrash dataset. If you're not familiar with the DeepTrash dataset, consider reading my paper A Robotic Approach towards Quantifying Epipelagic Bound Plastic Using Deep Visual Models. DeepTrash is a collection of plastic images from the epipelagic and abyssopelagic layers of the ocean, curated for marine plastic detection using computer vision.
Let's begin coding!
We start by installing the basic requirements for building our GAN model, such as Matplotlib and NumPy. We will also be using tools such as the neural network modules and transformations from PyTorch. If you're not familiar with PyTorch, I recommend reading an introductory article first.
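As a rough sketch, the setup could look like the following (the exact imports in our code may differ slightly, and the seed value is an arbitrary choice for reproducibility):

```python
import random

import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision.datasets as dset
import torchvision.transforms as transforms
import torchvision.utils as vutils

# Fix the random seeds so runs are reproducible
manual_seed = 42
random.seed(manual_seed)
torch.manual_seed(manual_seed)

# Train on GPU if one is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
```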
This step is fairly straightforward. We're going to set up the hyperparameters used to train the neural network. These hyperparameters are borrowed straight from the DCGAN paper and PyTorch's tutorial on training GANs.
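A sketch of those hyperparameters is below, following the values recommended by Radford et al. and the PyTorch DCGAN tutorial. The dataroot path and the epoch count are illustrative placeholders, and the dataloader assumes the images sit on disk in an ImageFolder layout:

```python
dataroot = "data/deeptrash"  # hypothetical path to a local copy of the images
workers = 2        # dataloader worker processes
batch_size = 128   # as in the DCGAN paper
image_size = 64    # images are resized/cropped to 64x64
nc = 3             # number of color channels
nz = 100           # length of the latent vector z
ngf = 64           # base feature-map depth in the generator
ndf = 64           # base feature-map depth in the discriminator
num_epochs = 100   # illustrative; tune to your dataset and compute budget
lr = 0.0002        # learning rate recommended by Radford et al.
beta1 = 0.5        # beta1 for the Adam optimizers

# Dataset and dataloader (ImageFolder expects images in class subfolders)
dataset = dset.ImageFolder(
    root=dataroot,
    transform=transforms.Compose([
        transforms.Resize(image_size),
        transforms.CenterCrop(image_size),
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
    ]),
)
dataloader = torch.utils.data.DataLoader(
    dataset, batch_size=batch_size, shuffle=True, num_workers=workers
)
```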
Now, we define the architectures for the generator and discriminator.
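The following is a minimal sketch of both networks in the standard DCGAN form (transposed convolutions in the generator, strided convolutions in the discriminator), along with the weight initialization the DCGAN paper prescribes; our actual implementation may differ in detail:

```python
# DCGAN weight initialization: conv and batch-norm weights drawn from a normal
def weights_init(m):
    classname = m.__class__.__name__
    if classname.find("Conv") != -1:
        nn.init.normal_(m.weight.data, 0.0, 0.02)
    elif classname.find("BatchNorm") != -1:
        nn.init.normal_(m.weight.data, 1.0, 0.02)
        nn.init.constant_(m.bias.data, 0)

class Generator(nn.Module):
    """Maps a latent vector z to a 3x64x64 image via transposed convolutions."""
    def __init__(self):
        super().__init__()
        self.main = nn.Sequential(
            # input: z of shape (nz, 1, 1)
            nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # (ngf*8) x 4 x 4
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # (ngf*4) x 8 x 8
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # (ngf*2) x 16 x 16
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            # (ngf) x 32 x 32
            nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),
            nn.Tanh(),  # output: (nc) x 64 x 64, values in [-1, 1]
        )

    def forward(self, x):
        return self.main(x)

class Discriminator(nn.Module):
    """Classifies a 3x64x64 image as real or fake via strided convolutions."""
    def __init__(self):
        super().__init__()
        self.main = nn.Sequential(
            # input: (nc) x 64 x 64
            nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 8),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),
            nn.Sigmoid(),  # output: probability that the input is real
        )

    def forward(self, x):
        return self.main(x)
```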
After defining the Generator and Discriminator classes, we move on to defining the training function. The training function takes the generator, the discriminator, the optimizers, the dataloader, and the number of epochs as arguments. For each epoch, we iterate through the dataloader, updating the discriminator on both real images and fake images from the generator, then updating the generator, computing and backpropagating the loss at each step.
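A sketch of that training function, following the standard DCGAN loop (binary cross-entropy loss, discriminator updated before the generator on each batch), might look like this; the snapshotting of a fixed noise batch is there to support the animation shown later:

```python
def train(netG, netD, optimizerG, optimizerD, dataloader, num_epochs):
    """Run the DCGAN training loop; return per-iteration losses and sample grids."""
    criterion = nn.BCELoss()
    fixed_noise = torch.randn(64, nz, 1, 1, device=device)  # for progress snapshots
    real_label, fake_label = 1.0, 0.0
    G_losses, D_losses, img_list = [], [], []

    for epoch in range(num_epochs):
        for real, _ in dataloader:
            # (1) Update D: maximize log(D(x)) + log(1 - D(G(z)))
            netD.zero_grad()
            real = real.to(device)
            b_size = real.size(0)
            label = torch.full((b_size,), real_label, device=device)
            errD_real = criterion(netD(real).view(-1), label)
            errD_real.backward()

            noise = torch.randn(b_size, nz, 1, 1, device=device)
            fake = netG(noise)
            label.fill_(fake_label)
            errD_fake = criterion(netD(fake.detach()).view(-1), label)
            errD_fake.backward()
            optimizerD.step()

            # (2) Update G: maximize log(D(G(z)))
            netG.zero_grad()
            label.fill_(real_label)  # fakes should fool D into predicting "real"
            errG = criterion(netD(fake).view(-1), label)
            errG.backward()
            optimizerG.step()

            G_losses.append(errG.item())
            D_losses.append((errD_real + errD_fake).item())

        # Snapshot the generator's output on a fixed latent batch each epoch
        with torch.no_grad():
            fake = netG(fixed_noise).detach().cpu()
        img_list.append(vutils.make_grid(fake, padding=2, normalize=True))

    return G_losses, D_losses, img_list
```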
After we've established the Generator, Discriminator, and train functions, the final step is simply to call the train function for the number of epochs we defined. I've also used Weights & Biases (wandb), which allows us to monitor our training.
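A sketch of that final step is below. The wandb project name is hypothetical, and in a real run you would likely call wandb.log from inside the training loop rather than after it:

```python
import wandb

wandb.init(project="marine-plastic-dcgan")  # hypothetical project name

netG = Generator().to(device)
netD = Discriminator().to(device)
netG.apply(weights_init)
netD.apply(weights_init)

optimizerG = optim.Adam(netG.parameters(), lr=lr, betas=(beta1, 0.999))
optimizerD = optim.Adam(netD.parameters(), lr=lr, betas=(beta1, 0.999))

G_losses, D_losses, img_list = train(
    netG, netD, optimizerG, optimizerD, dataloader, num_epochs
)

# Push the recorded losses to the wandb dashboard for monitoring
for g, d in zip(G_losses, D_losses):
    wandb.log({"loss_G": g, "loss_D": d})
```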
We plot the losses incurred by the generator and discriminator during training.
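With the per-iteration losses returned by the train function above, a minimal Matplotlib plot could look like this:

```python
plt.figure(figsize=(10, 5))
plt.title("Generator and Discriminator Loss During Training")
plt.plot(G_losses, label="G")
plt.plot(D_losses, label="D")
plt.xlabel("iterations")
plt.ylabel("loss")
plt.legend()
plt.show()
```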
IMG - Generator and Discriminator Loss during Training
We can also pull up an animation of the images the generator produced over the course of training, to see the difference between the real and the fake images.
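Assuming the img_list of sample grids collected during training above, one way to build that animation with Matplotlib is:

```python
import matplotlib.animation as animation

# Animate the fixed-noise snapshots the train function collected each epoch
fig = plt.figure(figsize=(8, 8))
plt.axis("off")
ims = [[plt.imshow(np.transpose(img, (1, 2, 0)), animated=True)] for img in img_list]
ani = animation.ArtistAnimation(fig, ims, interval=1000, repeat_delay=1000, blit=True)
plt.show()
```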
Which looks something like this:
And the final resulting image after training (the same image as above).
In this article, we discussed the use of Deep Convolutional Generative Adversarial Networks to generate synthetic images of marine plastic, giving researchers the ability to extend their datasets with a mix of real and synthetic images. As you can see from the results, the GAN still needs a lot of work. The ocean is a complex environment with varying illumination, turbidity, blur, etc. But this is a theoretical starting point that other researchers can build on. If you're interested in improving the results and building a better network architecture, please reach out to me at [email protected].