BlobGAN allows for unreal manipulation of images, made super easy by controlling simple blobs. Each of these small blobs represents an object, and you can move them around, make them bigger or smaller, or even remove them, and it will have the same effect on the object it represents in the image. This is so cool!
As the authors shared in their results, you can even create novel images by duplicating blobs, creating images unseen in the dataset! Correct me if I'm wrong, but I believe it is one of, if not the first, paper to make the modification of images as simple as moving blobs around, allowing for edits that were unseen in the training dataset.
And you can actually play with this one, compared to some companies we all know! They shared their code publicly, along with a Colab demo you can try right away. Even more exciting is how BlobGAN works. Learn more in the video!
►Read the full article: https://www.louisbouchard.ai/blobgan/
►Epstein, D., Park, T., Zhang, R., Shechtman, E. and Efros, A.A., 2022. BlobGAN: Spatially Disentangled Scene Representations. arXiv preprint arXiv:2205.02837.
►Project link: https://dave.ml/blobgan/
►Code: https://github.com/dave-epstein/blobgan
►Colab Demo: https://colab.research.google.com/drive/1clvh28Yds5CvKsYYENGLS3iIIrlZK4xO?usp=sharing#scrollTo=0QuVIyVplOKu
►My Newsletter (A new AI application explained weekly to your emails!): https://www.louisbouchard.ai/newsletter/
If you thought that the progress with GANs was over, you couldn't be more wrong. Here's BlobGAN, and this new paper is just incredible. BlobGAN allows for unreal manipulation of images, made super easy by controlling simple blobs. All these small blobs represent an object, and you can move them around, make them bigger or smaller, or even remove them, and it will have the same effect on the object it represents in the image. This is so cool! As the authors shared in their results, you can even create novel images by duplicating blobs, creating images unseen in the dataset, like this room with two ceiling fans. Correct me if I'm wrong, but I believe it's one of, if not the first, paper to make the modification of images as simple as moving blobs around, allowing for edits that were unseen in the training dataset. And you can actually play with this one, compared to some other companies we all know: they shared their code publicly, along with a Colab demo you can try right away. Even more exciting is how BlobGAN works, which we'll dive into in a few seconds.
To publish an excellent paper like BlobGAN, the researchers needed to run many experiments on multiple machines. Those who have played with GANs know how long and painful this process can be. Plus, their code is available on GitHub and Google Colab, which means their code has to be reproducible. Funnily enough, this is also a really strong point of this episode's sponsor, Weights & Biases. Weights & Biases changed my life as a researcher. It tracks everything you need for your code to be reproducible: the hyperparameters, the GitHub commit, hardware usage metrics, and the Python version, leaving you without headaches. OK, some might still appear because of deadlines or bugs, but none from trying to reproduce experiments. Weights & Biases is also super helpful when sharing your experiment results with your colleagues. A great tool for that is Reports: they can act as dashboards for supervisors, PIs, or managers to check how experimentation is going, meaning more time for research while improving your feedback's quality. Please don't be like most researchers who keep their code a secret, and try using Weights & Biases with the first link below!
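If you are wondering what that tracking looks like in practice, here's a minimal sketch using the standard wandb Python API. The project name, config values, and the fake loss are placeholders I made up for illustration, not from the paper:

```python
import random
import wandb

# init() records the hyperparameters you pass in, and automatically logs
# the git commit, hardware usage, and Python version for this run.
run = wandb.init(
    project="blobgan-experiments",  # hypothetical project name
    config={"learning_rate": 2e-3, "batch_size": 32, "num_blobs": 10},
)

for step in range(100):
    loss = 1.0 / (step + 1) + random.random() * 0.01  # stand-in for a real training loss
    wandb.log({"loss": loss})  # each logged metric appears live in the dashboard

run.finish()
```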
Now let's get back to our paper: BlobGAN, Spatially Disentangled Scene Representations. The title says it all: BlobGAN uses blobs to disentangle objects in a scene, meaning that the model learns to associate each blob with a specific object in the scene, like a bed, a window, or a ceiling fan. Once trained, you can move the blobs and objects around individually, make them bigger or smaller, duplicate them, or even remove them from the picture. Of course, the results are not entirely realistic, but as a great person would say: just imagine the potential of this approach two more papers down the line!
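To make those edits concrete, here's a toy sketch of what manipulating blobs could look like, using a hypothetical parameterization (center, scale, per-blob style features); the paper's actual blob layout differs, but the operations are the same in spirit:

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

@dataclass
class Blob:
    x: float                  # horizontal center, in [0, 1]
    y: float                  # vertical center, in [0, 1]
    scale: float              # blob size
    style: Tuple[float, ...]  # per-blob appearance features

def move(blob: Blob, dx: float, dy: float) -> Blob:
    return replace(blob, x=blob.x + dx, y=blob.y + dy)

def resize(blob: Blob, factor: float) -> Blob:
    return replace(blob, scale=blob.scale * factor)

def duplicate(blobs: List[Blob], i: int) -> List[Blob]:
    # e.g. a second ceiling fan: copy blob i and shift the copy sideways
    return blobs + [move(blobs[i], dx=0.2, dy=0.0)]

def remove(blobs: List[Blob], i: int) -> List[Blob]:
    return blobs[:i] + blobs[i + 1:]
```

Every edit only touches the blob parameters; the generator then re-renders the whole image, which is why moving a blob moves the object it represents.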
What's even cooler is that this training occurs in an unsupervised scheme. This means that you do not need every single image example to train it, as you would in supervised learning. A quick example: supervised training would require you to have all the desired manipulations in your image dataset to teach the blobs those transformations, whereas in unsupervised learning you do not need this extensive data, and the model will learn to achieve the task by itself, associating blobs to objects on its own, without explicit labels.
We train the model with a generator and a discriminator, in a GAN fashion. I will simply do a quick overview, as I've covered GANs in numerous videos before. As always in GANs, the discriminator's responsibility is to train the generator to create realistic images. The most important part of the architecture is the generator, with our blobs and a StyleGAN2-like decoder. I also covered StyleGAN-based generators in other videos, if you are curious about how they work.
But in short, we first create our blobs. This is done by taking random noise, as in most generator networks, and mapping it into blobs using a first neural network that is learned during training. Then, you need to do the impossible: take this blob representation and create a real image out of it. This is where the GAN magic happens.
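Here's a minimal PyTorch sketch of that first network, mapping a noise vector to per-blob parameters. The layer sizes and the exact parameterization are my assumptions for illustration, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class BlobMapper(nn.Module):
    """Maps random noise to per-blob parameters (centers, scales, features)."""
    def __init__(self, noise_dim=512, num_blobs=10, feat_dim=256):
        super().__init__()
        self.num_blobs, self.feat_dim = num_blobs, feat_dim
        # per blob: 2 center coordinates + 1 scale + feat_dim appearance features
        out_dim = num_blobs * (3 + feat_dim)
        self.mlp = nn.Sequential(
            nn.Linear(noise_dim, 1024), nn.LeakyReLU(0.2),
            nn.Linear(1024, 1024), nn.LeakyReLU(0.2),
            nn.Linear(1024, out_dim),
        )

    def forward(self, z):
        params = self.mlp(z).view(-1, self.num_blobs, 3 + self.feat_dim)
        centers = torch.sigmoid(params[..., :2])  # keep centers inside the image
        scales = params[..., 2:3]
        features = params[..., 3:]
        return centers, scales, features

z = torch.randn(4, 512)  # random noise, as in most generator networks
centers, scales, feats = BlobMapper()(z)
# These blob parameters are what the decoder consumes instead of raw noise.
```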
Since you are still listening, please consider subscribing to the channel and liking the video. It means a lot and supports my work, for free! Also, we have a community called Learn AI Together on Discord, where you can learn and exchange with fellow AI enthusiasts. I'm convinced you'll love it there, and I will be glad to meet you!
We need a StyleGAN-like architecture to create our images from these blobs. Of course, we adapt the architecture to take the blobs we just created as inputs instead of the usual random noise. Then, we train our model using the discriminator to learn to generate realistic images. Once we have good enough results, it means our model can take a blob representation instead of noise and generate images.
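For reference, here's a condensed sketch of that adversarial training step, using the standard non-saturating GAN loss; `generator` (blob mapper plus decoder) and `discriminator` are hypothetical stand-ins, and the paper's actual objective has more moving parts:

```python
import torch
import torch.nn.functional as F

def train_step(generator, discriminator, g_opt, d_opt, real_images, noise_dim=512):
    z = torch.randn(real_images.size(0), noise_dim)

    # Discriminator step: push real logits up and fake logits down.
    fake = generator(z).detach()  # detach so this step only updates D
    d_loss = (F.softplus(-discriminator(real_images)).mean()
              + F.softplus(discriminator(fake)).mean())
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator step: produce images the discriminator scores as real.
    g_loss = F.softplus(-discriminator(generator(z))).mean()
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()
```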
But we still have a problem: how can we disentangle those blobs and make them match objects? Well, this is the beauty of the unsupervised approach: the model will iteratively improve and create realistic results while also learning how to represent these images in the form of a fixed number of blobs. You can see here how blobs are often used to represent the same objects, or very similar objects, in the scene. Here, you can also see how the same blobs are used to represent either a window or a painting, which makes a lot of sense. Likewise, you can see that light is almost always represented in the fourth blob. Similarly, you can see how blobs often represent the same regions in the scene, which is most certainly due to the similarities between the images in the dataset used for this experiment.
And voilà! This is how BlobGAN learns to manipulate scenes using a very intuitive blob representation. I'm excited to see the realism of the results improve while keeping a similar approach. Using such a technique, we could design simple interactive apps to allow designers, and anyone, to manipulate images easily, which is quite exciting. Of course, this was just an overview of this new paper, and I strongly recommend reading it for a better understanding and a lot more detail on their approach, implementation, and the tests they did. As I said earlier in the video, they also shared their code publicly, along with a Colab demo you can try right away. All the links are in the description below. Thank you for watching until the end, and I will see you next week with another amazing paper!