Do you also have old pictures of yourself or loved ones that didn’t age well, or that you, or your parents, took before we could produce high-quality images? I do, and I felt like those memories were damaged forever. Boy, was I wrong!
This new and completely free AI model can fix most of your old pictures in a split second. It works well even with very low- or high-quality inputs, which is typically quite the challenge.
This week’s paper, called Towards Real-World Blind Face Restoration with Generative Facial Prior, tackles the photo restoration task with outstanding results. What’s even cooler is that you can try it yourself, in your preferred way: they have open-sourced their code and created a demo and online applications for you to try right now. If the results you’ve seen above aren’t convincing enough, just watch the video and let me know what you think in the comments. I know it will blow your mind!
►Read the full article: https://www.louisbouchard.ai/gfp-gan/
►Wang, X., Li, Y., Zhang, H. and Shan, Y., 2021. Towards real-world blind face restoration with generative facial prior. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 9168-9178). https://arxiv.org/pdf/2101.04061.pdf
►Code: https://github.com/TencentARC/GFPGAN
►Use it: https://app.baseten.co/applications/Q04Lz0d/operator_views/8qZG6Bg
►My Newsletter (A new AI application explained weekly to your emails!): https://www.louisbouchard.ai/newsletter/
Do you also have old pictures of yourself or loved ones that didn't age well, or that you or your parents took before we could produce high-quality images? I do, and I felt like those memories were damaged forever. Boy, was I wrong! This new and completely free AI model can fix most of your old pictures in a split second. It works well even with very low- or high-quality inputs, which is typically quite the challenge.

This week's paper, called Towards Real-World Blind Face Restoration with Generative Facial Prior, tackles the photo restoration task with outstanding results. What's even cooler is that you can try it yourself, in your preferred way: they have open-sourced their code and created a demo and online applications for you to try right now. If the results you've been seeing aren't convincing enough, just wait until the end of the video, not only to support my work, which I'd be grateful for, but also because I believe it's important to understand how the model works and its limitations, which is really interesting as it gives us insights into what they are working to improve. And the following results will blow your mind! But first, allow me to come back to last week's video, since I wanted to share something about this episode's sponsor, Weights & Biases.
In last week's video, which you should definitely watch if you haven't yet, I shared the top 5 articles of the month, and a great one covered tips from a Weights & Biases user. Have you ever left a long-running ML experiment to train and had a very sad moment when you saw that the training had crashed? Well, then you'll love this new feature. Along with tracking your experiments' metrics, Weights & Biases can now also proactively notify you when things go wrong, with Alerts. You can be notified via Slack or email if your training has crashed, or when a custom trigger fires, such as your loss going to NaN or a step in your ML pipeline being reached. It's really easy to set up, and you can get started with Weights & Biases Alerts in two quick steps: turn on alerts in your Weights & Biases user settings, then create your custom triggers and add wandb.alert to your code wherever you'd like to be alerted. And that's it! I've heard that many users have already saved large cloud GPU bills by being alerted early to crashed runs while training large, expensive-to-train models, which is pretty cool. Try out Weights & Biases Alerts now using the first link in the description below!
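Here is a minimal sketch of what such an alert can look like in code, assuming you already log your runs with wandb; the project name and the toy loss values are placeholders for your own training loop.

```python
import math
import wandb

# Minimal sketch of Weights & Biases Alerts (project name and loss values are placeholders).
run = wandb.init(project="gfpgan-demo")

fake_losses = [0.9, 0.5, float("nan")]  # stand-in for a real training loop
for step, loss in enumerate(fake_losses):
    wandb.log({"loss": loss}, step=step)

    # Custom trigger: notify me via Slack or email if the loss becomes NaN.
    if math.isnan(loss):
        wandb.alert(
            title="Loss went to NaN",
            text=f"Loss is NaN at step {step}; the run has probably diverged.",
            level=wandb.AlertLevel.WARN,
        )
        break
```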
I mentioned that the model works well on low-quality images. Just look at the results and the level of detail compared to the other approaches. These results are just incredible! Note that they do not represent the actual image: it's important to understand that these results are just guesses from the model, guesses that seem pretty damn close to our eyes. It looks like the same image representing the same person, and we couldn't tell that the model created more pixels without knowing anything else about the person. So the model tries its best to understand what's in the picture and fill in the gaps, or add pixels if the image is of low resolution.

But how does it work? How can an AI model understand what is in the picture and, even more impressive, understand what isn't in the picture, such as what was in the place of a scratch? Well, as you will see, GANs aren't dead yet! Indeed, the researchers didn't create anything new. They simply maximized GANs' performance by helping the network as much as possible. And what could be better to help a GAN architecture than using another GAN?
Their model is called GFP-GAN for a reason: GFP stands for Generative Facial Prior, and I already covered what GANs are in multiple videos if it sounds like another language to you. For example, a model I covered last year for image upsampling, called PULSE, uses pre-trained GANs like StyleGAN2 from NVIDIA and optimizes the encodings, called latent codes, to improve the reconstruction quality. Again, if this doesn't ring any bell, please take a few minutes to watch the video I made covering the PULSE model.
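To give an intuition of what "optimizing the latent code" means in a PULSE-style approach, here is a toy sketch: we search for a latent vector whose generated image, once downscaled, matches the low-resolution input. The generator below is a random stand-in, not an actual pre-trained StyleGAN2, and the sizes are arbitrary.

```python
import torch
import torch.nn.functional as F

# Toy sketch of PULSE-style latent optimization: find a latent code whose
# generated image, once downscaled, matches the low-resolution input.
# `generator` is a random stand-in for a pre-trained GAN such as StyleGAN2.
generator = torch.nn.Sequential(
    torch.nn.Linear(512, 3 * 64 * 64),
    torch.nn.Unflatten(1, (3, 64, 64)),
)
for p in generator.parameters():
    p.requires_grad_(False)  # the pre-trained generator stays frozen

low_res = torch.rand(1, 3, 16, 16)               # the degraded input we want to upsample
latent = torch.randn(1, 512, requires_grad=True)  # the latent code we optimize
optimizer = torch.optim.Adam([latent], lr=0.05)

for step in range(200):
    optimizer.zero_grad()
    high_res = generator(latent)                  # candidate high-resolution image
    downscaled = F.interpolate(high_res, size=(16, 16), mode="bilinear", align_corners=False)
    loss = F.mse_loss(downscaled, low_res)        # does it explain the low-res photo?
    loss.backward()
    optimizer.step()

print(f"final reconstruction loss: {loss.item():.4f}")
```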
However, as they state in the paper, these methods (referring to PULSE) usually produce images with low fidelity, as the low-resolution latent codes are insufficient to guide the restoration. In contrast, GFP-GAN does not simply take a pre-trained StyleGAN and retrain it to orient the encoded information for its task, as PULSE does. Instead, GFP-GAN uses a pre-trained StyleGAN2 model to orient its own generative model at multiple scales during the encoding of the image, down to the latent code and back up to the reconstruction. You can see it here, where we merge the information from our current model with the pre-trained GAN prior using their channel-split SFT method. You can find more information about how exactly they merge the information from the two models in the paper linked below.
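To make this merging a bit more concrete, here is a minimal, illustrative sketch of what a channel-split SFT (spatial feature transform) layer can look like: part of the features pass through untouched while the rest are modulated by scale and shift maps predicted from the prior features. The layer sizes and names are assumptions for illustration, not the authors' exact implementation; see the GFPGAN repository for the real code.

```python
import torch
import torch.nn as nn

class ChannelSplitSFT(nn.Module):
    """Sketch of a channel-split spatial feature transform (SFT) layer.

    Half of the restoration model's channels pass through unchanged to help
    preserve fidelity; the other half is modulated by scale/shift maps
    predicted from the pre-trained StyleGAN2 prior features, injecting
    realistic facial detail. Sizes are illustrative.
    """

    def __init__(self, channels: int, prior_channels: int):
        super().__init__()
        self.split = channels // 2
        # Small conv heads predicting per-pixel scale and shift from the GAN-prior features.
        self.to_scale = nn.Conv2d(prior_channels, self.split, kernel_size=3, padding=1)
        self.to_shift = nn.Conv2d(prior_channels, self.split, kernel_size=3, padding=1)

    def forward(self, feat: torch.Tensor, prior_feat: torch.Tensor) -> torch.Tensor:
        identity = feat[:, : self.split]              # untouched half
        modulated = feat[:, self.split :]             # half to be modulated
        scale = self.to_scale(prior_feat)
        shift = self.to_shift(prior_feat)
        modulated = modulated * (1 + scale) + shift   # spatial feature transform
        return torch.cat([identity, modulated], dim=1)

# Tiny smoke test with random tensors (shapes are arbitrary).
feat = torch.randn(1, 64, 32, 32)    # decoder features of the restoration model
prior = torch.randn(1, 128, 32, 32)  # features from the pre-trained StyleGAN2 prior
out = ChannelSplitSFT(64, 128)(feat, prior)
print(out.shape)  # torch.Size([1, 64, 32, 32])
```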
The pre-trained StyleGAN2 is our prior knowledge in this case, as it already knows how to process the image, but for a different task. This means it will help their image restoration model better match the features at each step, using this prior information from a powerful pre-trained StyleGAN2 model known to create meaningful encodings and generate accurate pictures. This helps the model achieve realistic results while preserving high fidelity.

So instead of simply orienting the training based on the difference between the generated (fake) image and the expected (real) image, using the discriminator model from the GAN network, we also have two metrics for preserving identity and facial components. These two added metrics, called losses, help enhance facial details and, as the name says, ensure that we keep the person's identity, or at least do our best to do so. The facial component loss is basically the same thing as the discriminator's adversarial loss we find in classic GANs, but it focuses on important local features of the resulting image, like the eyes and mouth. The identity-preserving loss uses a pre-trained face recognition model to capture the most important facial features and compare them to the real image, to see if we still have the same person in the generated image.
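As an illustration, an identity-preserving loss of this kind can be sketched as comparing face-recognition embeddings of the restored and ground-truth images. The embedder below is a toy placeholder for a frozen, pre-trained face recognition network, and the cosine-distance formulation and weighting are assumptions, not necessarily the paper's exact loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IdentityPreservingLoss(nn.Module):
    """Sketch of an identity-preserving loss.

    `face_embedder` stands in for a frozen, pre-trained face recognition
    network mapping a face image to an identity embedding. The loss pushes
    the restored face toward the same identity embedding as the ground truth.
    """

    def __init__(self, face_embedder: nn.Module):
        super().__init__()
        self.face_embedder = face_embedder
        for p in self.face_embedder.parameters():
            p.requires_grad_(False)  # the recognition model stays frozen

    def forward(self, restored: torch.Tensor, ground_truth: torch.Tensor) -> torch.Tensor:
        emb_restored = F.normalize(self.face_embedder(restored), dim=1)
        emb_real = F.normalize(self.face_embedder(ground_truth), dim=1)
        # 1 - cosine similarity: 0 when the identities match perfectly.
        return (1 - (emb_restored * emb_real).sum(dim=1)).mean()

# Smoke test with a toy "embedder" (a real setup would use a pre-trained network).
toy_embedder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128))
loss_fn = IdentityPreservingLoss(toy_embedder)
restored = torch.randn(2, 3, 64, 64)
ground_truth = torch.randn(2, 3, 64, 64)
print(loss_fn(restored, ground_truth).item())
```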
And voilà! We get these fantastic image reconstruction results using all this information from the different losses.

The results shown in this video were all produced using the most recent version of their model, version 1.3. You can see that they openly share the weaknesses of their approach, which is quite cool. And here I just wanted to come back to something I mentioned before, which is the second weakness: a slight change in identity. Indeed, this will happen, and there's nothing we can do about it. We can limit this shift, but we can't be sure the reconstructed picture will be identical to the original one. It is simply impossible: reconstructing the same person from a low-definition image would mean that we know exactly what the person looked like at that time, which we don't. We base ourselves on our knowledge of humans and how they typically look to make guesses on the blurry picture and create hundreds of new pixels. The resulting image will look just like our grandfather if we are lucky enough, but it may as well look like a complete stranger, and you need to keep that in mind when you use these kinds of models. Still, the results are fantastic and remarkably close to reality. I strongly invite you to play with it and form your own idea of the model and its results.
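If you'd rather try it locally from the open-source repository than through the online demo, the gist of the Python API looks roughly like the sketch below. The model path, file names, and argument values are placeholders from memory; check the GFPGAN README for the exact, up-to-date usage and pre-trained weights.

```python
import cv2
from gfpgan import GFPGANer  # pip install gfpgan

# Rough sketch of restoring one photo with the open-source GFPGAN package.
# Paths and parameter values are placeholders; the repository README
# documents the exact model weights and options.
restorer = GFPGANer(
    model_path="experiments/pretrained_models/GFPGANv1.3.pth",  # assumed local weights path
    upscale=2,               # upsample the whole image by 2x
    arch="clean",
    channel_multiplier=2,
    bg_upsampler=None,       # optionally plug in a background upsampler such as Real-ESRGAN
)

img = cv2.imread("old_photo.jpg", cv2.IMREAD_COLOR)
cropped_faces, restored_faces, restored_img = restorer.enhance(
    img, has_aligned=False, only_center_face=False, paste_back=True
)
cv2.imwrite("old_photo_restored.jpg", restored_img)
```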
Let me know what you think, and I hope you enjoyed the video! Before you leave, if you are interested in AI ethics: we will be sending the next iteration of our newsletter in the following days, with Martina's view on the ethical considerations of such techniques. Stay tuned for that!