Date: 2021-11-20
I explain Artificial Intelligence terms and news to non-experts.
Have you ever had an image you really liked but could only find a small version of it, like the image above on the left? How cool would it be if you could take this image and make it look twice as good? That would be great, but what if you could make it even four or eight times more high definition? Now we’re talking, just look at that.
Here we enhanced the resolution of the image by a factor of four, meaning that we have four times more pixels in both height and width for more details, making it look a lot smoother. The best thing is that this is done within a few seconds, completely automatically, and works with pretty much any image. Oh, and you can even use it yourself with a demo they made available. Watch more results and learn about how it works in the video!
►Read the full article: https://www.louisbouchard.ai/swinir/
►Liang, J., Cao, J., Sun, G., Zhang, K., Van Gool, L. and Timofte, R.,
2021. SwinIR: Image restoration using swin transformer. In Proceedings
of the IEEE/CVF International Conference on Computer Vision (pp.
1833-1844).
►Code: https://github.com/JingyunLiang/SwinIR
►Demo: https://replicate.ai/jingyunliang/swinir
►My Newsletter (A new AI application explained weekly to your emails!): https://www.louisbouchard.ai/newsletter/
Have you ever had an image you really liked but couldn't manage to find a better version than this? How cool would it be if you could take this image and make it look twice as good? It would be great, but what if I could make it even four or eight times more high definition? Now we are talking. Just look at that. Here we enhanced the resolution of the image by a factor of 4, meaning that we have 4 times more pixels in both height and width for more details, making it look a lot smoother. The best thing is that this is done within a few seconds, completely automatically, and works with pretty much any image. Oh, and you can even use it yourself with a demo they made available, as we will see during the video.
Speaking of enhancing resolution, I'm always looking to enhance different aspects of how I work and share what I make. If you are working on machine learning problems, there is no better way to enhance your workflows than with this episode's sponsor, Weights & Biases. Weights & Biases is an MLOps platform where you can keep track of your machine learning experiments, insights, and ideas. A feature I especially love is how you can quickly create and share amazing-looking interactive reports like this one, clearly showing your team or future self your runs' metrics, hyperparameters, and data configurations, alongside any notes you had at the time. Capturing and sharing your work is essential if you want to grow as an ML practitioner, which is why I highly recommend using tools that improve your work, like Weights & Biases. Just try it with the first link below, and I will owe you an apology if you haven't been promoted within a year.
Before getting into this amazing model, we first have to introduce the concept of photo upsampling, or image super-resolution. The goal here is to construct a high-resolution image from a corresponding low-resolution input image, which is a face in this case, but it can be any object, animal, or landscape. The low-resolution image will be something like 512 pixels or smaller: not that blurry, but clearly not high definition when you have it full screen. Just take a second to put the video on full screen and you'll see the artifacts. While we are at it, you should also take a few more seconds to like the video and send it to a friend or two. I'm convinced they will love this and will thank you for it.
Anyway, we take this low-definition image and transform it into a high-definition image with a much clearer face. In this case, it is a 2048-pixel square image, which is 4 times more HD. To achieve that, we usually have a typical U-Net-like architecture with convolutional neural networks, which I covered in many videos before, like the one appearing in the top right corner of your screen if you'd like to learn more about how they work.
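To make the numbers in this example concrete, here is a minimal Python sketch of what a factor of 4 means: each side grows by 4, so the total pixel count grows by 16. The helper names `upscaled_size` and `upscale_nearest` are hypothetical and only for illustration. The second one is a naive baseline that copies existing pixels; it is not what a learned super-resolution model does, since the whole point of the model is to predict the missing detail.

```python
def upscaled_size(height, width, scale):
    """Return the output (height, width) after upscaling each side by `scale`."""
    return height * scale, width * scale


def upscale_nearest(image, scale):
    """Naive baseline: repeat every pixel `scale` times in both directions.

    `image` is a list of rows, each row a list of pixel values.
    """
    out = []
    for row in image:
        wide_row = [pixel for pixel in row for _ in range(scale)]
        # Repeat the widened row `scale` times (copied so rows stay independent).
        out.extend(list(wide_row) for _ in range(scale))
    return out


# The sizes mentioned in the video: a 512-pixel square input, upscaled 4x.
low_h, low_w = 512, 512
high_h, high_w = upscaled_size(low_h, low_w, scale=4)
pixel_ratio = (high_h * high_w) // (low_h * low_w)

# A tiny 2x2 "image" upscaled 2x with the naive baseline.
tiny = [[1, 2],
        [3, 4]]
tiny_x2 = upscale_nearest(tiny, scale=2)
```

With these inputs the output is 2048 by 2048, which is 16 times as many pixels as the input. The baseline output contains no new information, which is why it looks blocky and why a trained model is needed.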
The main downside is that CNNs have difficulty adapting to extremely broad datasets, since they use the same kernels for all images. This makes them great for local results and generalization, but less powerful for the overall results when we want the best results for each individual image. On the other hand, Transformers are a promising architecture thanks to the self-attention mechanism, which captures global interactions between contexts for each image, but they have heavy computations that are not suitable for images.
Here, instead of using only CNNs or only Transformers, they created the same U-Net-like architecture with both convolution and attention mechanisms, or more precisely, using the Swin Transformer architecture. The Swin Transformer is amazing since it has both the advantages of CNNs, to process images of larger sizes and prepare them for the attention mechanisms, and those of the attention mechanisms, which create long-range connections so that the model understands the overall image much better and can also recreate the same image in a better way. I won't enter into the details of the Swin Transformer, as I already covered this architecture a few months ago and explained its differences with CNNs and the classical Transformer architectures used in natural language processing. If you'd like to learn more about it and how the researchers applied Transformers to vision, check out that video and come back for the explanation of this upsampling model.
The model is called SwinIR and can do many tasks, which include image upsampling. As I said, it uses convolutions to allow for bigger images. More precisely, they use a convolutional layer to reduce the size of the image, which you can see here. This reduced image is then sent into the model and also passed directly to the reconstruction module to give the model general information about the image. As we will see in a few seconds, this representation will basically look like many weird, blurry versions of the image, giving valuable information to the upscaling module about how the overall image should look.
Then we see the Swin Transformer layers coupled with convolutions. This is to compress the image further and always extract more valuable, precise information about both the style and details, while forgetting about the overall image. This is why we then add the convolved image back with a skip connection, to give the overall information we lack. All of this is finally sent into a reconstruction module called sub-pixel, which looks like this and uses both the larger general features and the smaller detailed features we just created to reconstruct a higher-definition image. You can see this as a convolutional neural network in reverse, or simply a decoder, taking the condensed features we have and reconstructing a bigger image from them.
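The "sub-pixel" reconstruction step can be illustrated with a small sketch. This assumes the module follows the usual sub-pixel convolution design, where a convolution first produces `r * r` times more channels and a "pixel shuffle" then rearranges those channels into an image that is `r` times larger on each side. Only the rearrangement is shown here, in plain Python so it runs without dependencies; the function name `pixel_shuffle` and the toy values are illustrative, not taken from the authors' code. Tensor libraries ship the same operation, for example `torch.nn.PixelShuffle` in PyTorch.

```python
def pixel_shuffle(features, r):
    """Rearrange a [C*r*r][H][W] nested list into [C][H*r][W*r]."""
    channels_in = len(features)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = channels_in // (r * r)
    h, w = len(features[0]), len(features[0][0])
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for y in range(h * r):
            for x in range(w * r):
                # (y % r, x % r) picks which of the r*r input channels
                # supplies this output pixel; (y // r, x // r) picks where
                # in that channel the value is read from.
                src = c * r * r + (y % r) * r + (x % r)
                out[c][y][x] = features[src][y // r][x // r]
    return out


# Four 1x2 feature maps become one 2x4 image (r = 2).
features = [
    [[1, 2]],  # channel 0 -> top-left pixel of each 2x2 cell
    [[3, 4]],  # channel 1 -> top-right
    [[5, 6]],  # channel 2 -> bottom-left
    [[7, 8]],  # channel 3 -> bottom-right
]
image = pixel_shuffle(features, r=2)
```

Each group of four low-resolution values ends up as one 2x2 cell of the larger image, which is how many small feature maps can be turned into a single bigger picture.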
Again, if you'd like to learn more about CNNs and decoders, you should check some of the videos I made covering them. So you basically send your image into a convolutional layer, take this new representation, and save it for later while also sending it into the Swin Transformer architecture to condense the information further and learn the most important features to reconstruct. Then you take these new features together with the saved ones and use a decoder to reconstruct the high-definition version. And voilà! Now you only need enough data, and you will have results like this.
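The data flow just summarized can be written down as a short structural sketch. Everything here is a stand-in under stated assumptions: `super_resolve` and the three `toy_*` functions are hypothetical names, and the toy functions are trivial placeholders for the learned convolution, the Swin Transformer layers, and the decoder. The only point is the order of the steps and the skip connection that adds the saved representation back in.

```python
def super_resolve(image, shallow, deep, reconstruct):
    """Shallow features -> deep features -> skip connection -> reconstruction."""
    saved = shallow(image)   # first convolutional representation, kept for later
    refined = deep(saved)    # stand-in for the Swin Transformer layers
    # Skip connection: add the saved features back, element by element.
    combined = [[a + b for a, b in zip(row_r, row_s)]
                for row_r, row_s in zip(refined, saved)]
    return reconstruct(combined)


# Toy placeholders (assumptions for illustration, not learned layers).
def toy_shallow(image):
    return [list(row) for row in image]


def toy_deep(features):
    return [[0.5 * v for v in row] for row in features]


def toy_reconstruct(features, scale=2):
    out = []
    for row in features:
        wide = [v for v in row for _ in range(scale)]
        out.extend(list(wide) for _ in range(scale))
    return out


result = super_resolve([[2.0, 4.0]], toy_shallow, toy_deep, toy_reconstruct)
```

In a real model each placeholder would be a trained network, and training on enough low-resolution and high-resolution image pairs is what makes the reconstruction look good.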
Of course, as with all research, there are some limitations. In this case, probably due to the initial convolutional layer, it doesn't work really well with very small images. Under 200 pixels wide, you may see artifacts and weird results like this one appear. It seems like you can also remove wrinkles using the bigger upscalers, which can be a useful artifact if you are looking to do that. Other than that, the results are pretty crazy, and having played with it a lot in the past few days, I can say the four-times upscaling is incredible. And you can play with it too: they made the GitHub repo available for everyone, with pre-trained models and even a demo you can play with right away without any code.
Of course, this was just an overview of this amazing new model, and I strongly invite you to read their paper for a deeper technical understanding. Everything is linked in the description. Let me know what you think, and I hope you've enjoyed this video. Thank you once again to Weights & Biases for sponsoring this video, and to anyone still watching, see you next week with another exciting paper!