High Quality 8x Upscaling with AI!

Written by whatsai | Published 2021/11/20
Tech Story Tags: image-processing | image-upsampling | image-super-resolution | ai | artificial-intelligence | technology | innovation | latest-tech-stories | web-monetization

TLDR

SwinIR is an AI model that can enhance the resolution of an image by a factor of 4, meaning four times as many pixels along both the height and the width, for more detail. The best thing is that this is done within a few seconds, completely automatically, and works with pretty much any image. You can even use it yourself with a demo the authors made available. Watch more results and learn how it works in the video below.

Have you ever had an image you really liked but could only find a small version of it, like the image above on the left? How cool would it be if you could take this image and make it look twice as good? That would be great, but what if you could make it four or even eight times higher definition? Now we’re talking. Just look at that.

Here we enhanced the resolution of the image by a factor of four, meaning four times as many pixels along both the height and the width, which adds detail and makes the image look a lot smoother. The best thing is that this is done within a few seconds, completely automatically, and works with pretty much any image. Oh, and you can even use it yourself with a demo the authors made available. Watch more results and learn about how it works in the video!
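To make the "factor of four" concrete, here is a small sketch of the arithmetic. The function names are made up for illustration; the 512 and 2048 pixel sizes are the ones mentioned in the video. Note that scaling both sides by 4 multiplies the total pixel count by 16.

```python
def upscaled_size(width, height, factor):
    """Return (width, height) after scaling each side by `factor`."""
    return width * factor, height * factor


def pixel_gain(factor):
    """Return how many times more pixels the upscaled image contains."""
    return factor * factor


# A 512 x 512 input upscaled 4x per side, as in the video's example.
print(upscaled_size(512, 512, 4))  # (2048, 2048)
print(pixel_gain(4))               # 16
```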

Watch the video

References

►Read the full article: https://www.louisbouchard.ai/swinir/
►Liang, J., Cao, J., Sun, G., Zhang, K., Van Gool, L. and Timofte, R., 2021. SwinIR: Image restoration using Swin Transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 1833-1844).
►Code: https://github.com/JingyunLiang/SwinIR
►Demo: https://replicate.ai/jingyunliang/swinir
►My Newsletter (A new AI application explained weekly to your emails!): https://www.louisbouchard.ai/newsletter/

Video Transcript

00:00

have you ever had an image and really

00:01

liked it but couldn't manage to find a

00:04

better version than this how cool would

00:06

it be if you could take this image and

00:08

make it look twice as good it would be

00:10

great but what if i could make it even

00:12

four or eight times more high definition

00:15

now we are talking just look at that

00:18

here we enhance the resolution of the

00:19

image by a factor of 4 meaning that we

00:22

have 4 times more height and width

00:24

pixels for more details making it look a

00:27

lot smoother the best thing is that this

00:29

is done within a few seconds completely

00:31

automatically and works with pretty much

00:33

any image oh and you can even use it

00:36

yourself with a demo they made available

00:38

as we will see during the video

00:40

speaking of enhancing resolution i'm

00:42

always looking to enhance different

00:44

aspects of how i work and share what i

00:46

make if you are working on machine

00:47

learning problems there is no better way

00:49

to enhance your workflows than with this

00:51

episode's sponsor weights and biases weights

00:53

and biases is an mlops platform where

00:56

you can keep track of your machine

00:57

learning experiments insights and ideas a

01:00

feature i especially love is how you can

01:02

quickly create and share amazing looking

01:04

interactive reports like this one

01:06

clearly showing your team or future self

01:08

your runs' metrics hyperparameter and

01:10

data configurations alongside any notes

01:13

you had at the time capturing and

01:15

sharing your work is essential if you

01:16

want to grow as an ml practitioner which

01:18

is why i highly recommend using tools

01:20

that improve your work like weights and

01:22

biases just try it with the first link

01:24

below and i will owe you an apology if

01:26

you haven't been promoted within a year

01:29

before getting into this amazing model

01:31

we have to first introduce the concept

01:33

of photo upsampling or image super

01:36

resolution the goal here is to construct

01:38

a high resolution image from a

01:40

corresponding low resolution input image

01:42

which is a face in this case but it can

01:44

be any object animal or landscape the

01:47

low resolution will be such as 512

01:50

pixels or smaller not that blurry but

01:53

it's clearly not high definition when

01:55

you have it full screen just take a

01:57

second to put the video on full screen

01:59

and you'll see the artifacts while we

02:01

are at it you should also take a few

02:02

more seconds to like the video and send

02:05

it to a friend or two i'm convinced they

02:07

will love this and will thank you for it

02:09

anyway we take the low definition image

02:11

and transform it into a high definition

02:13

image with a much clearer face in this

02:16

case a 2048 pixel square image which is

02:19

4 times more hd to achieve that we

02:22

usually have a typical u-net-like

02:24

architecture with convolutional neural

02:26

networks which i covered in many videos

02:28

before like the one appearing on the top

02:30

right corner of your screen if you'd

02:32

like to learn more about how they work

02:34

the main downside is that cnns have

02:36

difficulty adapting to extremely broad

02:39

data sets since they have the same

02:40

kernels for all images which makes them

02:43

great for local results and

02:44

generalization but less powerful for the

02:47

overall results when we want the best

02:48

results for each individual image on the

02:51

other hand transformers are promising

02:53

architecture due to the self-attention

02:55

mechanism capturing global interactions

02:57

between contexts for each image but have

03:00

heavy computations that are not suitable

03:02

for images here instead of using cnns

03:05

or transformers they created the same

03:07

u-net-like architecture with both

03:09

convolution and attention mechanisms or

03:12

more precisely using the swin

03:14

transformer architecture the swin

03:16

transformer is amazing since it has both

03:18

the advantages of the cnns to process

03:20

images of larger sizes and prepare them

03:23

for the attention mechanisms and these

03:25

attention mechanisms will create

03:27

long-range connections so that the model

03:29

understands the overall image much

03:31

better and can also recreate the same

03:33

image in a better way i won't enter into

03:36

the details of the swin transformer as i

03:38

already covered this architecture a few

03:40

months ago and explained its differences

03:42

with cnns and classical transformer

03:44

architectures used in natural language

03:46

processing if you'd like to learn more

03:47

about it and how the researchers applied

03:49

transformers to vision check out the

03:52

video and come back for the explanation

03:54

of the upsampling model the model is

03:56

called swinir and can do many tasks

03:59

which include image upsampling as i

04:01

said it uses convolutions to allow for

04:03

bigger images more precisely they use a

04:06

convolutional layer to reduce the size

04:08

of the image which you can see here this

04:10

reduced image is then sent into the

04:12

model and also passed directly to the

04:15

reconstruction module to give the model

04:17

general information about the image as

04:20

we will see in a few seconds this

04:21

representation will basically look like

04:23

many weird blurry versions of the image

04:26

giving valuable information to the

04:28

upscaling module and how the overall

04:30

image should look like then we see the

04:33

swin transformer layers coupled with

04:35

convolutions this is to compress the

04:37

image further and always extract more

04:39

valuable precise information about both

04:42

the style and details while forgetting

04:44

about the overall image this is why we

04:46

then add the convoluted image to give

04:48

the overall information we lack with a

04:51

skip connection all of this is finally

04:53

sent into a reconstruction module called

04:55

subpixel which looks like this and uses

04:58

both the larger general features and

05:01

smaller detailed features we just

05:03

created to reconstruct a higher

05:05

definition image you can see this as a

05:07

convolutional neural network but in

05:09

reverse or simply a decoder taking the

05:12

condensed features we have and

05:14

reconstructing a bigger image from it

05:16

again if you'd like to learn more about

05:18

cnns and decoders you should check some

05:20

of the videos i made covering them so

05:22

you basically send your image in a

05:24

convolutional layer take this new

05:26

representation save it for later while

05:29

also sending it in the swin transformer

05:31

architecture to condense the information

05:33

further and learn the most important

05:35

features to reconstruct then you take

05:38

these new features with the saved ones

05:40

and use a decoder to reconstruct the

05:42

high definition version and voila now

05:45

you only need enough data and you will

05:47

have results like this
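The sub-pixel reconstruction step mentioned above can be illustrated with a small, dependency-free sketch. This is not the authors' code: it assumes the common "pixel shuffle" layout, where groups of r*r feature channels are rearranged into an image that is r times larger on each side, and the function name and nested-list representation are chosen only for illustration.

```python
def pixel_shuffle(features, r):
    """Rearrange a [C*r*r][H][W] nested list into [C][H*r][W*r].

    Assumed layout: input channel c*r*r + i*r + j supplies the output
    pixel at row h*r + i, column w*r + j of output channel c.
    """
    channels_in = len(features)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = channels_in // (r * r)
    height, width = len(features[0]), len(features[0][0])
    # Allocate the enlarged output: c_out channels of (H*r) x (W*r).
    out = [[[0] * (width * r) for _ in range(height * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = features[c * r * r + i * r + j]
                for h in range(height):
                    for w in range(width):
                        out[c][h * r + i][w * r + j] = src[h][w]
    return out


# Four 1x1 feature maps become one 2x2 image (upscaling factor r = 2).
print(pixel_shuffle([[[1]], [[2]], [[3]], [[4]]], 2))  # [[[1, 2], [3, 4]]]
```

In a real model the input channels would come from learned convolutions, so each output pixel is predicted by the network instead of copied from a neighbor.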

05:54

[Music]

05:59

of course as with all research there are

06:01

some limitations in this case probably

06:03

due to the initial convolutional layer

06:06

it doesn't work really well with very

06:07

small images under 200 pixels wide you

06:10

may see artifacts and weird results like

06:12

this one appear it seems like you can

06:14

also remove wrinkles using the bigger

06:17

upscalers which can be a useful artifact

06:20

if you are looking to do that other than

06:21

that the results are pretty crazy and

06:24

for having played with it a lot in the

06:26

past few days the four times upscaling

06:28

is incredible and you can play with it

06:30

too they made the github repo available

06:32

for everyone with pre-trained models and

06:35

even a demo you can play with right away

06:37

without any code of course this was just

06:40

an overview of this amazing new model

06:42

and i will strongly invite you to read

06:44

their paper for a deeper technical

06:46

understanding everything is linked in

06:48

the description let me know what you

06:49

think and i hope you've enjoyed this

06:52

video thank you once again weights and

06:54

biases for sponsoring this video and to

06:56

anyone still watching see you next week

06:59

with another exciting paper
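The transcript notes that plain self-attention is too heavy for images. A rough way to see why, and why restricting attention to local windows (the approach the Swin Transformer is known for) helps, is to count how many token pairs get compared. This is a back-of-the-envelope sketch: the function names are hypothetical, the 8x8 window size is an illustrative assumption, and projection costs are ignored.

```python
def global_attention_pairs(height, width):
    """Token pairs compared when every token attends to every other token."""
    tokens = height * width
    return tokens * tokens


def windowed_attention_pairs(height, width, window):
    """Token pairs compared when attention stays inside window x window blocks.

    Assumes height and width are multiples of `window`.
    """
    if height % window or width % window:
        raise ValueError("height and width must be multiples of window")
    tokens = height * width
    return tokens * window * window


# 64 x 64 feature map with an illustrative 8 x 8 window.
print(global_attention_pairs(64, 64))       # 16777216
print(windowed_attention_pairs(64, 64, 8))  # 262144
```

Doubling the side length to 128 multiplies the global count by 16 but the windowed count by only 4, which is why the windowed variant scales to larger images.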

Written by whatsai | I explain Artificial Intelligence terms and news to non-experts.

Published by HackerNoon on 2021/11/20