DALL·E 2 Pre-Training Mitigations

Video Transcript

you've all seen amazing looking images

like these entirely generated by an

artificial intelligence model i covered

multiple approaches on my channel like

Craiyon, Imagen and the most well-known

DALL·E 2. most people want to try them and

generate images from random prompts but

the majority of these models aren't open

source which means regular people like

us cannot use them freely why this is

what we will dive into in this video

i said most of them were not open source

well Craiyon is and people have generated

amazing memes using it you can see how

such a model can become dangerous

allowing anyone to generate anything not

only for the possible misuses regarding

the generations but the data used to

train such models as well coming from

random images on the internet pretty

much anything with questionable content

and producing some unexpected images the

training data could also be retrieved

through reverse engineering of the model

which is most likely unwanted. OpenAI

also used this to justify not releasing

the DALL·E 2 model to the public. here we

will look into what they are

investigating as potential risks and how

they are trying to mitigate them i go

through a very interesting article they

wrote covering their data pre-processing

steps when training DALL·E 2 but before

doing so allow me a few seconds to be my own

sponsor and share my most recent project

which might interest you i recently

created a daily newsletter sharing ai

news and research with a simple and

clear one-liner to know if the paper

code or news is worth your time you can

subscribe to it on linkedin or with your

email the link is in the description

below

so what does OpenAI really have in mind

when they say that they are making

efforts to reduce risks

first and the most obvious one is that

they are filtering out violent and

sexual images from the hundreds of

millions of images on the internet this

is to prevent the model from learning

how to produce violent and sexual

content or even return the original

images as generations it's like not

teaching your kid how to fight if you

don't want him to get into fights it

might help but it's far from a perfect

fix still i believe it's necessary to

have such filters in our data sets and

it definitely helps in this case but how do

they do that exactly they build several

models trained to classify data to be

filtered or not by giving them a few

different positive and negative examples

and iteratively improve the classifiers

with human feedback each classifier went

through the whole data set deleting more

images than needed just in case as it's

much better for the model to not see bad

data in the first place rather than

trying to correct it afterward

each classifier will have a unique

understanding of which content to filter

and they will all complement each other

ensuring good filtering if by good we

mean no false negative images going

through the filtering process
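
to make the filtering idea concrete here is a minimal sketch, assuming each classifier is just a function returning a score between 0 and 1. the names `filter_dataset`, `toy_violence_score` and `toy_explicit_score`, the caption strings standing in for images, and the 0.2 threshold are all hypothetical and not taken from the article. an image is dropped as soon as any classifier is even slightly suspicious, which trades extra deletions for fewer false negatives

```python
from typing import Callable, List, Sequence

# Hypothetical type: a classifier maps an item to a score in [0, 1],
# where a higher score means "more likely to be unwanted content".
Classifier = Callable[[str], float]


def filter_dataset(
    items: Sequence[str],
    classifiers: Sequence[Classifier],
    threshold: float = 0.2,
) -> List[str]:
    """Keep an item only if every classifier scores it below the threshold.

    A deliberately low threshold removes more items than strictly needed,
    which is the conservative trade-off described in the transcript.
    """
    kept = []
    for item in items:
        if all(classifier(item) < threshold for classifier in classifiers):
            kept.append(item)
    return kept


# Toy stand-ins for trained classifiers (assumption: real ones would
# look at pixels, not at caption keywords).
def toy_violence_score(caption: str) -> float:
    return 0.9 if "fight" in caption else 0.05


def toy_explicit_score(caption: str) -> float:
    return 0.9 if "explicit" in caption else 0.05


if __name__ == "__main__":
    captions = ["a cat on a sofa", "a street fight", "an explicit photo"]
    print(filter_dataset(captions, [toy_violence_score, toy_explicit_score]))
```

since an item has to pass every classifier, adding a classifier can only shrink the kept set, which is also why the filtered data set ends up smaller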

still it comes with downsides first the

data set is clearly smaller and may not

accurately represent the real world

which may be good or bad depending on

the use case they also found an

unexpected side effect of this data

filtering process it amplified the

model's biases towards certain

demographics introducing the second

thing OpenAI is doing as a pre-training

mitigation reducing the biases caused by

this filtering for example after

filtering one of the biases they noticed

was that the model generated more images

of men and fewer of women compared to

models trained on the original data set

they explained that one of the reasons

may be that women appear more often than

men in sexual content which may bias

their classifiers to remove more false

positive images containing women from

the data set creating a gap in the

gender ratio that the model observes in

training and replicates to fix that they

re-weight the filtered data set to match

the distribution of the initial

pre-filter data set here is an example

they cover using cats and dogs where the

filter will remove more dogs than cats

so the fix will be to double the

training loss for images of dogs which

will be like sending two images of dogs

instead of one and compensating for the

lack of images this is once again just a

proxy for actual filtering bias but it

still reduces the image distribution gap

between the pre-filtered and the

filtered data set
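
the cats and dogs example can be sketched in a few lines. this is only an illustration under stated assumptions: it supposes every image has an explicit category label, and the counts (100 cats and 100 dogs before filtering, 50 cats and 25 dogs after) are made up for the demo. the function names `category_weights` and `weighted_mean_loss` are hypothetical and the article may estimate its weights differently. each category is weighted by its share before filtering divided by its share after filtering

```python
from collections import Counter
from typing import Dict, Sequence


def category_weights(
    original_labels: Sequence[str], filtered_labels: Sequence[str]
) -> Dict[str, float]:
    """Weight each category by (share before filtering) / (share after)."""
    original = Counter(original_labels)
    filtered = Counter(filtered_labels)
    n_original = sum(original.values())
    n_filtered = sum(filtered.values())
    weights = {}
    for label, count in filtered.items():
        share_before = original[label] / n_original
        share_after = count / n_filtered
        weights[label] = share_before / share_after
    return weights


def weighted_mean_loss(
    losses: Sequence[float], labels: Sequence[str], weights: Dict[str, float]
) -> float:
    """Average per-image losses, scaling each one by its category weight."""
    total = sum(weights[label] * loss for loss, label in zip(losses, labels))
    return total / sum(weights[label] for label in labels)


if __name__ == "__main__":
    # Made-up counts: the filter removes more dogs than cats.
    before = ["cat"] * 100 + ["dog"] * 100
    after = ["cat"] * 50 + ["dog"] * 25
    weights = category_weights(before, after)
    # The dog weight comes out as twice the cat weight in this demo.
    print(weights, weights["dog"] / weights["cat"])
```

with these made-up counts a dog image counts twice as much as a cat image in the loss, so the weighted data set is balanced again even though no image was added back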

the last issue is an issue of

memorization something the models seem

to be much better at than i am as we

said it's possible to regurgitate the

training data from such image generation

models which is not wanted in most cases

here we also want to generate novel

images and not simply copy paste images

from the internet but how can we prevent

that just like our memory you cannot

really decide what you remember and what

goes away once you see something it

either sticks or it doesn't they found

that just like humans learning a new

concept if the model sees the same image

numerous times in the data set it may

accidentally know it by heart at the end

of its training and generate it exactly

for a similar or identical text prompt

this one is an easy and reliable fix

just find out which images are too

similar and delete the duplicates easy

doing this will mean comparing each

image with every other image meaning

hundreds of quadrillions of image pairs

to compare instead they simply start by

grouping similar images together and

then compare the images with all other

images within the same and a few other

clusters around it immensely reducing

the complexity while still finding 97% of

all duplicate pairs again another fix to

do within the data set before training

our DALL·E model. OpenAI also mentions

some next steps they are investigating

and if you've enjoyed this video i

definitely invite you to read their

in-depth article to see all the details

of this pre-training mitigation work

it's a very interesting and well-written

article let me know what you think of

their mitigation efforts and their

choice to limit the model's access to

the public
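
coming back to the duplicate search described earlier, here is a minimal sketch of the clustering shortcut. it assumes every image is already represented by a small feature vector and that cluster centers are given. the function names, the distance threshold and the number of neighboring clusters are hypothetical choices for illustration, not the article's settings. `count_all_pairs` shows why the brute force version explodes since n images give n times n minus 1 over 2 pairs

```python
import math
from collections import defaultdict
from typing import List, Sequence, Set, Tuple

Vector = Tuple[float, ...]


def count_all_pairs(n: int) -> int:
    """Number of unordered pairs a brute-force comparison would need."""
    return n * (n - 1) // 2


def nearest_clusters(vec: Vector, centroids: Sequence[Vector], m: int) -> List[int]:
    """Indices of the m centroids closest to vec, closest first."""
    order = sorted(range(len(centroids)), key=lambda c: math.dist(vec, centroids[c]))
    return order[:m]


def find_duplicate_pairs(
    vectors: Sequence[Vector],
    centroids: Sequence[Vector],
    neighbor_clusters: int = 2,
    max_distance: float = 0.1,
) -> Set[Tuple[int, int]]:
    """Compare each vector only with members of its nearest clusters."""
    members = defaultdict(list)
    for idx, vec in enumerate(vectors):
        members[nearest_clusters(vec, centroids, 1)[0]].append(idx)

    pairs = set()
    for idx, vec in enumerate(vectors):
        for cluster in nearest_clusters(vec, centroids, neighbor_clusters):
            for other in members.get(cluster, []):
                if other != idx and math.dist(vec, vectors[other]) <= max_distance:
                    pairs.add((min(idx, other), max(idx, other)))
    return pairs


if __name__ == "__main__":
    # Toy 2-D "image features": two near-duplicate pairs and one loner.
    vectors = [(0.0, 0.0), (0.01, 0.0), (5.0, 5.0), (5.0, 5.02), (9.0, 0.0)]
    centroids = [(0.0, 0.0), (5.0, 5.0), (9.0, 0.0)]
    print(sorted(find_duplicate_pairs(vectors, centroids)))
    print(count_all_pairs(len(vectors)))
```

searching a few neighboring clusters as well as the image's own cluster is what catches duplicates that land on either side of a cluster boundary, at the cost of some extra comparisons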

leave a comment or join the discussion

in our community on discord thank you

for watching until the end and i will

see you next week with another amazing

paper
