ChatGPT has taken over Twitter and pretty much the whole internet, thanks to its power and the meme potential it provides. We all know that being able to generate memes is the best way to conquer the internet, and so it worked.
Since you’ve seen numerous examples, you might already know that ChatGPT is an AI that you can chat with, recently released to the public by OpenAI. It is also called a chatbot, meaning you can interact with it conversationally, imitating a one-on-one human discussion.
What you might not know is what it is and how it works... Watch the video to learn more!
►Try it: https://chat.openai.com/
►OpenAI's blog post: https://openai.com/blog/chatgpt/
►What is GPT-3:
►What is Reinforcement Learning:
►Join our Discord community: https://www.louisbouchard.ai/learn-ai-together/
►Twitter: https://twitter.com/Whats_AI
►Support me on Patreon: https://www.patreon.com/whatsai
0:00
you've seen it everywhere ChatGPT has
0:02
taken over Twitter and pretty much the
0:04
whole internet thanks to its power and
0:06
the meme potential it provides we all
0:08
know being able to generate memes is the
0:11
best way to conquer the internet and so
0:13
it worked since you've seen numerous
0:14
examples you might already know that
0:16
ChatGPT is an AI recently released to
0:19
the public by OpenAI allowing you to
0:21
chat with it it's also called a chatbot
0:24
meaning you can interact with it
0:25
conversationally imitating a one-on-one
0:28
human discussion what you might not know
0:30
is what it is and how it works
0:32
ChatGPT is a model based on
0:35
reinforcement learning and the GPT
0:37
series of models from OpenAI I will
0:39
refer you to a video about reinforcement
0:41
learning we recently published with my
0:43
friend Elias to learn more about the
0:46
subfield of AI but quickly reinforcement
0:48
learning is a way to train algorithms by
0:51
trial and error aiming for rewards just
0:54
like humans would do by learning with
0:56
positive feedback more specifically
0:58
ChatGPT was built following three steps the
1:02
first was to take an already powerful
1:04
model and fine tune it with supervised
1:06
learning what does this mean it means
1:08
that they took a model specifically
1:11
GPT-3.5 an improved and up-to-date
1:14
version of GPT-3 which they trained once
1:17
more on conversation examples
1:19
specifically instead of being trained on
1:21
pretty much the whole internet as GPT-3
1:24
was this means they are trying to narrow
1:26
its potential strictly to conversations
1:28
making it theoretically better at
1:31
conversing compared to GPT-3 since a
1:34
specialist is almost always better than
1:36
a generalist at a specific task if you
1:38
are still not familiar with the GPT
1:40
series of models I would suggest
1:42
watching the short introduction video I
1:44
made covering GPT-3 when it came out the
1:47
second step is to add our reinforcement
1:49
learning magic which will allow the
1:51
model to practice and get better as you
1:53
know practice makes perfect more
1:55
precisely in this step we will use the
1:57
model to chat with humans directly have
2:00
it provide multiple possible answers and
2:03
ask the human to rate the answers from
2:05
best to worst this data will then be
2:07
used to train another model called our
2:10
reward model learning to replicate our
2:12
human annotators this leads to our last
2:15
step where our new reward model will
2:18
give feedback to the ChatGPT model's
2:20
answers as a reward function to help it
2:22
converge toward the best answers over
2:24
time this final step is to further train
2:27
our algorithm after the initial fine
2:30
tuning step we explained this is why it
2:32
is companies like OpenAI that release
2:34
those kinds of amazingly powerful models
2:37
it would be infeasible for universities
2:39
or individuals as it requires way too
2:42
much computing and time for training
2:44
still what they achieve is quite
2:46
remarkable and I believe they are worth
2:48
doing and worth sharing to advance
2:50
science and voila after coupling the
2:54
already powerful and most recent GPT
2:56
based language model fine-tuning it to
2:59
conversations and finally using
3:01
reinforcement learning to make it
3:03
practice its conversation skills you
3:05
obtain ChatGPT as you have seen before
3:07
the model is quite promising but also
3:10
sometimes very dumb and doesn't seem
3:12
to have any logic whatsoever it is still
3:15
just an algorithm and far from being
3:17
either intelligent or conscious though
3:20
it will depend on how we define both it
3:22
definitely has its limitations
3:24
nonetheless the outputs it gives are
3:27
often surprisingly interesting and
3:29
pertinent ChatGPT is definitely a step
3:31
forward in conversational AI and quite
3:34
promising especially working on the
3:36
prompt engineering side of the model to
3:38
leverage its true potential and limit
3:41
failure cases I hope you've enjoyed this
3:43
video and I'd love to see your
3:45
experiments please tag me on Twitter
3:47
@Whats_AI if you share them or join our
3:50
Discord Community where we created a
3:52
channel specifically for it I will see
3:54
you next week with another amazing AI
3:57
research
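
The second and third steps described in the transcript (humans rank answers, a reward model learns to imitate those rankings, and the reward model then gives feedback to the chat model) can be sketched in a few lines of Python. This is only a toy illustration under heavy assumptions, not OpenAI's implementation: the three candidate answers, their two made-up features, the linear reward model, and the plain policy-gradient (REINFORCE) update are all hypothetical stand-ins chosen to keep the example small. The supervised fine-tuning step is not shown, and the "chat model" here is just a probability distribution over three fixed answers.

```python
import math
import random

# Hypothetical toy data: each candidate answer is a feature vector
# (helpfulness, rudeness). Both features are invented for illustration.
ANSWERS = {
    "helpful": [1.0, 0.0],
    "vague": [0.3, 0.1],
    "rude": [0.1, 1.0],
}

# Step 2a: a human annotator ranks the answers from best to worst.
HUMAN_RANKING = ["helpful", "vague", "rude"]


def reward(weights, features):
    """Linear reward model: a weighted sum of the answer's features."""
    return sum(w * f for w, f in zip(weights, features))


def train_reward_model(ranking, steps=500, lr=0.1):
    """Step 2b: fit the reward model so better-ranked answers score higher.

    Uses a pairwise logistic loss, -log(sigmoid(r_better - r_worse)),
    over every (better, worse) pair implied by the ranking.
    """
    weights = [0.0, 0.0]
    pairs = [(ranking[i], ranking[j])
             for i in range(len(ranking))
             for j in range(i + 1, len(ranking))]
    for _ in range(steps):
        for better, worse in pairs:
            fb, fw = ANSWERS[better], ANSWERS[worse]
            diff = reward(weights, fb) - reward(weights, fw)
            # Descent step on the loss: move along (fb - fw), scaled by
            # how wrong the model still is, 1 - sigmoid(diff).
            scale = 1.0 - 1.0 / (1.0 + math.exp(-diff))
            weights = [w + lr * scale * (b - c)
                       for w, b, c in zip(weights, fb, fw)]
    return weights


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_policy(weights, steps=2000, lr=0.1, seed=0):
    """Step 3: trial and error, with the reward model as the feedback."""
    rng = random.Random(seed)
    names = list(ANSWERS)
    logits = [0.0] * len(names)
    for _ in range(steps):
        probs = softmax(logits)
        choice = rng.choices(range(len(names)), weights=probs)[0]
        r = reward(weights, ANSWERS[names[choice]])
        # Baseline: the expected reward under the current policy.
        baseline = sum(p * reward(weights, ANSWERS[n])
                       for p, n in zip(probs, names))
        # REINFORCE update: raise the probability of answers that beat
        # the baseline, lower it for answers that fall below it.
        for k in range(len(names)):
            grad = (1.0 if k == choice else 0.0) - probs[k]
            logits[k] += lr * (r - baseline) * grad
    return dict(zip(names, softmax(logits)))


weights = train_reward_model(HUMAN_RANKING)
policy = train_policy(weights)
print(policy)
```

The reward model only ever sees the human ranking, and the policy only ever sees the reward model's scores, which mirrors the separation the transcript describes: once the reward model replicates the annotators, the chat model can keep practicing without a human in the loop.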