How can Uber deliver food and always arrive on time or a few minutes before? How do they match riders to drivers so that you can always find an Uber? All that while also managing all the drivers?!
Well, we will answer exactly that in the video...
►Read the full article: https://www.louisbouchard.ai/uber-deepeta/
►Uber blog post: https://eng.uber.com/deepeta-how-uber-predicts-arrival-times/
►What are transformers:
►Linear Transformers: https://arxiv.org/pdf/2006.16236.pdf
►My Newsletter (A new AI application explained weekly to your emails!): https://www.louisbouchard.ai/newsletter/
0:00
how can uber deliver food and always
0:02
arrive on time or a few minutes before
0:05
how do they match riders to drivers so
0:07
that you can always find an uber all that
0:10
while also managing all the drivers we
0:12
will answer these questions in this
0:14
video with their arrival time prediction
0:16
algorithm deep eta deep eta is uber's
0:20
most advanced algorithm for estimating
0:22
arrival times using deep learning used
0:25
both for uber and uber eats deep eta can
0:28
magically organize everything in the
0:30
background so that riders drivers and
0:32
food are fluently going from point a to
0:34
point b as efficiently as possible many
0:37
different algorithms exist to estimate
0:40
travel on such road networks but i don't
0:42
think any are as optimized as uber's
0:45
previous arrival time prediction tools
0:47
including uber's were built with what we
0:50
call shortest path algorithms which are
0:52
not well suited for real-world
0:54
predictions since they do not consider
0:56
real-time signals for several years uber
0:59
used xgboost a well-known gradient
1:02
boosted decision tree machine learning
1:04
library xgboost is extremely powerful
1:07
and used in many applications but was
1:09
limited in uber's case as the more it
1:11
grew the more latency it had they wanted
1:14
something faster more accurate and more
1:16
general to be used for drivers riders
1:18
and food delivery all orthogonal
1:20
challenges that are complex to solve
1:22
even for machine learning or ai
1:25
here comes deep eta a deep learning
1:28
model that improves upon xgboost for
1:30
all of those oh and i almost forgot
1:33
here's the sponsor of this video
1:36
myself please take a minute to subscribe
1:39
if you like the content and leave a like
1:41
i'd also love to read your thoughts in
1:43
the comments or join the discord
1:45
community learn ai together to chat with
1:47
us let's get back to the video
1:49
deep eta is really powerful and
1:51
efficient because it doesn't simply take
1:53
data and generate a prediction there's a
1:56
whole preprocessing system to make this
1:58
data more digestible for the model this
2:00
makes it much easier for the model as it
2:02
can directly focus on optimized data
2:05
with much less noise and far smaller
2:07
inputs a first step in optimizing for
2:10
latency issues this pre-processing
2:12
module starts by taking map data and
2:14
real-time traffic measurements to
2:16
produce an initial estimated time of
2:18
arrival for any new customer request
2:21
then the model takes in these
2:23
transformed features with the spatial
2:25
origin and destination and time of the
2:27
request as a temporal feature but it
2:29
doesn't stop here it also takes more
2:32
information about real-time activities
2:34
like traffic weather or even the nature
2:36
of the request like delivery or ride
2:39
share pickup all this extra information
2:41
is necessary to improve from the
2:43
shortest path algorithms we mentioned
2:45
that are highly efficient but far from
2:47
intelligent or real-world proof and
2:50
what kind of architecture does this
2:52
model use you guessed it a transformer
2:54
are you surprised because i'm definitely
2:56
not and this directly answers the first
2:59
challenge which was to make the model
3:01
more accurate i've already covered
3:03
transformers numerous times on my
3:04
channel so i won't go into how it works
3:07
in this video but i still wanted to
3:08
highlight a few specific features for
3:11
this one in particular first you must be
3:13
thinking but transformers are huge and
3:16
slow models how can it be of lower
3:18
latency than xgboost well you would be
3:21
right they've tried it and it was too
3:23
slow so they had to make some changes
3:26
the change with the biggest impact was
3:28
to use a linear transformer which scales
3:30
with the dimension of the input instead
3:33
of the input's length this means that if
3:35
the input is long regular transformers will be
3:38
very slow and this is often the case for
3:40
them with as much information as routing
3:42
data instead it scales with dimensions
3:45
something they can control that is much
3:47
smaller another great improvement in
3:49
speed is the discretization of inputs
3:52
meaning that they take continuous values
3:53
and make them much easier to compute by
3:56
clustering similar values together
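To make the idea concrete, here is a minimal sketch of discretizing a continuous feature into buckets and then looking up an embedding for each bucket. The feature (trip distance), the number of buckets, the use of quantile-based boundaries, and the random embedding table are all assumptions for illustration, not details taken from the video.

```python
import numpy as np

def fit_quantile_edges(values, n_buckets):
    """Compute interior cut points so each bucket holds roughly the same number of values."""
    quantiles = np.linspace(0.0, 1.0, n_buckets + 1)[1:-1]
    return np.quantile(values, quantiles)

def bucketize(values, edges):
    """Map each continuous value to the integer index of the bucket it falls into."""
    return np.searchsorted(edges, values, side="right")

# Hypothetical trip distances in kilometers (made-up numbers).
distances_km = np.array([0.4, 1.2, 1.3, 2.8, 3.1, 5.0, 7.5, 12.0])
n_buckets = 4

edges = fit_quantile_edges(distances_km, n_buckets)
bucket_ids = bucketize(distances_km, edges)

# Each bucket id selects one row of an embedding table.
# The table is random here; in a trained model it would be learned.
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(n_buckets, 8))
embedded = embedding_table[bucket_ids]

print(bucket_ids)      # nearby distances share the same bucket id
print(embedded.shape)  # one 8-dimensional vector per input value
```

Values that land in the same bucket get exactly the same embedding, which is where both the speed-up and the small loss of precision come from.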
3:58
discretization is regularly used in
4:00
prediction to speed up computation as
4:02
the speed it gives clearly outweighs the
4:04
error that duplicate values may bring
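Going back to the linear transformer for a moment, the sketch below shows kernel-style linear attention in the spirit of the Linear Transformers paper linked in the description. It assumes a single head, no masking, and the `elu(x) + 1` feature map; whether DeepETA uses exactly this formulation is an assumption. The point is that the keys and values are first summarized into a small matrix whose size depends only on the feature dimension, so the n-by-n attention matrix is never built.

```python
import numpy as np

def feature_map(x):
    """elu(x) + 1: equals x + 1 for positive x and exp(x) otherwise, so it is always positive."""
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def linear_attention(q, k, v):
    """Linear attention for q, k of shape (n, d) and v of shape (n, d_v).

    Cost grows linearly with the sequence length n, because the
    (d, d_v) summary `kv` is computed once and reused for every query.
    """
    q_f, k_f = feature_map(q), feature_map(k)
    kv = k_f.T @ v                 # (d, d_v) summary of all keys and values
    k_sum = k_f.sum(axis=0)        # (d,) used to normalize each query
    normalizer = q_f @ k_sum       # (n,)
    return (q_f @ kv) / normalizer[:, None]

# Hypothetical sizes: 6 input features, each embedded in 4 dimensions.
rng = np.random.default_rng(1)
q = rng.normal(size=(6, 4))
k = rng.normal(size=(6, 4))
v = rng.normal(size=(6, 3))

out = linear_attention(q, k, v)
print(out.shape)
```

Regular softmax attention would need a 6-by-6 score matrix here; with many more inputs, that quadratic term is what makes it slow.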
4:07
now there is one challenge left to cover
4:10
and by far the most interesting is how
4:13
they made it more general here is the
4:15
complete deep eta model to answer this
4:18
question there is the earlier
4:19
quantization of the data that are then
4:22
embedded and sent to the linear
4:24
transformer we just discussed then we
4:26
have the fully connected layer to make
4:28
our predictions and we have a final step
4:31
to make our model general the bias
4:33
adjustment decoder it will take in the
4:36
predictions and the type features we
4:38
mentioned at the beginning of the video
4:40
containing the reason the customer made
4:42
a request to uber to render the prediction
4:44
to a more appropriate value for a task
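One simple way to picture a bias adjustment step is a lookup of a per-type offset that gets added to the raw prediction. The sketch below is only an illustration under that assumption: the request type names, the offsets in seconds, and the clamping at zero are hypothetical and not taken from the video.

```python
import numpy as np

# Hypothetical request types; the real type features are not listed in the video.
REQUEST_TYPES = {"ride_pickup": 0, "ride_dropoff": 1, "delivery": 2}

def bias_adjust(raw_prediction, type_ids, type_bias):
    """Shift each raw prediction by the bias of its request type and keep it non-negative."""
    adjusted = raw_prediction + type_bias[type_ids]
    return np.maximum(adjusted, 0.0)

# Made-up offsets in seconds; in a trained model these would be learned.
type_bias = np.array([-30.0, 0.0, 120.0])

raw = np.array([600.0, 600.0, 10.0])
types = np.array([
    REQUEST_TYPES["delivery"],
    REQUEST_TYPES["ride_pickup"],
    REQUEST_TYPES["ride_pickup"],
])

adjusted = bias_adjust(raw, types, type_bias)
print(adjusted)
```

The same raw prediction ends up as a different final value depending on the type of request, which is the sense in which one shared model can serve several tasks.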
4:46
they periodically retrain and deploy
4:49
their model using their own platform
4:51
called michelangelo which i'd love to
4:53
cover next if you're interested if so
4:56
please let me know in the comments and
4:58
voila this is what uber currently uses in
5:01
their system to deliver and give rides
5:03
to everyone as efficiently as possible
5:07
of course this was only an overview and
5:09
they used more techniques to improve the
5:11
architecture which you can find out in
5:13
their great blog post linked below if
5:16
you're curious i also just wanted to
5:18
note that this was just an overview of
5:20
their arrival time prediction algorithm
5:22
and i am in no way affiliated with uber
5:25
i hope you enjoyed this week's video
5:28
covering a model applied to the real
5:30
world instead of a new research paper
5:32
and if so please feel free to suggest
5:35
any interesting applications or tools to
5:37
cover next i'd love to read your ideas
5:39
thank you for watching and i will see
5:41
you next week with another amazing paper