Why didn’t Lambda ship Docker out of the box? Is it because of security concerns? Why even bother, let’s do it ourselves!
Lambda is pioneering the serverless market. Look at the chart below:
🤔 What happened on Jan 8th?
I know, it’s just search activity. But you can see how far ahead of its time it was back in 2015.
A monopoly isn’t good for the end user. No competition means stagnating development. A year later Azure popped up with function bindings & Logic Apps, and Google revealed simpler dependency management and a nice dashboard.
And recently I discovered Apache OpenWhisk. This is an open-source alternative to run your own serverless platform. And unlike everything else, it can run functions from Docker images you supply.
You may ask, “Why complicate things? Why Docker? Serverless is about simple functions.” Sure, but the reality is it’s not always that simple.
**zip**
The entire world shifted to containers, and now AWS tells you to pack all your code with dependencies into 1 zip. A step back IMHO. ¯\_(ツ)_/¯
With Docker you build functions in the same way as you did for years with microservices.
Ever wanted to use bcrypt, phantom.js, ffmpeg or any other binary in your functions? Launch an EC2 instance with a specific version of Amazon Linux, compile the binaries, squeeze everything under 50MB and you’re good to go!
Small tip: if the binary is too big, you can download it at runtime to /tmp. Network speed is fantastic, but watch out for the cold start overhead.
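For the curious, here’s roughly what that runtime download trick looks like in Python. It’s just a sketch: the URL is a placeholder for wherever you host the binary (S3, for example).
# Sketch: fetch a pre-compiled binary into /tmp on cold start.
# BINARY_URL is a placeholder, not a real location.
import os
import stat
import urllib.request
BINARY_URL = "https://example.com/ffmpeg-static"
BINARY_PATH = "/tmp/ffmpeg"
def ensure_binary():
    # Warm containers keep /tmp around, so only download on a cold start.
    if not os.path.exists(BINARY_PATH):
        urllib.request.urlretrieve(BINARY_URL, BINARY_PATH)
        os.chmod(BINARY_PATH, os.stat(BINARY_PATH).st_mode | stat.S_IEXEC)
    return BINARY_PATH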
With Docker you would just throw it into the Dockerfile and let the magic happen.
As of June 2017, the officially supported runtimes are Node.js, Python, Java (plus Groovy and Scala) and C#.
Unofficially: Go, Clojure, PHP, Ruby, Haskell, Swift, F# and some others.
Using an unofficial language is cumbersome, much like compiling binaries.
With Docker it’s as simple as FROM brainfuck:latest.
It’s a plain Docker image with your code inside. Sure, there is some custom logic that relies on the FaaS API, but I feel better already.
With Docker it doesn’t matter where you docker pull from: Google or OpenWhisk.
Okay, so enough theory. I hope you share some of my excitement now.
To begin with, I need to know what kind of environment Lambda is running in. I want to look around and try things to see whether it’s possible to run a Docker daemon there.
For my experiments I used a handy tool called lambdash from Eric Hammond.
Did you know Eric is officially AWS Community Hero?
You can run any command there. Seriously, check it out.
So I started playing with absurd commands à la sudo apt-get install docker and so on. No surprise, it didn’t work out.
Then I tried to install Docker from static binaries. Downloading and unpacking went fine, until I tried to start it. Docker needs root access to start a daemon 😓. Of course that’s not available in such a limited environment as Lambda.
So how limited is it? For starters, the only writable path is /tmp. Can Docker even run here?
My brain replied “NO”, but my heart was telling me “GO RESEARCH, VLAD”.
So I did the research. What are the options?
The 1st option is not truly “serverless”. It requires some real servers running. So I was about to choose the 2nd, until one random day I found udocker.
Only 117 stars on GitHub. This gem deserves more!
Execute docker containers without root privileges.
This is insane.
Needless to say, this project made the whole thing possible. These smart guys figured out you can unpack a Docker image and execute it in an emulated environment. Well done.
You can find all the code on GitHub. But first, some disclaimers:
In Lambda, the only place you are allowed to write is /tmp. But udocker will attempt to write to the home directory by default. I need to change its mind.
export HOME=/tmp
export UDOCKER_DIR=/tmp
export UDOCKER_BIN=/tmp
export UDOCKER_LIB=/tmp
export UDOCKER_CONTAINERS=/tmp
Voilà. Your home is in /tmp now. Do whatever you want.
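If you’re driving udocker from Python rather than a shell, the same redirection is a few lines of os.environ. A minimal sketch:
# Point udocker (and anything else that respects HOME) at /tmp before the first call.
import os
for var in ("HOME", "UDOCKER_DIR", "UDOCKER_BIN", "UDOCKER_LIB", "UDOCKER_CONTAINERS"):
    os.environ[var] = "/tmp"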
Next, let’s download the udocker Python script.
$ cd /tmp
$ curl https://raw.githubusercontent.com/indigo-dc/udocker/udocker-fr/udocker.py > udocker
$ python udocker version
If everything went well, you’ll see something like:
Info: creating repo: /tmp
Info: installing from tarball 1.1.0-RC2
Info: downloading: https://owncloud.indigo-datacl...
udocker 1.1.0-RC2
I’m only showing the commands I ran; how you run them is up to you. I used lambdash for the sake of simplicity. You may want to spawn these commands from the Node.js/Java/Python code you write for your functions.
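If you go the Python route, the spawning part could look roughly like this. Only a sketch, assuming udocker is already sitting in /tmp and the environment variables above are set:
# Sketch of a Python Lambda handler that shells out to udocker.
import subprocess
def run_udocker(*args):
    # Run a udocker subcommand and return its stdout as text.
    return subprocess.check_output(["python", "/tmp/udocker"] + list(args)).decode()
def handler(event, context):
    print(run_udocker("version"))
    return {"ok": True}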
The following command downloads the image from Docker Hub to your /tmp.
$ python udocker pull ubuntu:17.04
The next step is important. You need to create a container AND set up the execution engine mode. Since PRoot isn’t working in Lambda (there’s a bug), I tried the second option.
Plenty of options to choose from; I’m not sure what they all mean. Help me, Linux experts.
$ python udocker create --name=ubuntu ubuntu:17.04
$ python udocker setup --execmode=F1 ubuntu
Finally, hold your breath and run it.
$ python udocker run --nosysdirs ubuntu cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=17.04
DISTRIB_CODENAME=zesty
DISTRIB_DESCRIPTION="Ubuntu 17.04"
That’s all. This is a small proof of concept, but you get the idea. Go experiment with images and tell me what you think in the comments below.
It’s a pain in the ass: downloading udocker together with the docker image every 4 hours, sacrificing startup time, clogging your /tmp up to 100%, and so on. (Not anymore, read the UPD below.)
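One thing softens the pain, though: /tmp survives between warm invocations, so you can skip the whole bootstrap when the bits are already there. A rough sketch, assuming the same paths and image as above:
# Rough sketch: only bootstrap udocker and the image when /tmp is empty (cold start).
import os
import subprocess
import urllib.request
UDOCKER = "/tmp/udocker"
UDOCKER_URL = "https://raw.githubusercontent.com/indigo-dc/udocker/udocker-fr/udocker.py"
def bootstrap(image="ubuntu:17.04", name="ubuntu"):
    for var in ("HOME", "UDOCKER_DIR", "UDOCKER_BIN", "UDOCKER_LIB", "UDOCKER_CONTAINERS"):
        os.environ[var] = "/tmp"
    if not os.path.exists(UDOCKER):  # warm invocations skip all of this
        urllib.request.urlretrieve(UDOCKER_URL, UDOCKER)
        subprocess.check_call(["python", UDOCKER, "pull", image])
        subprocess.check_call(["python", UDOCKER, "create", "--name=" + name, image])
        subprocess.check_call(["python", UDOCKER, "setup", "--execmode=F1", name])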
Still, it’s doable if you have a specific use-case. Who knows, maybe this post will help bring native Docker support to AWS (#awswishlist, I’m looking at you).
I believe it will inspire the community to build the next crazy ideas, like these folks inspired me:
SSH-ing into your AWS Lambda Functions: Finally proof that serverless has servers? (medium.com)
How to get headless Chrome running on AWS Lambda: An adventure in getting Chrome (read: Chromium) to run “serverless-ly” from compiling it to deploying it on AWS Lambda. (medium.com)
P.S. Thanks for reading. Click 💚 and subscribe if you want more ;)
P.P.S. UPD: A month before this post, Germán Moltó developed a framework implementing similar ideas for running Docker in Lambda!
Meet the SCAR — Serverless Container-aware ARchitectures!
grycap/scar: Serverless Container-aware ARchitectures (e.g. Docker containers in AWS Lambda) (github.com)