tl;dr use the lvm-direct storage driver. lvm-loopback is for prototyping and Does Not Scale. (Docker’s own docs call these two device mapper modes direct-lvm and loop-lvm.)

I was inspired to write this after reading thehftguy’s hilarious posts about how much he loves docker.

At $day_job we use docker on some of our jenkins slaves to build rpms and to support test execution for apps leveraging docker. Naturally this is a dev environment, so we’re not talking about running docker in production or anything Crazy. But we are talking about devs with pitchforks when jenkins jobs hang, and hang, and hang…

Everything was running smoothly for the most part. Every couple of days a docker node would lock up, with cpu and load avg spiked through the roof. We’d reboot it and everything’d be back to normal. And we accepted that for a while.

It’s funny what you find when you go diving through logs.

Wait, what? You mean it’s not as easy as yum install docker? But I thought you said the path to the land of milk and honey was paved with docker? Hmmm :\

So it turns out storage with docker is kinda funny. Look for yourself, here’s the storage brochure…

https://docs.docker.com/engine/userguide/storagedriver/selectadriver/

It started with AUFS, then some heroes at RedHat wrote device mapper plugins (thank you!), and now there’s overlay, overlay2, btrfs, zfs, all-kinds-of-fsss. Here’s a fantastic article explaining the nitty-gritty on the history and development of the device mapper drivers.

The short version is, out-of-the-box docker on a rhel/centos system uses the lvm-loopback storage driver. It’s quick, it’s easy and it works at small scales. If you wanna run real workloads on docker (on rhel/centos) you HAVE to use the lvm-direct driver. lvm-direct uses thin provisioning, so it’s not your average lvm setup task.

Here’s a quick and dirty guide to setting up the lvm-direct driver on a New node:

Add an additional, unpartitioned, un-filesystemed block device to your node OR, when you provision the node, leave a percentage of the lvm volume group free (see useful links).
Since RedHat published docker 1.12, all you need to do is run the docker-storage-setup service. This Should detect your additional block device/free vg space and configure the thin-pool automagically.

That’s it. Start the docker daemon and validate. `docker info 2> /dev/null | grep loop` should return nothing. Run a container and see what happens.

It’s been 27 days since switching to lvm-direct and there’s been no sign of the angry ‘why are my docker jobs hanging’ mob.

I suppose the real message here is you can’t just yum install $new_hotness and expect it to work. Perhaps if someone had RTFM we could’ve skipped this whole painful lesson. Naaaa…

Some useful links:

Friends Don’t Let Friends Run Docker on Loopback in Production
Comprehensive Overview of Storage Scalability in Docker
Configuring Docker Storage
How to Leave Space in a Volume Group
Docker and the Device Mapper storage driver
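
If you want to script the `docker info ... | grep loop` validation from the setup steps above, say as a pre-flight check on a jenkins node, here’s a rough sketch. The function name `check_docker_storage` is made up for this post, and it assumes the loopback setup shows up in `docker info` as lines containing "loop" (e.g. "Data loop file"), which is what the grep above relies on.

```shell
#!/bin/sh
# Hypothetical helper (not part of docker): reads `docker info` output on
# stdin and reports whether any loopback-backed storage shows up in it.
# Assumption: loopback mode prints lines containing "loop", such as
# "Data loop file: ...", and a thin-pool setup prints none.
check_docker_storage() {
    if grep -qi 'loop'; then
        echo "loopback"   # still on loopback files, expect pain
        return 1
    fi
    echo "ok"             # no loop devices mentioned
    return 0
}
```

Use it as `docker info 2> /dev/null | check_docker_storage`; the non-zero exit status on loopback makes it easy to fail a job early instead of letting it hang later.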