Or let's write our own init process

When your process runs as PID 1 in a Docker container, signal handling behaves differently from what you might expect.

First let's sanity check what happens when a process is not PID 1 on a “normal” system.

A simple Python process that just sleeps:

    Aarons-iMac:bin aaronkalair$ cat mypy.py
    import subprocess
    subprocess.call(["sleep", "100"])

And if we run it and send SIGTERM:

    Aarons-iMac:init-proc aaronkalair$ ps -ef | grep python
    501 14013 6588 0 2:08pm ttys004 0:00.02 python mypy.py
    Aarons-iMac:bin aaronkalair$ kill 14013
    Terminated: 15

It gets terminated, nothing surprising here.

And now let's run it as PID 1 in a Docker container:

    Aarons-iMac:bin aaronkalair$ cat Dockerfile
    FROM ubuntu:16.04
    RUN apt-get update
    RUN apt-get install -y python
    COPY mypy.py /srv/
    CMD ["python", "/srv/mypy.py"]

Run this container, exec in and then send the same signal:

    Aarons-iMac:init-proc aaronkalair$ docker exec -it 0229aa205b48 bash
    root@0229aa205b48:/# ps -ef
    UID PID PPID C STIME TTY TIME CMD
    root 1 0 0 14:15 ? 00:00:00 python /srv/mypy.py
    root 7 1 0 14:15 ? 00:00:00 sleep 100
    root@0229aa205b48:/# kill 1
    root@0229aa205b48:/# ps -ef
    UID PID PPID C STIME TTY TIME CMD
    root 1 0 0 14:15 ? 00:00:00 python /srv/mypy.py
    root 7 1 0 14:15 ? 00:00:00 sleep 100

And now nothing happens!

Let's try this with a Go process that does something similar:

    package main

    import (
        "time"
    )

    func main() {
        time.Sleep(time.Duration(100000) * time.Millisecond)
    }

Pop this into a Docker container, run it, exec in and send it SIGTERM:

    Aarons-iMac:init-proc aaronkalair$ docker exec -it e6ccf11be060 bash
    root@e6ccf11be060:/# ps -ef
    UID PID PPID C STIME TTY TIME CMD
    root 1 0 0 14:28 ? 00:00:00 ./srv/sleep-spawner
    root@e6ccf11be060:/# kill 1
    root@e6ccf11be060:/# Aarons-iMac:init-proc aaronkalair$

And it's killed, just as it would be if it weren't running as PID 1.

So what's going on here then?
Well, PID 1 is special in Linux. Amongst other things, it ignores any signal whose action is still the default, that is, any signal for which the process has not explicitly installed a handler.

From the Docker docs (https://docs.docker.com/engine/reference/run/#foreground):

    Note: A process running as PID 1 inside a container is treated specially by Linux: it ignores any signal with the default action. So, the process will not terminate on SIGINT or SIGTERM unless it is coded to do so.

That also explains the difference between the two programs above. The Python script never installs a SIGTERM handler, so the signal is ignored. The Go runtime installs its own signal handlers at startup, so SIGTERM is not left at its default action and the process exits.

We could just define handlers for those signals in every process we want to run in a Docker container, but this is a lot of work and we may not have the source code to do so. Furthermore, there are other responsibilities for PID 1 that we'll explore later.

So instead we could run a different process as PID 1 and have it proxy signals to the actual process we want to run, and perform the other duties of a standard init process.

There are numerous solutions that do this, for example:

- Yelp's dumb-init: https://github.com/Yelp/dumb-init
- Tini, which is shipped with Docker: https://docs.docker.com/engine/reference/run/#specify-an-init-process

And many more which you can find by searching around.

But I'm going to write my own…

So let's start with the basics. I need a program that takes the name of another process to execute and executes it:

    package main

    import (
        "os"
        "os/exec"
    )

    func main() {
        // Everything after our own name is the command to run, plus its arguments
        cmd := exec.Command(os.Args[1], os.Args[2:]...)
        err := cmd.Start()
        if err != nil {
            panic(err)
        }
        // Block until the command exits, then release its resources
        err = cmd.Wait()
        if err != nil {
            panic(err)
        }
    }

Some things to note about how we do this, because they will be important later.

After we Start() the new process we call Wait(). This is important: Wait() blocks until the command exits and, once it does, cleans up any resources associated with it.

Failure to wait on a process you spawn leads to zombie processes that hang around once they've finished executing, still occupying an entry in the kernel's process table.

From the man page (http://man7.org/linux/man-pages/man2/waitpid.2.html#NOTES):

A child that terminates, but has not been waited for becomes a "zombie".
The kernel maintains a minimal set of information about the zombie process (PID, termination status, resource usage information) in order to allow the parent to later perform a wait to obtain information about the child. As long as a zombie is not removed from the system via a wait, it will consume a slot in the kernel process table, and if this table fills, it will not be possible to create further processes.

So let's try out our new signal proxy. If we run that in a container…

    CMD ["./srv/init-proc", "/srv/sleep-spawner", "1"]

We can see that our proxy process is now PID 1 and has spawned off sleep-spawner:

    root@36c4892039db:/# ps -ef
    UID PID PPID C STIME TTY TIME CMD
    root 1 0 0 17:45 ? 00:00:00 ./srv/init-proc /srv/sleep-spawner 1
    root 11 1 0 17:45 ? 00:00:00 /srv/sleep-spawner 1

Alright, the next step is to register our interest in all the possible signals:

    func main() {
        signalChannel := make(chan os.Signal, 2)
        // Notify with no signals listed relays every incoming signal to the channel
        signal.Notify(signalChannel)
        pid := -1
        go sigHandler(&pid, signalChannel)

        cmd := exec.Command(os.Args[1], os.Args[2:]...)
        err := cmd.Start()
        if err != nil {
            panic(err)
        }
        // Only read the PID once we know the process actually started
        pid = cmd.Process.Pid
        err = cmd.Wait()
        if err != nil {
            panic(err)
        }
    }

With sigHandler defined as:

    func sigHandler(pid *int, signalChannel chan os.Signal) {
        var sigToSend syscall.Signal = syscall.SIGHUP
        for {
            sig := <-signalChannel
            switch sig {
            // #1 - Sent when the controlling terminal is closed, typically used by daemonised processes to reload config
            case syscall.SIGHUP:
                sigToSend = syscall.SIGHUP
            // #2 - Like pressing CTRL+C
            case syscall.SIGINT:
                sigToSend = syscall.SIGINT
            // ..... repeat for all signals
            }
            // Pass the signal on to the process we are supervising
            syscall.Kill(*pid, sigToSend)
        }
    }

It simply switches on all the signals Go supports (https://golang.org/pkg/syscall/#pkg-constants) and then uses the kill system call to send the signal through to the process that's being run.

Now let's use it to run our Python program and see if it handles SIGTERM correctly.
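As an aside, the forwarding step does not strictly need one case per signal. The following is a minimal sketch of the same idea, not the code from init-proc: it relays whatever arrives on the channel using os.Process.Signal. The function names forward and terminatingSignal, the "sleep 30" child and the channel size are assumptions made for illustration, and the SIGTERM is pushed into the channel by hand to stand in for a signal delivered through signal.Notify.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// forward relays every signal received on sigs to proc.
// SIGCHLD is skipped: it tells us about our own children and is not
// something the supervised process asked to hear about.
func forward(proc *os.Process, sigs <-chan os.Signal) {
	for sig := range sigs {
		if sig == syscall.SIGCHLD {
			continue
		}
		// On Unix systems os.Process.Signal sends the signal with kill(2).
		// The error is ignored because the child may already have exited.
		_ = proc.Signal(sig)
	}
}

// terminatingSignal starts a hypothetical long-running child ("sleep 30"),
// feeds SIGTERM into the forwarding channel as if the init process had
// received it, and returns the number of the signal that ended the child.
func terminatingSignal() (int, error) {
	cmd := exec.Command("sleep", "30")
	if err := cmd.Start(); err != nil {
		return 0, err
	}

	sigs := make(chan os.Signal, 16)
	// A real init process would also call signal.Notify(sigs) here.
	go forward(cmd.Process, sigs)
	sigs <- syscall.SIGTERM

	err := cmd.Wait()
	close(sigs) // lets the forwarding goroutine finish

	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		if ws, ok := exitErr.Sys().(syscall.WaitStatus); ok && ws.Signaled() {
			return int(ws.Signal()), nil
		}
	}
	return 0, fmt.Errorf("child was not ended by a signal: %v", err)
}

func main() {
	sig, err := terminatingSignal()
	if err != nil {
		panic(err)
	}
	fmt.Println("child ended by signal", sig)
}
```

The trade-off is that the explicit switch documents each signal as it goes, while this version stays short. Back to the version with the explicit switch, here is the test run against the Python program: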
    Aarons-iMac:init-proc aaronkalair$ docker exec -it 579ef1d3ce77 bash
    root@579ef1d3ce77:/# ps -ef
    UID PID PPID C STIME TTY TIME CMD
    root 1 0 0 18:33 ? 00:00:00 ./srv/init-proc python /srv/mypy.py
    root 13 1 0 18:33 ? 00:00:00 python /srv/mypy.py
    root 14 13 0 18:33 ? 00:00:00 sleep 100
    root@579ef1d3ce77:/# kill 1
    root@579ef1d3ce77:/# Aarons-iMac:init-proc aaronkalair$

And it works!

Now let's take care of another thing PID 1 is responsible for: cleaning up zombie processes.

Imagine this scenario:

    A --spawns--> B --spawns--> C

Now if B dies or exits before C, C becomes an orphan process. Who is C's parent now?

Well, the operating system is responsible for reparenting orphan processes to PID 1, so it now looks like:

    A --parent of--> C

Now when C exits, A will receive the SIGCHLD signal and is responsible for calling wait on C to clean up this zombie process.

So let's add this logic to the SIGCHLD case:

    case syscall.SIGCHLD:
        var status syscall.WaitStatus
        var rusage syscall.Rusage
        sigToSend = syscall.SIGCHLD
        syscall.Wait4(-1, &status, syscall.WNOHANG, &rusage)

-1 means wait for any child process to change state rather than a specific one, as we don't know the ID of the process that has exited when we get the signal.

WNOHANG means that if there are no child processes that have changed state, don't block waiting for one; return immediately.

Performing wait on a terminated child cleans up its resources, preventing it from remaining a zombie process.

From the wait man page (http://man7.org/linux/man-pages/man2/waitpid.2.html):

    In the case of a terminated child, performing a wait allows the system to release the resources associated with the child; if a wait is not performed, then the terminated child remains in a "zombie" state

Now there's just one more case to handle. Imagine:

    A --spawns--> B --spawns--> C

Now C exits but B doesn't call wait on it:

    A --parent of--> B --parent of--> C (defunct zombie process)

wait only works on a process's own children, so no matter how many times our init process A called wait, it
wouldn't clean up the resources C was using. (And note that SIGCHLD would only be sent to B, so A wouldn't even be aware of C exiting.)

Now B exits, A receives SIGCHLD, calls wait, and B is cleaned up nicely.

C is now an orphan that gets reparented to A, so we have:

    A --parent of--> C (defunct zombie process)

We can see the above in action with some modifications to our sleeping program, to produce processes whose parents exit before their children and don't call wait:

    package main

    import (
        "os"
        "os/exec"
        "strconv"
        "time"
    )

    func main() {
        MAX_LEVEL := 4

        level, err := strconv.Atoi(os.Args[1])
        if err != nil {
            panic(err)
        }

        // We'll have a bunch of processes that immediately exit at the max level
        if level == MAX_LEVEL {
            return
        }

        // Need the top level to outlive the others, otherwise the container would exit
        // and you wouldn't be able to inspect the process tree
        sleepTime := 0
        if level == 1 {
            sleepTime = 20000000
        } else {
            // Generate processes where children sleep for longer than their parents, so parents
            // exit first without waiting on the children, showing what happens to orphan / zombie processes
            sleepTime = level * 1000
        }

        level += 1
        for i := 0; i < 2; i++ {
            // Spawn a command and intentionally don't wait on it
            err := exec.Command("/srv/sleep-spawner", strconv.Itoa(level)).Start()
            if err != nil {
                panic(err)
            }
        }
        time.Sleep(time.Duration(sleepTime) * time.Millisecond)
    }

It's available on GitHub here: https://github.com/AaronKalair/sleep-spawner

And if we run this we can see what the process tree looks like:

    Aarons-iMac:init-proc aaronkalair$ docker exec -it 854a232d4b89 bash
    root@854a232d4b89:/# ps -ef
    UID PID PPID C STIME TTY TIME CMD
    root 1 0 0 22:13 ? 00:00:00 ./srv/init-proc /srv/sleep-spawner 1
    root 12 1 0 22:13 ? 00:00:00 /srv/sleep-spawner 1
    root 17 12 0 22:13 ? 00:00:00 [sleep-spawner] <defunct>
    root 22 12 0 22:13 ? 00:00:00 [sleep-spawner] <defunct>
    root 32 1 0 22:13 ? 00:00:00 [sleep-spawner] <defunct>

With our current implementation this will remain the situation forever, so we need to modify it slightly to handle cases like this:

    case syscall.SIGCHLD:
        var status syscall.WaitStatus
        var rusage syscall.Rusage
        // Keep reaping until no exited child is left to collect
        for {
            retValue, err := syscall.Wait4(-1, &status, syscall.WNOHANG, &rusage)
            if err != nil {
                panic(err)
            }
            if retValue <= 0 {
                break
            }
        }
        sigToSend = syscall.SIGCHLD

We take advantage of the return value of wait4 when used in combination with WNOHANG, calling it in a loop every time we get a SIGCHLD signal.

Again from the man page (wait4's return value conforms to waitpid: http://man7.org/linux/man-pages/man2/waitpid.2.html):

    on success, returns the process ID of the child whose state has changed; if WNOHANG was specified and one or more child(ren) specified by pid exist, but have not yet changed state, then 0 is returned. On error, -1 is returned.

So we can keep calling Wait4 until we get a return value less than or equal to 0, knowing that each successful call cleans up an exited process.

Now if we run this, exec inside the container and check with ps:

    Aarons-iMac:init-proc aaronkalair$ docker exec -it 30f13d4e53bd bash
    root@30f13d4e53bd:/# ps -ef
    UID PID PPID C STIME TTY TIME CMD
    root 1 0 0 22:05 ? 00:00:00 ./srv/init-proc /srv/sleep-spawner 1
    root 12 1 0 22:05 ? 00:00:00 /srv/sleep-spawner 1
    root 17 12 0 22:05 ? 00:00:00 [sleep-spawner] <defunct>
    root 18 12 0 22:05 ? 00:00:00 [sleep-spawner] <defunct>

We can see that the zombies parented to PID 1 have now been cleaned up! The two that remain are children of PID 12, which is still running and has not waited on them.

And there we have it: we've made a basic init process that lets us send signals to processes running in Docker containers and have them behave the same way they would outside of a container, and that can clean up zombie processes!

See the full source code here: https://github.com/AaronKalair/init-proc

Follow me on Twitter @AaronKalair
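
A footnote on the reaping loop: the waitpid man page also lists ECHILD, which is returned when the calling process has no children left to wait for, and EINTR, which can be returned when the call is interrupted. The sketch below is one possible way to treat both as non-fatal. It is not the code from init-proc, and the helper names reapExited, spawnUnwaited and reapAtLeast, the use of the "true" command and the polling in the demo are assumptions made purely for illustration. A real init process would call the reaping function from its SIGCHLD case rather than poll.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
	"syscall"
	"time"
)

// reapExited collects every child that has already exited, without blocking,
// and returns how many were reaped.
func reapExited() (int, error) {
	reaped := 0
	for {
		var status syscall.WaitStatus
		pid, err := syscall.Wait4(-1, &status, syscall.WNOHANG, nil)
		switch {
		case errors.Is(err, syscall.EINTR):
			continue // interrupted, try again
		case errors.Is(err, syscall.ECHILD):
			return reaped, nil // no children left at all
		case err != nil:
			return reaped, err
		case pid == 0:
			return reaped, nil // children exist, but none has exited yet
		}
		reaped++
	}
}

// spawnUnwaited starts n short-lived children and deliberately never waits
// on them, so each one becomes a zombie when it exits.
func spawnUnwaited(n int) error {
	for i := 0; i < n; i++ {
		if err := exec.Command("true").Start(); err != nil {
			return err
		}
	}
	return nil
}

// reapAtLeast polls reapExited until at least want children have been
// reaped or timeoutMs milliseconds have passed.
func reapAtLeast(want, timeoutMs int) (int, error) {
	deadline := time.Now().Add(time.Duration(timeoutMs) * time.Millisecond)
	total := 0
	for total < want && time.Now().Before(deadline) {
		n, err := reapExited()
		if err != nil {
			return total, err
		}
		total += n
		time.Sleep(10 * time.Millisecond)
	}
	return total, nil
}

func main() {
	if err := spawnUnwaited(3); err != nil {
		panic(err)
	}
	n, err := reapAtLeast(3, 5000)
	if err != nil {
		panic(err)
	}
	fmt.Println("reaped", n, "children")
}
```

One thing to keep in mind with any Wait4(-1, ...) loop is that it can also collect the main child before cmd.Wait() does, so the two need to agree on who reports that exit status.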