Or let's write our own init process

When your process runs as [PID](https://hackernoon.com/tagged/pid) 1 in a Docker container, signal handling behaves differently from what you might expect.

First, let's sanity check what happens when a process is not PID 1 on a "normal" system, using a simple [Python](https://hackernoon.com/tagged/python) process that just sleeps:

    Aarons-iMac:bin aaronkalair$ cat mypy.py
    import subprocess
    subprocess.call(["sleep", "100"])

If we run it and send it `SIGTERM`:

    Aarons-iMac:init-proc aaronkalair$ ps -ef | grep python
      501 14013  6588   0  2:08pm ttys004    0:00.02 python mypy.py
    Aarons-iMac:bin aaronkalair$ kill 14013
    Terminated: 15

It gets terminated; nothing surprising here.

Now let's run it as PID 1 in a Docker container:

    Aarons-iMac:bin aaronkalair$ cat Dockerfile
    FROM ubuntu:16.04
    RUN apt-get update
    RUN apt-get install -y python
    COPY mypy.py /srv/
    CMD ["python", "/srv/mypy.py"]

Run this container, exec in, and then send the same signal:

    Aarons-iMac:init-proc aaronkalair$ docker exec -it 0229aa205b48 bash
    root@0229aa205b48:/# ps -ef
    UID        PID  PPID  C STIME TTY          TIME CMD
    root         1     0  0 14:15 ?        00:00:00 python /srv/mypy.py
    root         7     1  0 14:15 ?        00:00:00 sleep 100
    root@0229aa205b48:/# kill 1
    root@0229aa205b48:/# ps -ef
    UID        PID  PPID  C STIME TTY          TIME CMD
    root         1     0  0 14:15 ?        00:00:00 python /srv/mypy.py
    root         7     1  0 14:15 ?        00:00:00 sleep 100

And now nothing happens!

Let's try this with a Go process that does something similar:

    package main

    import (
        "time"
    )

    func main() {
        time.Sleep(time.Duration(100000) * time.Millisecond)
    }

Pop this into a Docker container, run it, exec in and send it `SIGTERM`:

    Aarons-iMac:init-proc aaronkalair$ docker exec -it e6ccf11be060 bash
    root@e6ccf11be060:/# ps -ef
    UID        PID  PPID  C STIME TTY          TIME CMD
    root         1     0  0 14:28 ?        00:00:00 ./srv/sleep-spawner
    root@e6ccf11be060:/# kill 1
    root@e6ccf11be060:/# Aarons-iMac:init-proc aaronkalair$

And it's killed, just as it would be if it weren't running as PID 1.

So what's going on here? PID 1 is special in Linux: among other things, it ignores any signal unless a handler for that signal has been explicitly installed. Python does not install a `SIGTERM` handler by default, so the signal is dropped. The Go runtime installs its own signal handlers at startup and exits on `SIGTERM`, which is why the Go program still dies.

From the Docker docs ([https://docs.docker.com/engine/reference/run/#foreground](https://docs.docker.com/engine/reference/run/#foreground)):

> **Note**: A process running as PID 1 inside a container is treated specially by Linux: it ignores any signal with the default action. So, the process will not terminate on `SIGINT` or `SIGTERM` unless it is coded to do so.

We could define handlers for those signals in every process we want to run in a Docker container, but this is a lot of work and we may not have the source code to do so. Furthermore, PID 1 has other responsibilities that we'll explore later.

So instead we could run a different process as PID 1 and have it proxy signals to the actual process we want to run, as well as perform the other duties of a standard init process.

There are numerous solutions that do this, for example:

Yelp's `dumb-init`: [https://github.com/Yelp/dumb-init](https://github.com/Yelp/dumb-init)

`Tini`, which is shipped with Docker: [https://docs.docker.com/engine/reference/run/#specify-an-init-process](https://docs.docker.com/engine/reference/run/#specify-an-init-process)

And many more, which you can find by searching around.

But I'm going to write my own…

So let's start with the basics. I need a program that takes the name of another process to execute and executes it:

    package main

    import (
        "os"
        "os/exec"
    )

    func main() {
        // os.Args[1] is the command to run, the rest are its arguments
        cmd := exec.Command(os.Args[1], os.Args[2:]...)
        err := cmd.Start()
        if err != nil {
            panic(err)
        }
        // Block until the child exits and release its resources
        err = cmd.Wait()
        if err != nil {
            panic(err)
        }
    }

How this program starts the child and then waits on it will matter later on.
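One thing the launcher above does not do is pass the child's exit status along: `cmd.Wait()` returns an error whenever the child exits non-zero, so the launcher panics instead of exiting with the same code. The following is a minimal sketch of one way to forward the status. It is not taken from the original source, and `exitCodeOf` is a hypothetical helper name introduced only for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
)

// exitCodeOf (hypothetical helper) runs the command, waits for it, and
// returns its exit code. A child killed by a signal is reported as -1,
// which is what ExitCode() returns in that case.
func exitCodeOf(name string, args ...string) (int, error) {
	cmd := exec.Command(name, args...)
	// Share our standard streams so the child's output is visible.
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr

	if err := cmd.Start(); err != nil {
		return 0, err // the command could not be started at all
	}

	err := cmd.Wait()
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		// The child ran but did not exit with status 0.
		return exitErr.ExitCode(), nil
	}
	if err != nil {
		return 0, err // some other failure while waiting
	}
	return 0, nil
}

func main() {
	if len(os.Args) < 2 {
		// Nothing to run: print a hint and return without doing anything.
		fmt.Fprintln(os.Stderr, "usage: init-proc command [args...]")
		return
	}
	code, err := exitCodeOf(os.Args[1], os.Args[2:]...)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	os.Exit(code)
}
```

With this shape, a non-zero exit of the wrapped process becomes the exit code of the container's PID 1, rather than a Go panic trace.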
After we `Start()` the new process we call `Wait()`. This is important: it blocks until the command exits and, once it does, cleans up any resources associated with it. Failing to `wait` on a process you spawn leads to zombie processes, which hang around after they have finished executing and still occupy a slot in the kernel's process table.

From the man page ([http://man7.org/linux/man-pages/man2/waitpid.2.html#NOTES](http://man7.org/linux/man-pages/man2/waitpid.2.html#NOTES)):

> A child that terminates, but has not been waited for becomes a "zombie". The kernel maintains a minimal set of information about the zombie process (PID, termination status, resource usage information) in order to allow the parent to later perform a wait to obtain information about the child. As long as a zombie is not removed from the system via a wait, it will consume a slot in the kernel process table, and if this table fills, it will not be possible to create further processes.

So let's try out our new proxy process. If we run it in a container…

    CMD ["./srv/init-proc", "/srv/sleep-spawner", "1"]

We can see that our proxy process is now PID 1 and has spawned off sleep-spawner:

    root@36c4892039db:/# ps -ef
    UID        PID  PPID  C STIME TTY          TIME CMD
    root         1     0  0 17:45 ?        00:00:00 ./srv/init-proc /srv/sleep-spawner 1
    root        11     1  0 17:45 ?        00:00:00 /srv/sleep-spawner 1

Alright, the next step is to register our interest in all the possible signals (`SIGKILL` and `SIGSTOP` cannot be caught, so they are never relayed):

    // Needs "os/signal" and "syscall" in addition to the earlier imports
    func main() {
        signalChannel := make(chan os.Signal, 2)
        // With no signals listed, Notify relays every incoming signal
        signal.Notify(signalChannel)
        pid := -1
        go sigHandler(&pid, signalChannel)
        cmd := exec.Command(os.Args[1], os.Args[2:]...)
        err := cmd.Start()
        if err != nil {
            panic(err)
        }
        // Only read the PID once Start has succeeded; cmd.Process is nil otherwise
        pid = cmd.Process.Pid
        err = cmd.Wait()
        if err != nil {
            panic(err)
        }
    }

With `sigHandler` defined as:

    func sigHandler(pid *int, signalChannel chan os.Signal) {
        var sigToSend syscall.Signal = syscall.SIGHUP
        for {
            sig := <-signalChannel
            switch sig {
            // #1 - Sent when the controlling terminal is closed, typically used by daemonised processes to reload config
            case syscall.SIGHUP:
                sigToSend = syscall.SIGHUP
            // #2 - Like pressing CTRL+C
            case syscall.SIGINT:
                sigToSend = syscall.SIGINT
            // ..... repeat for all signals
            }
            // Skip forwarding until the child exists: kill(-1, ...) would signal every process
            if *pid > 0 {
                syscall.Kill(*pid, sigToSend)
            }
        }
    }

It simply switches on all the signals Go supports ([https://golang.org/pkg/syscall/#pkg-constants](https://golang.org/pkg/syscall/#pkg-constants)) and then uses the `kill` system call to send the signal through to the process that's being run.

Now let's use it to run our Python program and see if it handles `SIGTERM` correctly.

    Aarons-iMac:init-proc aaronkalair$ docker exec -it 579ef1d3ce77 bash
    root@579ef1d3ce77:/# ps -ef
    UID        PID  PPID  C STIME TTY          TIME CMD
    root         1     0  0 18:33 ?        00:00:00 ./srv/init-proc python /srv/mypy.py
    root        13     1  0 18:33 ?        00:00:00 python /srv/mypy.py
    root        14    13  0 18:33 ?        00:00:00 sleep 100
    root@579ef1d3ce77:/# kill 1
    root@579ef1d3ce77:/# Aarons-iMac:init-proc aaronkalair$

And it works!

Now let's take care of another thing PID 1 is responsible for: cleaning up zombie processes.

Imagine this scenario:

    A --spawns--> B --spawns--> C

If B dies or exits before C, C becomes an orphan process. Who is C's parent now?

The operating system is responsible for reparenting orphan processes to PID 1, so it now looks like:

    A --parent of--> C

Now when C exits, A will receive the `SIGCHLD` signal and is responsible for calling `wait` on C so that C does not linger as a zombie.
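To make the zombie state concrete, here is a small Linux-only sketch that is not part of the original program. It starts a short-lived child, deliberately delays the `wait`, and reads the child's state from `/proc/<pid>/stat`, where `Z` marks a zombie. The helper names `procState` and `observeZombie` are hypothetical, and the sketch assumes a `true` binary is available on the `PATH`.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
	"time"
)

// procState returns the one-letter process state from /proc/<pid>/stat,
// for example "S" for sleeping or "Z" for zombie. Linux only.
func procState(pid int) (string, error) {
	data, err := os.ReadFile(fmt.Sprintf("/proc/%d/stat", pid))
	if err != nil {
		return "", err
	}
	// The format is "pid (comm) state ...". The command name may itself
	// contain ')' so the state is located relative to the last ')'.
	s := string(data)
	i := strings.LastIndex(s, ")")
	if i < 0 || i+2 >= len(s) {
		return "", fmt.Errorf("unexpected format in /proc/%d/stat", pid)
	}
	return string(s[i+2]), nil
}

// observeZombie (hypothetical helper) reports the child's state before it
// is waited on, and whether its /proc entry has disappeared after the wait.
func observeZombie() (stateBeforeWait string, goneAfterWait bool, err error) {
	cmd := exec.Command("true") // assumption: `true` exists on PATH
	if err = cmd.Start(); err != nil {
		return "", false, err
	}
	pid := cmd.Process.Pid

	// Poll for up to about two seconds until the child has exited.
	for i := 0; i < 200; i++ {
		if stateBeforeWait, err = procState(pid); err != nil {
			return "", false, err
		}
		if stateBeforeWait == "Z" {
			break
		}
		time.Sleep(10 * time.Millisecond)
	}

	// Waiting reaps the child, which releases its process table entry.
	if err = cmd.Wait(); err != nil {
		return stateBeforeWait, false, err
	}
	_, statErr := os.Stat(fmt.Sprintf("/proc/%d", pid))
	return stateBeforeWait, os.IsNotExist(statErr), nil
}

func main() {
	state, gone, err := observeZombie()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("state before wait: %s, /proc entry gone after wait: %v\n", state, gone)
}
```

The same `Z` state is what `ps` reports as `<defunct>`.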
So let's add this logic to the `SIGCHLD` case:

    case syscall.SIGCHLD:
        var status syscall.WaitStatus
        var rusage syscall.Rusage
        // Reap one exited child, if there is one, without blocking
        syscall.Wait4(-1, &status, syscall.WNOHANG, &rusage)
        sigToSend = syscall.SIGCHLD

`-1` means wait for any child process to change state rather than a specific one, as we don't know the ID of the process that has exited when we get the signal.

`WNOHANG` means that if no child process has changed state, don't block waiting for one; return immediately.

Performing `wait` on a terminated child cleans up its resources, preventing it from remaining a zombie process.

From the `wait` man page ([http://man7.org/linux/man-pages/man2/waitpid.2.html](http://man7.org/linux/man-pages/man2/waitpid.2.html)):

> In the case of a terminated child, performing a wait allows the system to release the resources associated with the child; if a wait is not performed, then the terminated child remains in a "zombie" state

Now there's just one more case to handle. Imagine:

    A --spawns--> B --spawns--> C

Now C exits, but B doesn't call `wait` on it:

    A --parent of--> B --parent of--> C (defunct zombie process)

`wait` only works on a process's own children, so no matter how many times our init process A calls `wait`, it won't clean up the resources C is using. (Note also that `SIGCHLD` is only sent to B, so A isn't even aware of C exiting.)

Now B exits, A receives `SIGCHLD`, calls `wait`, and B is cleaned up nicely.
C is now an orphan that gets reparented to A, so we have:

    A --parent of--> C (defunct zombie process)

We can see the above in action with some modifications to our sleeping program, so that it produces processes whose parents exit before their children and don't call `wait`:

    package main

    import (
        "os"
        "os/exec"
        "strconv"
        "time"
    )

    const maxLevel = 4

    func main() {
        level, err := strconv.Atoi(os.Args[1])
        if err != nil {
            panic(err)
        }
        // We'll have a bunch of processes that immediately exit at the max level
        if level == maxLevel {
            return
        }
        // Need the top level to outlive the others, otherwise the container would exit
        // and you wouldn't be able to inspect the process tree
        sleepTime := 0
        if level == 1 {
            sleepTime = 20000000
        } else {
            // Generate processes where children sleep for longer than their parents, so parents
            // exit first without waiting on the children, showing what happens to orphan / zombie processes
            sleepTime = level * 1000
        }
        level += 1
        for i := 0; i < 2; i++ {
            // Spawn a command and intentionally don't wait on it
            err := exec.Command("/srv/sleep-spawner", strconv.Itoa(level)).Start()
            if err != nil {
                panic(err)
            }
        }
        time.Sleep(time.Duration(sleepTime) * time.Millisecond)
    }

It's available on GitHub here: [https://github.com/AaronKalair/sleep-spawner](https://github.com/AaronKalair/sleep-spawner)

And if we run this we can see what the process tree looks like:

    Aarons-iMac:init-proc aaronkalair$ docker exec -it 854a232d4b89 bash
    root@854a232d4b89:/# ps -ef
    UID        PID  PPID  C STIME TTY          TIME CMD
    root         1     0  0 22:13 ?        00:00:00 ./srv/init-proc /srv/sleep-spawner 1
    root        12     1  0 22:13 ?        00:00:00 /srv/sleep-spawner 1
    root        17    12  0 22:13 ?        00:00:00 [sleep-spawner] <defunct>
    root        22    12  0 22:13 ?        00:00:00 [sleep-spawner] <defunct>
    root        32     1  0 22:13 ?        00:00:00 [sleep-spawner] <defunct>

With our current implementation this will remain the situation forever, because each `SIGCHLD` triggers only a single `wait`. We need to modify it slightly to handle cases like this:

    case syscall.SIGCHLD:
        var status syscall.WaitStatus
        var rusage syscall.Rusage
        for {
            retValue, err := syscall.Wait4(-1, &status, syscall.WNOHANG, &rusage)
            // ECHILD just means there are no children left to wait for
            if err == syscall.ECHILD {
                break
            }
            if err != nil {
                panic(err)
            }
            // 0 means children exist but none has exited yet
            if retValue <= 0 {
                break
            }
        }
        sigToSend = syscall.SIGCHLD

We take advantage of the return value of `wait4` when used in combination with `WNOHANG` to call it in a loop every time we get a `SIGCHLD` signal.

Again from the man page (wait4's return value conforms to waitpid: [http://man7.org/linux/man-pages/man2/waitpid.2.html](http://man7.org/linux/man-pages/man2/waitpid.2.html)):

> on success, returns the process ID of the child whose state has changed; if **WNOHANG** was specified and one or more child(ren) specified by _pid_ exist, but have not yet changed state, then 0 is returned. On error, -1 is returned.

So we can keep calling `Wait4` until we get a return value less than or equal to 0, knowing that each successful call cleans up one exited process.

Now if we run this, exec inside the container and check with `ps`:

    Aarons-iMac:init-proc aaronkalair$ docker exec -it 30f13d4e53bd bash
    root@30f13d4e53bd:/# ps -ef
    UID        PID  PPID  C STIME TTY          TIME CMD
    root         1     0  0 22:05 ?        00:00:00 ./srv/init-proc /srv/sleep-spawner 1
    root        12     1  0 22:05 ?        00:00:00 /srv/sleep-spawner 1
    root        17    12  0 22:05 ?        00:00:00 [sleep-spawner] <defunct>
    root        18    12  0 22:05 ?        00:00:00 [sleep-spawner] <defunct>

We can see that the zombies parented to PID 1 have now been cleaned up! (The two remaining zombies belong to PID 12, which never waits on them.)

And there we have it: we've made a basic init process that lets us send signals to processes running in Docker containers and have them behave the same way they would outside of a container, and that can clean up zombie processes!
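The reaping loop above can also be exercised outside a container. The sketch below is a hypothetical test harness rather than the author's code: `reapExited` and `spawnAndReap` are illustrative names, and it assumes a Unix-like system with a `true` binary on the `PATH`. It starts a few children without waiting on them, then drains them with `Wait4(-1, ..., WNOHANG, ...)`, treating an error such as `ECHILD` as "nothing left to reap".

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
	"time"
)

// reapExited collects every child that has already exited, without
// blocking, and returns the PIDs it reaped.
func reapExited() []int {
	var reaped []int
	for {
		var status syscall.WaitStatus
		pid, err := syscall.Wait4(-1, &status, syscall.WNOHANG, nil)
		if err == syscall.EINTR {
			continue // interrupted by a signal, try again
		}
		if err != nil || pid <= 0 {
			// An error (for example ECHILD) means there are no children.
			// pid == 0 means children exist but none has exited yet.
			return reaped
		}
		reaped = append(reaped, pid)
	}
}

// spawnAndReap (hypothetical harness) starts n short-lived children
// without waiting on them, then reports how many of them were reaped.
func spawnAndReap(n int) (int, error) {
	started := make(map[int]bool)
	for i := 0; i < n; i++ {
		cmd := exec.Command("true") // assumption: `true` exists on PATH
		if err := cmd.Start(); err != nil {
			return 0, err
		}
		started[cmd.Process.Pid] = true
	}

	count := 0
	deadline := time.Now().Add(5 * time.Second)
	for count < n && time.Now().Before(deadline) {
		for _, pid := range reapExited() {
			if started[pid] {
				count++
			}
		}
		time.Sleep(10 * time.Millisecond)
	}
	return count, nil
}

func main() {
	count, err := spawnAndReap(3)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("reaped %d children\n", count)
}
```

Polling is used here only to keep the sketch short. In the init process itself, the loop runs in response to `SIGCHLD` instead.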
See the full source code here: [https://github.com/AaronKalair/init-proc](https://github.com/AaronKalair/init-proc)

[Follow me on Twitter @AaronKalair](http://twitter.com/aaronkalair)