In the last couple of weeks I’ve given talks at DockerCon and Craft Conference where I’ve shown how a container works by building one from scratch. When I run my home-grown container it has always slightly bothered me that there are more Linux processes created than I can account for. Someone in the audience spotted it too, and asked why there are more processes than we can see in ps.

    $ go run main.go run /bin/bash
    Running [/bin/bash] as PID 21569
    Running [/bin/bash] as PID 1
    root@container:/# ps
      PID TTY          TIME CMD
        1 ?        00:00:00 exe
        4 ?        00:00:00 bash
        9 ?        00:00:00 ps
    root@container:/#

My code does a fork/exec to run /proc/self/exe within a new set of namespaces, which is to say, it runs the same program again within these namespaces. This explains the process with ID 1 that’s running exe.

This time the program is given a different command (child instead of run), which causes it to fork/exec to run whatever arbitrary command it has been given, in this case /bin/bash. As a child process this inherits the same set of namespaces as its parent.

We can see bash in the process list, but why is it given process ID 4? This happens every single time. What happens to processes 2 and 3?

Inspecting syscalls with strace

To find out, I’m going to run my container code under the system call tracing utility, strace. First I want to compile the code; in talks I usually invoke the code with go run main.go ... to save having a separate compile step, but I don’t want to strace the compilation.

    $ go build -o container .

I can now invoke the code with ./container run <cmd> <args>. So:

    $ strace ./container run echo hello

This shows that there are quite a lot of syscalls being made! Let’s grep for clone, which is the syscall Linux uses to create a new process or thread.
    $ strace ./container run echo hello 2>&1 | grep clone
    clone(child_stack=0xc820033fc0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD) = 21928
    clone(child_stack=0xc820035fc0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD) = 21929
    clone(child_stack=0, flags=CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWPID|SIGCHLD) = 21930

The clone that corresponds to my new namespace is the last of these three. I can tell that because of the CLONE_NEW* flags that I passed in. So where did the other two come from?

Strace of Go’s minimal program

To find out, I built a minimal Go program that does nothing.

    package main

    func main() {
        return
    }

Building that and running it under strace…

    $ go build -o minimal .
    $ strace ./minimal 2>&1 | grep clone
    clone(child_stack=0xc820031fc0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD) = 22066
    clone(child_stack=0xc820033fc0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD) = 22067

…it appears we will always see two calls to clone whenever we run a Go executable!

I think that the CLONE_THREAD flag explains why we don’t see these processes in the output from ps. It sets up the child processes in the same thread group as the parent. From the clone man page:

    Thread groups were a feature added in Linux 2.4 to support the POSIX threads notion of a set of threads that share a single PID. Internally, this shared PID is the so-called thread group identifier (TGID) for the thread group. Since Linux 2.4, calls to getpid(2) return the TGID of the caller.

But why?

So I’ve proven to myself that we get these extra processes (or perhaps we should just call them threads) when we run a Go executable, but I haven’t explained why (I did say at the top that I had only sort-of figured it out!). The full output from strace suggests that it’s something to do with signal handling: there are a lot of calls to rt_sigaction, rt_sigprocmask and sigaltstack before we get the process that really does the work.
Perhaps it’s related to Go’s concurrency handling?

Edit: Phil Pearl pointed out that one of these threads will be Go’s garbage collector.

Know more? I’d love to hear about it!

Let me know if you found this helpful or interesting by hitting the recommend button 💚. Thanks!