In part 2, I'd talked about how when you press Ctrl-Z, the shell sends the SIGTSTP signal to the process that's running.
While that is true, it isn't quite the whole truth. In reality, the signal is sent to the entire process group that the process belongs to.
What is a process group?
On Unix systems, a process group is a collection of processes. Each process group is created from an initial or top-most process (the process group leader), and the process group ID (PGID) is the same as the process ID (PID) of that top-most process.
Process groups are most commonly used by shells. Whenever you run a command from a shell, a new process group is created for the command. Here’s some example output for ps fo pid,pgid,comm:
    PID    PGID COMMAND
  16528   16528 zsh
 520283  520283  \_ cargo
 520359  520283      \_ rustc
 520387  520283      \_ rustc
 520642  520283      \_ rustc
 520644  520283      \_ rustc
In this example, zsh (PID 16528) has created a process group for cargo (PID/PGID 520283). The cargo process has spun up four rustc processes, and each of those has inherited its PGID from the cargo process.
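To see the same inheritance without a shell, here's a minimal Python sketch (not anything cargo or zsh actually runs). It starts a "leader" process in a fresh process group, roughly the way a job-control shell would, and has that leader start one child. The names `LEADER_SCRIPT` and `show_group_inheritance` are made up for illustration, and the sketch assumes a Unix system.

```python
import os
import subprocess
import sys

# Hypothetical stand-in for "cargo": it reports its own PID/PGID, then starts
# one child (a stand-in for "rustc") that reports the same two numbers.
LEADER_SCRIPT = r"""
import os, subprocess, sys
print("leader", os.getpid(), os.getpgid(0), flush=True)
subprocess.run([sys.executable, "-c",
                "import os; print('child', os.getpid(), os.getpgid(0))"])
"""


def show_group_inheritance():
    """Return {"leader": (pid, pgid), "child": (pid, pgid)}."""
    proc = subprocess.Popen(
        [sys.executable, "-c", LEADER_SCRIPT],
        stdout=subprocess.PIPE,
        text=True,
        # os.setpgrp() runs in the new process before exec and makes it the
        # leader of a new process group, so its PGID becomes its own PID.
        preexec_fn=os.setpgrp,
    )
    out, _ = proc.communicate()
    rows = {}
    for line in out.splitlines():
        name, pid, pgid = line.split()
        rows[name] = (int(pid), int(pgid))
    return rows


if __name__ == "__main__":
    for name, (pid, pgid) in show_group_inheritance().items():
        print(f"{name:>6}  pid={pid}  pgid={pgid}")
```

The leader's PGID should equal its own PID, and the child's PGID should equal the leader's PID, just like cargo and rustc above. On Python 3.11 and later, `process_group=0` can be passed to `Popen` instead of `preexec_fn`.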
Why do process groups exist?
The main purpose of a process group is to be able to send a signal to all of its processes in one operation. In the above example, if you press Ctrl-C while cargo is running, SIGINT is sent to every process in process group 520283: the cargo process as well as the four child rustc processes.
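Programs can do the same thing themselves with `killpg`. The sketch below is an illustration under assumptions, not how any shell is implemented: a hypothetical leader script starts `sleep 30` in its own group, and a single `os.killpg` call interrupts both processes. The helper names are invented for this example.

```python
import os
import signal
import subprocess
import sys
import time

# Hypothetical leader: starts a long-running child in the same process group,
# announces that it is ready, then waits for the child.
LEADER_SCRIPT = r"""
import subprocess
child = subprocess.Popen(["sleep", "30"])
print("ready", flush=True)
child.wait()
"""


def _become_group_leader():
    # Runs in the new process before exec.
    os.setpgrp()
    # Make sure SIGINT is not inherited as "ignored" from our own parent.
    signal.signal(signal.SIGINT, signal.SIG_DFL)


def interrupt_whole_group():
    """Send SIGINT to a process group; return seconds until all members exit."""
    start = time.monotonic()
    leader = subprocess.Popen(
        [sys.executable, "-c", LEADER_SCRIPT],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # hide the leader's KeyboardInterrupt trace
        text=True,
        preexec_fn=_become_group_leader,
    )
    leader.stdout.readline()  # wait for "ready": the sleep child now exists
    # One call signals every process whose PGID equals leader.pid.
    os.killpg(leader.pid, signal.SIGINT)
    # The leader and the sleep child both hold the write end of this pipe, so
    # EOF arrives only once both of them have exited.
    leader.stdout.read()
    leader.wait()
    return time.monotonic() - start


if __name__ == "__main__":
    print(f"group exited after {interrupt_whole_group():.2f}s")
```

If only the leader had received the signal, the read would block until `sleep 30` finished on its own.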
For nextest, you'd expect this to mean that when you hit Ctrl-C in the terminal, all child tests terminate, and nextest exits right away rather than having to wait for tests to finish running. Similarly, when you hit Ctrl-Z in the terminal, you'd expect that nextest as well as all child tests receive SIGTSTP and are suspended.
However, that's not what happens. In reality:
- Nextest creates a separate process group for each test. (I have a blog post coming soon on why nextest does this. The tl;dr is that if a test times out, nextest needs to kill the test process as well as any children it starts.)
- Also, each process can only be part of one process group. In other words, process groups don't form a tree.
Here's a snapshot of ps fo pid,pgid,comm while nextest is running:
    PID    PGID COMMAND
3931685 3931685 zsh
 689343  689343  \_ cargo-nextest
 690696  690696      \_ process_kill_on
 690711  690696      |   \_ bash
 690730  690696      |   \_ sleep
 691559  691559      \_ rt_threaded-fbe
 691627  691627      \_ rt_threaded-fbe
 696315  696315      \_ time_sleep-6107
 696905  696905      \_ tokio-4f33ad8bb
In this example, zsh has assigned cargo-nextest a new PGID (689343). In turn, cargo-nextest has assigned each test its own PGID (e.g. 690696). The individual test PGIDs are unrelated to the main nextest PGID.
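The consequence of unrelated PGIDs can be reproduced in a few lines. This is a sketch under assumptions, not nextest's code: a hypothetical "runner" script puts its one "test" (`sleep 30`) in a separate process group, and we then signal only the runner's group. SIGINT is used here because it is easy to observe; all names are invented for the example.

```python
import os
import select
import signal
import subprocess
import sys

# Hypothetical runner: like nextest, it places its "test" in a process group
# of its own, then prints the test's PID and waits for it.
RUNNER_SCRIPT = r"""
import os, subprocess
test = subprocess.Popen(["sleep", "30"], preexec_fn=os.setpgrp)
print(test.pid, flush=True)
test.wait()
"""


def _become_group_leader():
    os.setpgrp()
    signal.signal(signal.SIGINT, signal.SIG_DFL)


def signal_runner_group_only():
    """Return (runner_pgid, test_pgid, test_survived)."""
    runner = subprocess.Popen(
        [sys.executable, "-c", RUNNER_SCRIPT],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
        preexec_fn=_become_group_leader,
    )
    test_pid = int(runner.stdout.readline())
    test_pgid = os.getpgid(test_pid)

    # Signal the runner's group, as the terminal would on Ctrl-C.
    os.killpg(runner.pid, signal.SIGINT)
    runner.wait()

    # The test still holds the write end of the pipe. If it had died too, the
    # pipe would be at EOF and select() would report it as readable.
    readable, _, _ = select.select([runner.stdout], [], [], 0.5)
    test_survived = not readable

    os.killpg(test_pgid, signal.SIGKILL)  # clean up the leftover test
    runner.stdout.close()
    return runner.pid, test_pgid, test_survived


if __name__ == "__main__":
    runner_pgid, test_pgid, survived = signal_runner_group_only()
    print(f"runner pgid={runner_pgid} test pgid={test_pgid} survived={survived}")
```

The runner exits, but the test keeps running because its PGID is not the one that was signaled.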
The overall result of this is that only the nextest process receives the SIGTSTP from Ctrl-Z; none of the child processes do. But it also suggests a way forward: when cargo-nextest receives SIGTSTP, it simply needs to forward the signal to each child process group.
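As a rough illustration of what forwarding could look like (a Python sketch under assumptions, not nextest's actual implementation), a parent can keep a list of the PGIDs it created and signal each one. `forward_to_groups` and `demo_forward_sigtstp` are hypothetical names.

```python
import os
import signal
import subprocess


def forward_to_groups(signum, pgids):
    """Send signum to every process group in pgids."""
    for pgid in pgids:
        try:
            os.killpg(pgid, signum)
        except ProcessLookupError:
            pass  # that group has already exited


def _new_group_with_default_sigtstp():
    os.setpgrp()
    # Make sure SIGTSTP is not inherited as "ignored" from our own parent.
    signal.signal(signal.SIGTSTP, signal.SIG_DFL)


def demo_forward_sigtstp():
    """Stop a child that lives in its own process group; return True if it stopped."""
    test = subprocess.Popen(["sleep", "30"],
                            preexec_fn=_new_group_with_default_sigtstp)
    forward_to_groups(signal.SIGTSTP, [test.pid])

    # WUNTRACED makes waitpid() also report children that have stopped.
    _, status = os.waitpid(test.pid, os.WUNTRACED)
    stopped = os.WIFSTOPPED(status) and os.WSTOPSIG(status) == signal.SIGTSTP

    # Resume the child (what a later SIGCONT would do), then clean up.
    forward_to_groups(signal.SIGCONT, [test.pid])
    test.kill()
    test.wait()
    return stopped


if __name__ == "__main__":
    print("child stopped:", demo_forward_sigtstp())
```

A real runner would call something like `forward_to_groups` from its own SIGTSTP handling and then suspend itself as well. This sketch only shows the forwarding step.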
Did we shoot ourselves in the foot?
The change to put each test in its own process group was made in July, well after nextest first came out. At first glance, it seems like we've unnecessarily made things harder on ourselves by assigning each test its own process group.
That's not quite the case! Even setting aside the benefits of putting each test in its own process group, per-test process groups add very little complexity on top of what's required anyway, for reasons that we'll go into in future posts.
In the next part of this series, we're going to see what actually happens if you press Ctrl-Z while nextest is running.