After configuring Linux to run the OOM killer early, I discovered another problem with it: every time a compilation started from a shell was killed by the OOM killer, the whole terminal disappeared. As I usually keep the terminal with the compilation on another workspace, the terminal vanished silently, and later I could not find it anywhere, with no sign of what had happened to the compilation.
I tried running the compilation in tmux, and this resulted in tmux getting killed.
At least the terminal remained with this output:
$ tmux
[exited]
$
Both the outer shell and the terminal survived, but tmux got killed. Killing tmux makes no sense here, as it does not use much memory. I looked at the dmesg output and found this:
[35105.264718] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/user@1000.service/app.slice/tmux-spawn-4673b348-e0da-4daa-9ce9-52d3d472133e.scope,task=rustc,pid=251538,uid=1000
[35105.264752] Out of memory: Killed process 251538 (rustc) total-vm:3776868kB, anon-rss:2865196kB, file-rss:4480kB, shmem-rss:0kB, UID:1000 pgtables:6608kB oom_score_adj:200
While previously the process killed was part of /user.slice/user-1000.slice/user@1000.service/app.slice/app-niri-alacritty-106025.scope,
this time it was in /user.slice/user-1000.slice/user@1000.service/app.slice/tmux-spawn-4673b348-e0da-4daa-9ce9-52d3d472133e.scope.
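Picking the scope out of such a long line by eye is tedious, so here is a minimal sketch that extracts it. The function name oom_scope is made up for illustration, and it assumes the oom-kill line has the same comma-separated layout as the one above:

```shell
# Hypothetical helper: read kernel log lines on stdin and print the last
# path component of task_memcg= for every oom-kill line found.
oom_scope() {
    # First sed keeps only the cgroup path of oom-kill lines,
    # second sed strips everything up to the final slash.
    sed -n 's/.*oom-kill:.*task_memcg=\([^,]*\),.*/\1/p' | sed 's|.*/||'
}
```

It can be fed from the kernel log, e.g. dmesg | oom_scope (reading dmesg may require root, depending on the kernel.dmesg_restrict setting).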
These "scopes" are apparently created by niri and by tmux. I have found the following comment in Niri 25.11:
// When running as a systemd session, we want to put children into their own transient
// scopes in order to separate them from the niri process. This is helpful for
// example to prevent the OOM killer from taking down niri together with a
// misbehaving client.
//
// Putting a child into a scope is done by calling systemd's StartTransientUnit D-Bus method
// with a PID. Unfortunately, there seems to be a race in systemd where if the child exits
// at just the right time, the transient unit will be created but empty, so it will
// linger around forever.
//
// To prevent this, we'll use our double-fork (done for a separate reason) to help. In our
// intermediate child we will send back the grandchild PID, and in niri we will create a
// transient scope with both our intermediate child and the grandchild PIDs set. Only then
// we will signal our intermediate child to exit. This way, even if the grandchild
// exits quickly, a non-empty scope will be created (with just our intermediate
// child), then cleaned up when our intermediate child exits.
The code below the comment calls StartTransientUnit, which is part of the systemd D-Bus API.
tmux 3.6a also calls StartTransientUnit, in its systemd_move_to_new_cgroup() function.
Above the call to this function there is a comment saying "Move the child process into a new cgroup for systemd-oomd isolation.".
This seemingly prevented tmux itself from being killed, but it does not prevent bash from being terminated, and as tmux had only one pane, tmux exited afterwards.
I was expecting the OOM killer to kill a single process, but apparently the whole "scope" was killed.
As bash does not call StartTransientUnit over D-Bus, a process started from bash stays in the same scope as bash, so when the OOM killer kills that process, bash is killed together with it.
The explanation can be found in the systemd.service man page.
It describes the OOMPolicy= option, which applies to each unit.
It can be inspected for each unit with a command such as systemctl --user show app-niri-alacritty-159385.scope -p OOMPolicy, which prints OOMPolicy=stop.
This value comes from the manager's DefaultOOMPolicy= setting, which in turn defaults to stop. It means that when the OOM killer kills a process in a unit, the whole unit (in this case a scope) is terminated.
One workaround is to start a new scope manually, e.g. by running systemd-run --user --scope cargo build.
Maybe the idea behind systemd and cgroups is that app launchers and shells should be responsible for creating scopes,
but so far bash does not have this level of systemd integration,
and I don't want to prefix every command with systemd-run by hand.
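If typing the prefix is the only annoyance, the workaround can be shortened with a small shell function. This is only a sketch: the name in_scope and the IN_SCOPE_DISABLE switch are made up for illustration, and it falls back to running the command directly when systemd-run is not installed.

```shell
# Hypothetical helper: run a command in its own transient user scope, so that
# an OOM kill inside it does not take the calling shell's scope down with it.
in_scope() {
    # Run the command directly if the (made-up) IN_SCOPE_DISABLE variable is
    # set or systemd-run is not available.
    if [ -n "${IN_SCOPE_DISABLE:-}" ] || ! command -v systemd-run >/dev/null 2>&1; then
        "$@"
    else
        systemd-run --user --scope --quiet -- "$@"
    fi
}
```

With this in ~/.bashrc, the example above becomes in_scope cargo build. It still has to be typed for every command, so it does not replace a proper fix.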
I fixed the problem by creating a file ~/.config/systemd/user.conf.d/oom.conf with the following contents:
[Manager]
DefaultOOMPolicy=continue
After running systemctl --user daemon-reload, the OOMPolicy of all terminal emulator scopes, even the ones already started, changed:
$ systemctl --user show app-niri-alacritty-159385.scope -p OOMPolicy
OOMPolicy=stop
$ systemctl --user daemon-reload
$ systemctl --user show app-niri-alacritty-159385.scope -p OOMPolicy
OOMPolicy=continue
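To check every scope at once rather than a single one, the same systemctl show call can be run in a loop. A minimal sketch: the function name show_oom_policy and the SYSTEMCTL override variable are made up for illustration (the override only exists so the loop can be tried without a running systemd).

```shell
# Hypothetical helper: read unit names from stdin (first column of each line)
# and print "<unit> OOMPolicy=<value>" for each of them.
show_oom_policy() {
    while read -r unit _rest; do
        # Skip empty lines.
        [ -n "$unit" ] || continue
        printf '%s %s\n' "$unit" \
            "$(${SYSTEMCTL:-systemctl} --user show "$unit" -p OOMPolicy)"
    done
}
```

It can be fed with the list of scope units of the user session: systemctl --user list-units --type=scope --plain --no-legend | show_oom_policy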
I then ran a heavy compilation while free memory was low. It ended with sccache: Compile terminated by signal 9
printed in the terminal, and neither the shell nor the terminal was killed.