Problem:
The SImA_slurmd.1 container fails to start when the Image Artist stack is deployed after the upgrade to ImA 1.5. As a result, batch analysis fails.
Error as shown in the SImA_slurmd service log:
2025-12-04T09:36:36+01:00 slurmd: error: Could not create scope directory /sys/fs/cgroup/docker_limit.slice/system.slice/slurmstepd.scope: No such file or directory
2025-12-04T09:36:36+01:00 slurmd: error: Couldn't load specified plugin name for cgroup/v2: Plugin init() callback failed
2025-12-04T09:36:36+01:00 slurmd: error: cannot create cgroup context for cgroup/v2
2025-12-04T09:36:36+01:00 slurmd: error: Unable to initialize cgroup plugin
2025-12-04T09:36:36+01:00 slurmd: error: slurmd initialization failed
Cause:
In Image Artist 1.5, only cgroup v2 is supported. Setting Docker’s cgroup-parent to a systemd slice (e.g. "docker_limit.slice") can break the initialization of Slurm’s cgroup/v2 plugin if the node’s systemd/cgroup setup is not fully aligned with Slurm’s expectations. Assigning Docker to a custom slice changes where containers live under /sys/fs/cgroup. If systemd does not delegate or initialize that slice correctly, Slurm fails while trying to create its own scopes and slices, as shown by the "Could not create scope directory" error above.
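To narrow down the cause, it can help to confirm that the node uses the unified cgroup v2 hierarchy and to see whether the slice directory named in the error exists. The following is a minimal sketch, not part of the product: the helper names cgroup_mode and slice_exists are hypothetical, and it assumes the standard mount point /sys/fs/cgroup, where a cgroup.controllers file at the root indicates cgroup v2.

```shell
# Hypothetical helper: print "v2" if the given cgroup root is a unified
# cgroup v2 hierarchy, otherwise "v1-or-hybrid".
# Assumption: the root defaults to the standard mount point /sys/fs/cgroup.
cgroup_mode() {
    root="${1:-/sys/fs/cgroup}"
    # The cgroup.controllers file exists at the root only on cgroup v2.
    if [ -f "$root/cgroup.controllers" ]; then
        echo "v2"
    else
        echo "v1-or-hybrid"
    fi
}

# Hypothetical helper: succeed if the named slice directory exists
# under the cgroup root (for example docker_limit.slice).
slice_exists() {
    slice="$1"
    root="${2:-/sys/fs/cgroup}"
    [ -d "$root/$slice" ]
}
```

For example, running cgroup_mode with no arguments on the affected node, followed by slice_exists docker_limit.slice, shows whether the hierarchy is v2 and whether the parent slice from the error message is present.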
Solution:
Check the contents of the /etc/docker/daemon.json file, for example:
$ cat /etc/docker/daemon.json
{
"default-cgroupns-mode": "host",
"cgroup-parent": "docker_limit.slice"
}
If present, remove the following line from the /etc/docker/daemon.json file, and also remove the trailing comma from the preceding line so that the file remains valid JSON:
"cgroup-parent": "docker_limit.slice"
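Editing the file by hand can leave a stray comma behind. As an alternative, the sketch below prints a cleaned copy of the configuration without the cgroup-parent key and leaves the original file untouched. The function name strip_cgroup_parent is hypothetical, and the sketch assumes python3 is available on the node.

```shell
# Hypothetical helper: print the given daemon.json without the
# "cgroup-parent" key. The input file is not modified.
# Assumption: python3 is installed on the node.
strip_cgroup_parent() {
    python3 - "$1" <<'PY'
import json
import sys

# Load the existing Docker daemon configuration.
with open(sys.argv[1]) as fh:
    config = json.load(fh)

# Drop the key if present; all other settings are kept as they are.
config.pop("cgroup-parent", None)

# Emit valid JSON, so no trailing comma can be left behind.
print(json.dumps(config, indent=2))
PY
}
```

One way to use it is to redirect the output of strip_cgroup_parent /etc/docker/daemon.json to a temporary file, review the result, and then copy it over the original after taking a backup.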
- undeploy the ImA stack
- stop the docker service:
$ systemctl stop docker.service
- start the docker service:
$ systemctl start docker.service
- check the status of the docker service to make sure it's running:
$ systemctl status docker.service
- deploy the ImA stack