Originally from the User Slack
@Vitaly_Ivanov: Hi! I have a question about cpuset configuration. How can I disable or modify it? ScyllaDB fails to start because the scylla-operator assigned cpuset 0-31, but my system requires it to be set to 0-11.
@Maciej_Zimnoch: Make sure to follow this guide: https://operator.docs.scylladb.com/stable/architecture/tuning.html
Without the static cpuManagerPolicy and the Guaranteed QoS class, the kubelet cannot assign exclusive CPUs, so you end up with CPU shares across the entire CPU pool.
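For context, a Pod gets the Guaranteed QoS class when every container sets CPU and memory requests equal to limits, with whole (integer) CPU values; combined with the kubelet's static cpuManagerPolicy, those CPUs are then assigned exclusively. A minimal, illustrative resources snippet (the values below are examples, not taken from this thread) might look like:

# Illustrative only: requests == limits with integer CPUs gives the Pod Guaranteed QoS;
# the kubelet must also run with cpuManagerPolicy: static for exclusive CPU assignment.
resources:
  requests:
    cpu: 2
    memory: 8Gi
  limits:
    cpu: 2
    memory: 8Gi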
@Vitaly_Ivanov: Thank you! Right now, I'm not interested in tuning performance. I just want ScyllaDB to start normally. My processors don't support cpuset 0-31:
ERROR 2025-04-29 11:05:01,906 seastar - Bad value for --cpuset: 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 not allowed. Shutting down.
They support cpuset 0-11 only. I didn't change any special settings, just the defaults. I really wish I could disable cpuset completely so ScyllaDB would just start working.
I found the getCPUsAllowedList() function call at: https://github.com/scylladb/scylla-operator/blob/6c26a631e1a6409dbf59f4cb2a72a56d6cd22882/pkg/sidecar/config/config.go#L217
cpusAllowed, err := getCPUsAllowedList("/proc/1/status")
This function works similarly to this Bash command:
# cat /proc/1/status | grep Cpus_allowed
Cpus_allowed: ffffffff
Cpus_allowed_list: 0-31
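For illustration, a minimal Go sketch of that kind of lookup (not the operator's actual implementation, just the same idea: read /proc/1/status and return the value of the Cpus_allowed_list field) could look like this:

package main

import (
	"fmt"
	"os"
	"strings"
)

// getCPUsAllowedList returns the value of the Cpus_allowed_list field
// from a /proc/<pid>/status file, e.g. "0-31".
func getCPUsAllowedList(statusPath string) (string, error) {
	data, err := os.ReadFile(statusPath)
	if err != nil {
		return "", err
	}
	for _, line := range strings.Split(string(data), "\n") {
		if strings.HasPrefix(line, "Cpus_allowed_list:") {
			return strings.TrimSpace(strings.TrimPrefix(line, "Cpus_allowed_list:")), nil
		}
	}
	return "", fmt.Errorf("Cpus_allowed_list not found in %s", statusPath)
}

func main() {
	cpusAllowed, err := getCPUsAllowedList("/proc/1/status")
	if err != nil {
		panic(err)
	}
	fmt.Println(cpusAllowed) // e.g. "0-31"
}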
In my case, inside the ScyllaDB pod, it returns 0-31, but my system has only 6 cores (so the maximum should be 0-11). It turns out that the default cpuset detection is incorrect.
@Maciej_Zimnoch: Well, the kernel sees 32 vCPUs, so I don't think it's wrong. Double check that your Pod lands where you expect.
@Vitaly_Ivanov: After reviewing the scylla-operator code, I found that downgrading it to v1.14 is the only viable option, as it allows adding the cpuset: false configuration to the Kubernetes manifest: https://github.com/scylladb/scylla-operator/blob/ddcc1582499b346e01d9750ef30b7f3c75114960/pkg/sidecar/config/config.go#L237C5-L237C24
@Patrick_Bossman: @Vitaly_Ivanov Maciej is the SME in this space. I recommend listening to him
@Vitaly_Ivanov: Nice to meet you, @Patrick_Bossman. Thanks!
> Double check if your Pod lands where you expect
Done. I have a 4-node K3s cluster and would like to deploy ScyllaDB on all 4 nodes.
I got this error message in the Scylla container logs:
Bad value for --cpuset: 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 not allowed. Shutting down.
Manually setting cpuset=0-11 in /etc/scylla.d/cpuset.conf resolved the startup issue (see the snippet below). So I decided to downgrade to v1.14. Now it starts successfully with the cpuset: false setting.
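For reference, /etc/scylla.d/cpuset.conf normally holds a single CPUSET line that Scylla's startup scripts pass to the server. Assuming that standard format (the range shown is just this cluster's value), the manual workaround looks roughly like:

# /etc/scylla.d/cpuset.conf
CPUSET="--cpuset 0-11"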
My problem is solved, but I think:
• a proper cpuset option is important for starting /usr/bin/scylla
• there is some unknown issue with cpuset detection
I can revert the operator to 1.16.2 for debugging.
The container started with this command:
/mnt/shared/scylla-operator sidecar \
--feature-gates=AllAlpha=false,AllBeta=false,AutomaticTLSCertificates=true \
--nodes-broadcast-address-type=ServiceClusterIP \
--clients-broadcast-address-type=ServiceClusterIP \
--service-name=scylladb-cluster-eu-eu-0 \
--cpu-count=2 \
--loglevel=2 \
-- --developer-mode=1
and it started the Python script:
/docker-entrypoint.py \
--developer-mode=1 \
--listen-address=0.0.0.0 --seeds=10.43.205.237 \
--overprovisioned=1 --smp=2 \
--prometheus-address=0.0.0.0 --broadcast-address=10.43.205.237 --broadcast-rpc-address=10.43.205.237 \
--cpuset=0-31
@Maciej_Zimnoch: Looks like K3s is doing some nasty stuff: the kernel allows the process to run on all vCPUs while K3s restricts it to only a few. To overcome this, I suggest following the guide I linked at the beginning; that way the Scylla Pods will only use the resources they receive exclusively (a subset of vCPUs) and you shouldn't hit this issue.
Note that K3s is not a supported platform.
@Vitaly_Ivanov: Thank you for your help!
Outside of K3s, under systemd, I see the same CPU core mismatch:
cat /proc/1/status | grep Cpus_allowed
Cpus_allowed: ffffffff
Cpus_allowed_list: 0-31
cat /proc/cpuinfo | grep 'core id'
core id : 0
core id : 1
core id : 2
core id : 4
core id : 5
core id : 6
core id : 0
core id : 1
core id : 2
core id : 4
core id : 5
core id : 6
That is 12 logical CPUs (6 cores with Hyper-Threading), so 12 and 32 don't match.
I've checked a bunch of our servers and it happened only on AMD Ryzen 5 3600 6-Core Processor servers. No idea what's wrong with this CPU model.
@Patrick_Bossman: @vladzcloudius
@vladzcloudius: Hi. The output above is absolutely normal AFAICT.
The content of /proc/cpuinfo has the actual information about the CPU hardware.
The value of Cpus_allowed from /proc/[pid]/status, on the other hand, tells you on which CPUs the corresponding process (init in your case) is allowed to run:
https://man7.org/linux/man-pages/man5/proc_pid_status.5.html
Cpus_allowed
Hexadecimal mask of CPUs on which this process may
run (since Linux 2.6.24, see cpuset(7)).
Cpus_allowed_list
Same as previous, but in "list format" (since Linux
2.6.26, see cpuset(7)).
And I couldn't find any requirement anywhere for the latter mask to include only the CPUs that are actually present.
This means that the user should always use a bitwise AND between the masks from cpuinfo and Cpus_allowed in order to get the mask of the physically present CPUs on which the corresponding process is allowed to run.
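If one wanted to apply that rule, a rough Go sketch (illustrative only, not what scylla-operator does today) could read Cpus_allowed_list from /proc/1/status, read the online CPU list from /sys/devices/system/cpu/online as a convenient list-format stand-in for the physically present CPUs from cpuinfo, and intersect the two before building --cpuset:

package main

import (
	"fmt"
	"os"
	"sort"
	"strconv"
	"strings"
)

// parseCPUList parses the kernel "list format" (e.g. "0-11" or "0-3,8-11") into a set of CPU ids.
func parseCPUList(s string) (map[int]bool, error) {
	cpus := map[int]bool{}
	for _, part := range strings.Split(strings.TrimSpace(s), ",") {
		if part == "" {
			continue
		}
		bounds := strings.SplitN(part, "-", 2)
		lo, err := strconv.Atoi(bounds[0])
		if err != nil {
			return nil, err
		}
		hi := lo
		if len(bounds) == 2 {
			if hi, err = strconv.Atoi(bounds[1]); err != nil {
				return nil, err
			}
		}
		for cpu := lo; cpu <= hi; cpu++ {
			cpus[cpu] = true
		}
	}
	return cpus, nil
}

// statusField extracts a field such as Cpus_allowed_list from a /proc/<pid>/status file.
func statusField(path, field string) (string, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return "", err
	}
	for _, line := range strings.Split(string(data), "\n") {
		if strings.HasPrefix(line, field+":") {
			return strings.TrimSpace(strings.TrimPrefix(line, field+":")), nil
		}
	}
	return "", fmt.Errorf("%s not found in %s", field, path)
}

func main() {
	// CPUs the init process is allowed to run on, e.g. "0-31".
	allowedStr, err := statusField("/proc/1/status", "Cpus_allowed_list")
	if err != nil {
		panic(err)
	}
	// CPUs that are actually online on this machine, e.g. "0-11".
	onlineRaw, err := os.ReadFile("/sys/devices/system/cpu/online")
	if err != nil {
		panic(err)
	}

	allowed, err := parseCPUList(allowedStr)
	if err != nil {
		panic(err)
	}
	online, err := parseCPUList(string(onlineRaw))
	if err != nil {
		panic(err)
	}

	// Effective cpuset = allowed AND online.
	var effective []int
	for cpu := range allowed {
		if online[cpu] {
			effective = append(effective, cpu)
		}
	}
	sort.Ints(effective)
	fmt.Println(effective) // e.g. [0 1 2 ... 11]
}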
@Patrick_Bossman: @Maciej_Zimnoch ^^
@Vitaly_Ivanov: What does this mean for scylla-operator? Should it use a bitwise AND between the cpuinfo and Cpus_allowed masks to correctly set --cpuset?
@Maciej_Zimnoch: I would need to check what those constraints look like in a container environment; maybe it's standardized, I'm not sure. So far, all the major cloud providers I've checked fill cpus_allowed with the CPUs the container is actually able to run on: all of them when using CFS shares, or a subset when running on exclusively allocated ones.