Originally from the User Slack
@Vitaly_Ivanov: Hi! I have a question about cpuset configuration. How can I disable or modify it? ScyllaDB fails to start because the scylla-operator assigned cpuset 0-31, but my system requires it to be set to 0-11.
@Maciej_Zimnoch: Make sure to follow this guide: https://operator.docs.scylladb.com/stable/architecture/tuning.html
Without the static cpuManagerPolicy and the Guaranteed QoS class, the kubelet cannot assign exclusive CPUs, so you end up with CPU shares across the entire CPU pool.
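For context, a Pod gets the Guaranteed QoS class when every container sets CPU and memory requests equal to limits, with whole (integer) CPU values; combined with the kubelet's static cpuManagerPolicy, those CPUs are then assigned exclusively. A minimal, illustrative resources snippet (the values below are examples, not taken from this thread) might look like:

# Illustrative only: requests == limits with integer CPUs gives the Pod Guaranteed QoS;
# the kubelet must also run with cpuManagerPolicy: static for exclusive CPU assignment.
resources:
  requests:
    cpu: 2
    memory: 8Gi
  limits:
    cpu: 2
    memory: 8Gi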
@Vitaly_Ivanov: Thank you! Right now, I'm not interested in tuning performance. I just want ScyllaDB to start normally. My processors don't support cpuset 0-31:
ERROR 2025-04-29 11:05:01,906 seastar - Bad value for --cpuset: 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 not allowed. Shutting down.
They support cpuset 0-11 only. I didn't change any special settings, just the defaults. I really wish I could disable cpuset completely so ScyllaDB would just start working.
I found the getCPUsAllowedList() function call at: https://github.com/scylladb/scylla-operator/blob/6c26a631e1a6409dbf59f4cb2a72a56d6cd22882/pkg/sidecar/config/config.go#L217
cpusAllowed, err := getCPUsAllowedList("/proc/1/status")
This function works similarly to this Bash command:
# cat /proc/1/status | grep Cpus_allowed
Cpus_allowed: ffffffff
Cpus_allowed_list: 0-31
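For illustration, a minimal Go sketch of that kind of lookup (not the operator's actual implementation, just the same idea: read /proc/1/status and return the value of the Cpus_allowed_list field) could look like this:

package main

import (
	"fmt"
	"os"
	"strings"
)

// getCPUsAllowedList returns the value of the Cpus_allowed_list field
// from a /proc/<pid>/status file, e.g. "0-31".
func getCPUsAllowedList(statusPath string) (string, error) {
	data, err := os.ReadFile(statusPath)
	if err != nil {
		return "", err
	}
	for _, line := range strings.Split(string(data), "\n") {
		if strings.HasPrefix(line, "Cpus_allowed_list:") {
			return strings.TrimSpace(strings.TrimPrefix(line, "Cpus_allowed_list:")), nil
		}
	}
	return "", fmt.Errorf("Cpus_allowed_list not found in %s", statusPath)
}

func main() {
	cpusAllowed, err := getCPUsAllowedList("/proc/1/status")
	if err != nil {
		panic(err)
	}
	fmt.Println(cpusAllowed) // e.g. "0-31"
}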
In my case, inside the ScyllaDB pod, it returns 0-31, but my system has only 6 cores (so the maximum should be 0-11). It turns out that the default cpuset detection is incorrect.
@Maciej_Zimnoch: Well, the kernel sees 32 vCPUs, so I don't think it's wrong. Double check that your Pod lands where you expect.
@Vitaly_Ivanov: After reviewing the scylla-operator code, I found that downgrading it to v1.14 is the only viable option, as it allows adding the cpuset: false configuration to the Kubernetes manifest: https://github.com/scylladb/scylla-operator/blob/ddcc1582499b346e01d9750ef30b7f3c75114960/pkg/sidecar/config/config.go#L237C5-L237C24
@Patrick_Bossman: @Vitaly_Ivanov Maciej is the SME in this space. I recommend listening to him
@Vitaly_Ivanov: Nice to meet you, @Patrick_Bossman. Thanks!
> Double check if your Pod lands where you expect
Done. I have a 4-node K3s cluster and would like to deploy ScyllaDB on all 4 nodes.
I got this error message in the Scylla container logs:
Bad value for --cpuset: 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 not allowed. Shutting down.
Manually setting cpuset=0-11 in /etc/scylla.d/cpuset.conf resolved the startup issue (see the snippet below). So I decided to downgrade to v1.14. Now it starts successfully with the cpuset: false setting.
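For reference, /etc/scylla.d/cpuset.conf normally holds a single CPUSET line that Scylla's startup scripts pass to the server. Assuming that standard format (the range shown is just this cluster's value), the manual workaround looks roughly like:

# /etc/scylla.d/cpuset.conf
CPUSET="--cpuset 0-11"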
My problem is solved, but I think:
• a proper cpuset option is important for starting /usr/bin/scylla
• there is some unknown issue with cpuset detection
I can revert the operator to 1.16.2 for debugging.
The container started with this command:
/mnt/shared/scylla-operator sidecar \
--feature-gates=AllAlpha=false,AllBeta=false,AutomaticTLSCertificates=true \
--nodes-broadcast-address-type=ServiceClusterIP \
--clients-broadcast-address-type=ServiceClusterIP \
--service-name=scylladb-cluster-eu-eu-0 \
--cpu-count=2 \
--loglevel=2 \
-- --developer-mode=1
and it started the Python script:
/docker-entrypoint.py \
--developer-mode=1 \
--listen-address=0.0.0.0 --seeds=10.43.205.237 \
--overprovisioned=1 --smp=2 \
--prometheus-address=0.0.0.0 --broadcast-address=10.43.205.237 --broadcast-rpc-address=10.43.205.237 \
--cpuset=0-31
@Maciej_Zimnoch: Looks like K3s is doing some nasty stuff: the kernel allows the process to run on all vCPUs while K3s restricts it to only a few. To overcome this, I suggest following the guide I linked at the beginning; that way the Scylla Pods will only use the resources they receive exclusively (a subset of vCPUs) and you shouldn't hit this issue.
Note that K3s is not a supported platform.
@Vitaly_Ivanov: Thank you for your help!
Outside of K3s, under systemd, I see the same CPU core mismatch:
cat /proc/1/status | grep Cpus_allowed
Cpus_allowed: ffffffff
Cpus_allowed_list: 0-31
cat /proc/cpuinfo | grep 'core id'
core id : 0
core id : 1
core id : 2
core id : 4
core id : 5
core id : 6
core id : 0
core id : 1
core id : 2
core id : 4
core id : 5
core id : 6
That is 12 logical CPUs (6 cores with Hyper-Threading), so 12 and 32 don't match.
I've checked a bunch of our servers and it happened only on AMD Ryzen 5 3600 6-Core Processor servers. No idea what's wrong with this CPU model.
@Patrick_Bossman: @vladzcloudius
@vladzcloudius: Hi. The output above is absolutely normal AFAICT.
The content of /proc/cpuinfo has the actual information about the CPU hardware.
The value of Cpus_allowed from /proc/[pid]/status, on the other hand, tells you on which CPUs the corresponding process (init in your case) is allowed to run:
https://man7.org/linux/man-pages/man5/proc_pid_status.5.html
Cpus_allowed
Hexadecimal mask of CPUs on which this process may
run (since Linux 2.6.24, see cpuset(7)).
Cpus_allowed_list
Same as previous, but in "list format" (since Linux
2.6.26, see cpuset(7)).
And I couldn't find any requirement anywhere for the latter mask to include only the CPUs that are actually present.
This means that the user should always use a bitwise AND between the masks from cpuinfo and Cpus_allowed in order to get the mask of the physically present CPUs on which the corresponding process is allowed to run.
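If one wanted to apply that rule, a rough Go sketch (illustrative only, not what scylla-operator does today) could read Cpus_allowed_list from /proc/1/status, read the online CPU list from /sys/devices/system/cpu/online as a convenient list-format stand-in for the physically present CPUs from cpuinfo, and intersect the two before building --cpuset:

package main

import (
	"fmt"
	"os"
	"sort"
	"strconv"
	"strings"
)

// parseCPUList parses the kernel "list format" (e.g. "0-11" or "0-3,8-11") into a set of CPU ids.
func parseCPUList(s string) (map[int]bool, error) {
	cpus := map[int]bool{}
	for _, part := range strings.Split(strings.TrimSpace(s), ",") {
		if part == "" {
			continue
		}
		bounds := strings.SplitN(part, "-", 2)
		lo, err := strconv.Atoi(bounds[0])
		if err != nil {
			return nil, err
		}
		hi := lo
		if len(bounds) == 2 {
			if hi, err = strconv.Atoi(bounds[1]); err != nil {
				return nil, err
			}
		}
		for cpu := lo; cpu <= hi; cpu++ {
			cpus[cpu] = true
		}
	}
	return cpus, nil
}

// statusField extracts a field such as Cpus_allowed_list from a /proc/<pid>/status file.
func statusField(path, field string) (string, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return "", err
	}
	for _, line := range strings.Split(string(data), "\n") {
		if strings.HasPrefix(line, field+":") {
			return strings.TrimSpace(strings.TrimPrefix(line, field+":")), nil
		}
	}
	return "", fmt.Errorf("%s not found in %s", field, path)
}

func main() {
	// CPUs the init process is allowed to run on, e.g. "0-31".
	allowedStr, err := statusField("/proc/1/status", "Cpus_allowed_list")
	if err != nil {
		panic(err)
	}
	// CPUs that are actually online on this machine, e.g. "0-11".
	onlineRaw, err := os.ReadFile("/sys/devices/system/cpu/online")
	if err != nil {
		panic(err)
	}

	allowed, err := parseCPUList(allowedStr)
	if err != nil {
		panic(err)
	}
	online, err := parseCPUList(string(onlineRaw))
	if err != nil {
		panic(err)
	}

	// Effective cpuset = allowed AND online.
	var effective []int
	for cpu := range allowed {
		if online[cpu] {
			effective = append(effective, cpu)
		}
	}
	sort.Ints(effective)
	fmt.Println(effective) // e.g. [0 1 2 ... 11]
}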
@Patrick_Bossman: @Maciej_Zimnoch ^^
@Vitaly_Ivanov: What does this mean for scylla-operator? Should it use a bitwise AND between the cpuinfo and Cpus_allowed masks to correctly set --cpuset?
@Maciej_Zimnoch: I would need to check what those constraints look like in a container environment; maybe it's standardized, I'm not sure. So far, all the major cloud providers I've checked fill cpus_allowed with the CPUs the container is actually able to run on: all of them when using CFS shares, or a subset when running on exclusively allocated ones.