Originally from the User Slack
@Aleksander_Karoński: Hi!
I’m trying to upgrade my Scylla Kubernetes setup and in the process I encountered a problem.
I deployed a ScyllaCluster with the following placement setup (below). The setup consists of 3 separate nodes and each node contains one Scylla instance. I am separating them with nodeAffinities, podAntiAffinities, and tolerations. It worked on an older operator version, but now there are some cleanup jobs created with the exact same affinities as defined in the ScyllaCluster(?). Those jobs cannot be scheduled because the podAntiAffinity rule prevents it.
Do you have any recommendations on how to fix this issue?
placement:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: scylladb
          operator: In
          values:
          - default
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          scylla/cluster: default
      topologyKey: kubernetes.io/hostname
  tolerations:
  - effect: NoSchedule
    key: scylladb
    operator: Equal
    value: default
@Maciej_Zimnoch: Instead of using requiredDuringSchedulingIgnoredDuringExecution in podAntiAffinity, you can use preferredDuringSchedulingIgnoredDuringExecution, which should solve it.
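For illustration, a soft (preferred) anti-affinity rule with the same label selector and topology key as above would look roughly like this in the placement section; the weight of 100 is an arbitrary choice, not something from the thread:

placement:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            scylla/cluster: default
        topologyKey: kubernetes.io/hostname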
@Aleksander_Karoński: Yes, but is there any guarantee that the job won’t be scheduled for instance-0 before instance-2 is scheduled? This could lead to 2 instances being placed on the same node (because a job has taken a free spot first).
@Maciej_Zimnoch: Cleanup jobs are created after scaling is finished, so they cannot prevent Scylla Pods from being scheduled.
@Aleksander_Karoński: Thanks, that’s what I wanted to hear, because I cannot find any docs about it (if such docs exist, please link them here).
@Maciej_Zimnoch: You can read entire proposal for running cleanup here: https://github.com/scylladb/scylla-operator/blob/master/enhancements/proposals/1207-cleanup-after-scaling/README.md