Originally from the User Slack
@Aleksander_Karoński: Hi!
I’m trying to upgrade my Scylla Kubernetes setup and in the process I encountered a problem.
I deployed a ScyllaCluster with the following placement setup (below). The setup consists of 3 separate nodes and each node contains one Scylla instance. I am separating them with nodeAffinities, podAntiAffinities, and tolerations. It worked on an older operator version, but now there are some cleanup jobs created with the exact same affinities as defined in the ScyllaCluster(?). Those jobs cannot be scheduled because the podAntiAffinity rule prevents it.
Do you have any recommendations on how to fix this issue?
placement:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: scylladb
          operator: In
          values:
          - default
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          scylla/cluster: default
      topologyKey: kubernetes.io/hostname
  tolerations:
  - effect: NoSchedule
    key: scylladb
    operator: Equal
    value: default
@Maciej_Zimnoch: Instead of using requiredDuringSchedulingIgnoredDuringExecution in podAntiAffinity, you can use preferredDuringSchedulingIgnoredDuringExecution, which should solve it.
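For illustration, a soft (preferred) anti-affinity rule with the same label selector and topology key as above would look roughly like this in the placement section; the weight of 100 is an arbitrary choice, not something from the thread:

placement:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            scylla/cluster: default
        topologyKey: kubernetes.io/hostname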
@Aleksander_Karoński: Yes, but is there any guarantee that the job won’t be scheduled for instance-0 before instance-2 is scheduled? This could lead to 2 instances being placed on the same node (because a job has taken a free spot first).
@Maciej_Zimnoch: Cleanup jobs are created after scaling is finished, so they cannot prevent Scylla Pods from being scheduled.
@Aleksander_Karoński: Thanks, that’s what I wanted to hear, because I cannot find any docs about it (if such docs exist, please link them here).
@Maciej_Zimnoch: You can read entire proposal for running cleanup here: https://github.com/scylladb/scylla-operator/blob/master/enhancements/proposals/1207-cleanup-after-scaling/README.md