Originally from the User Slack
@Bohdan_Smal: Hello.
I wanted to clarify a few basic questions. In the Production Readiness Guidelines, it’s stated that “All configuration settings for all nodes in the same cluster should be identical or coherent.” However, I couldn’t find any additional description of the parameters indicating whether they are coherent or not. There’s only information about whether live updates are supported or not. For example, on a production cluster with 6 nodes (Ec2Snitch, 2 nodes in each zone), I need to enable additional metrics by adding the “enable_keyspace_column_family_metrics” parameter to the configuration. Most likely, there shouldn’t be any issues with this setting, but I still wanted to clarify. Additionally, I wanted to ask if it’s sufficient to simply execute “sudo systemctl restart scylla-server.service” on each node in sequence after adding this parameter to the configuration, or if it would be more reliable to first execute “nodetool drain” and then restart the ScyllaDB service. These questions may seem trivial, but I wanted to clarify with you. Thank you.
Production Readiness Guidelines | ScyllaDB Docs
Configuration Parameters | ScyllaDB Docs
@Felipe_Cardeneti_Mendes: The idea behind the warning is to prevent configuration drift, such as having enable_keyspace_column_family_metrics enabled on just one node and not on the other nodes in the cluster.
Of course, you can test specific settings on a single node before deploying them globally. But it’s a good idea to always use some tool (such as Ansible) to ensure that the configuration eventually becomes consistent across the board.
It is a good practice to drain the node first, then restart.
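As an illustration of the drift check described above, here is a minimal Python sketch that groups nodes by the content of their configuration files. It is not an official ScyllaDB tool. The node names, the sample config text, and the helper names (`normalize`, `group_by_config`, `has_drift`) are hypothetical, and the line-based key matching is a simplification that only handles top-level `key: value` lines. Settings that are expected to differ per node, such as `listen_address`, can be skipped through `ignore_keys`.

```python
import hashlib
from collections import defaultdict


def normalize(config_text: str, ignore_keys=()) -> str:
    """Drop blank lines, comment-only lines, and lines whose key is ignored."""
    kept = []
    for line in config_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        # Simplification: treat the text before the first ':' as the key.
        key = stripped.split(":", 1)[0].strip()
        if key in ignore_keys:
            continue
        kept.append(line.rstrip())
    return "\n".join(kept)


def group_by_config(configs: dict, ignore_keys=()) -> dict:
    """Map a content digest to the sorted list of nodes sharing that config."""
    groups = defaultdict(list)
    for node, text in configs.items():
        normalized = normalize(text, ignore_keys)
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(node)
    return {digest: sorted(nodes) for digest, nodes in groups.items()}


def has_drift(configs: dict, ignore_keys=()) -> bool:
    """True when at least two nodes have different effective configs."""
    return len(group_by_config(configs, ignore_keys)) > 1


if __name__ == "__main__":
    # Hypothetical config snippets, as if fetched from three nodes.
    sample = {
        "node1": "enable_keyspace_column_family_metrics: true\nlisten_address: 10.0.0.1\n",
        "node2": "enable_keyspace_column_family_metrics: true\nlisten_address: 10.0.0.2\n",
        "node3": "listen_address: 10.0.0.3\n",
    }
    for digest, nodes in group_by_config(sample, ("listen_address",)).items():
        print(digest[:12], nodes)
```

In this sample, node3 lands in its own group because it has not received the new parameter yet, which is the kind of drift a tool such as Ansible is meant to eliminate.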
@Bohdan_Smal: @Felipe_Cardeneti_Mendes
Please, could you provide more details on what you meant? The cluster is under load, so to avoid downtime, I’ll sequentially update the configuration on each node, drain the node, and restart the Scylla service to bring the node back into the cluster. Therefore, for a certain period, the configuration on the nodes will differ, but approximately within 10 minutes, when I execute my commands on each node in sequence, the configuration will be the same everywhere. My main goal is to avoid downtime or any errors on the ScyllaDB side, as requests will continue to be processed during this time.
@Felipe_Cardeneti_Mendes: Your flow is correct.
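The flow confirmed above (update the configuration, drain, restart, one node at a time) could be sketched as the following Python helper. It only builds an ordered list of commands and does not execute anything. The function name and node names are hypothetical, the configuration file is assumed to have been updated on each node before its steps run, and the final `nodetool status` check is an added assumption for confirming that the node has rejoined before moving on, not something stated in the thread.

```python
def rolling_restart_plan(nodes, service="scylla-server.service"):
    """Return ordered (node, command) steps for a one-node-at-a-time restart.

    All steps for one node come before any step for the next node, so only
    a single node is out of the cluster at any time.
    """
    plan = []
    for node in nodes:
        # Flush memtables and stop accepting new requests on this node.
        plan.append((node, "nodetool drain"))
        # Restart the service so the updated configuration is picked up.
        plan.append((node, f"sudo systemctl restart {service}"))
        # Assumption: inspect cluster state before continuing to the next node.
        plan.append((node, "nodetool status"))
    return plan


if __name__ == "__main__":
    # Hypothetical node names for a 6-node cluster.
    hosts = [f"scylla-{i}" for i in range(1, 7)]
    for host, command in rolling_restart_plan(hosts):
        print(f"{host}: {command}")
```

Keeping the plan as plain data makes it easy to review before running it through whatever remote execution mechanism is already in use.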