Originally from the User Slack
@Patryk_Kandziora: Hi. I have a question about spinning up a 3-node Scylla cluster on AWS using autoscaling. For now I spin up the 1st node with the following user_data:
{
  "scylla_yaml": {
    "cluster_name": "${cluster_name}",
    "experimental": false,
    "endpoint_snitch": "Ec2Snitch"
  }
}
I also use a Lambda and CloudWatch Events to create a new A record in Route 53. That works fine. The rest of the nodes in the cluster are spun up with the following user_data JSON:
{
  "scylla_yaml": {
    "cluster_name": "${cluster_name}",
    "experimental": false,
    "seed_provider": [
      {
        "class_name": "org.apache.cassandra.locator.SimpleSeedProvider",
        "parameters": [
          {
            "seeds": "${seeds}"
          }
        ]
      }
    ],
    "endpoint_snitch": "Ec2Snitch"
  }
}
where ${seeds} is the mentioned A record with the IP address of the 1st node. All good so far.
Problem:
If the 1st node goes down, the ASG (auto-scaling group) will spin up a new EC2 instance and the Lambda will update the A record, but node2 and node3 are not checking the A record anymore. The node1 user_data does not point to the A record, so it cannot see the other nodes in the cluster. In this scenario I now have 2 clusters - node1 as a standalone cluster and the rest of the nodes as cluster 2.
I tried to create the A record for the 1st node with the 127.0.0.1 IP and then remove it with the Lambda on the first run, but node1 just keeps reporting errors about failing to connect to 127.0.0.1.
Any ideas on how to approach this scenario?
@Felipe_Cardeneti_Mendes: You want node1 to point to node2/node3 as seeds so that node1 is able to fetch your cluster’s topology information. That should be sufficient for it to start replacing the previously failed node1
@Patryk_Kandziora: Hi @Felipe_Cardeneti_Mendes - that's the problem here. With the ASG I cannot do it, because the bootstrap script for the 1st node has no one to point to yet. node2 and node3 are not UP yet, so there are no IP addresses, and with the ASG I cannot specify which IP a specific VM will get - it's random, based on the subnet and subnet mask. Looks like a chicken-and-egg / catch-22 situation to me.
@Felipe_Cardeneti_Mendes: Well, that’s the problem with ASG. Why not use our Operator instead? Nonetheless, that’s just one corner case you will have to handle. You could bootstrap the cluster with multiple seed nodes, where n1 would be the one with the lowest IP address; then during node1 replacement it would perform a shadow round with all the entries within your n1 script.
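For illustration, node1's user_data would then also carry a seed_provider block, so a replacement node1 knows where to find the rest of the cluster. This is only a sketch: the ${seed_node_N} placeholders are hypothetical (not from the thread) and stand for the addresses or DNS name of the other seed nodes:
{
  "scylla_yaml": {
    "cluster_name": "${cluster_name}",
    "experimental": false,
    "seed_provider": [
      {
        "class_name": "org.apache.cassandra.locator.SimpleSeedProvider",
        "parameters": [
          {
            "seeds": "${seed_node_1},${seed_node_2},${seed_node_3}"
          }
        ]
      }
    ],
    "endpoint_snitch": "Ec2Snitch"
  }
}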
@Patryk_Kandziora: Don't have that much time to go into the Operator now. All load tests were performed on Scylla VMs. Didn't know it would be such a big problem, really. Will check the Alternator - didn't know about it. Thx
@Felipe_Cardeneti_Mendes: ah, ignore the above link, that was meant for another thread
this is the Operator one: https://operator.docs.scylladb.com/
@Patryk_Kandziora: Yeah, just reading about the Alternator and waiting for info about the IP solution
No - the Operator is not an option now - it's very late for us at this point. Need to figure out how to approach the ASG - I might overwrite the bootstrap script after the initial deployment of node1 and add the seeds option, or spin everything up at the same time and see what happens. Is there a retry for reading the records provided in seeds when one is missing? Any option to change that behaviour or to add a delay?
@Felipe_Cardeneti_Mendes: yeah, if a seed is unavailable it will eventually retry reading it. Keep in mind that any changes to your seed configuration (i.e. adding, removing, or changing entries) will require a rolling restart of the nodes.
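For context, changing the seed list means editing the seed_provider section of each node's /etc/scylla/scylla.yaml and then restarting the nodes one at a time. A minimal sketch, with placeholder addresses that are assumptions rather than values from the thread:
seed_provider:
    # hypothetical seed entries - replace with your own addresses or DNS name
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "10.0.0.11,10.0.0.12"
# after editing, restart each node one at a time:
#   sudo systemctl restart scylla-server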
@Patryk_Kandziora: My thinking is - if I add Scylla Manager to the cluster, will it take care of that? Or am I wrong here?
@Felipe_Cardeneti_Mendes: it won’t
@Patryk_Kandziora: Looks like in the long run the Operator will have to be the option here, or further scripting based on CloudWatch Events and Lambdas.
@Felipe_Cardeneti_Mendes you mentioned a rolling restart.
Did you mean I need to trigger sudo systemctl restart scylla-server on each node?
If so, I can still see the old, no-longer-existing node1 IP address as DN in nodetool status.
It looks like restarting scylla-server does not fetch IP addresses from the updated A record. What am I missing here?
@Felipe_Cardeneti_Mendes: whenever you update the seed list, yes. Are you using a DNS entry instead of an IP address? There are internal system tables (e.g. system.peers) which hardcode the IP address of a node, so it won’t be reflected in nodetool status until you replace the node with the new one (which will then update the internal tables to reflect the new node).
@Patryk_Kandziora: Yes, I use a DNS entry (aggregated) with the IP addresses of all Scylla nodes. So each time a new instance is in the running state, CloudWatch Events will trigger the Lambda and update the IP addresses in that record.
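For reference, the Route 53 update the Lambda performs is equivalent to an UPSERT on the aggregated record. A sketch with the AWS CLI - the zone ID, record name, and addresses below are placeholders, not values from the thread:
aws route53 change-resource-record-sets \
  --hosted-zone-id <your-hosted-zone-id> \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "scylla-seeds.example.internal",
        "Type": "A",
        "TTL": 60,
        "ResourceRecords": [{"Value": "10.0.0.11"}, {"Value": "10.0.0.12"}]
      }
    }]
  }'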
@Felipe_Cardeneti_Mendes: that’s fine, just replace the old failed node and then the entry should revert to UN
@Patryk_Kandziora: but the new node will have different IP - just restarted the scylla-server
service on the new node, updated the scylla.yaml
seed_provider: re-pointing it to the aggregated DNS. The node1 is not joining the cluster - neither any update on node2 and node3. hmm….
Yeah, so you cannot add a new node if there is a DN node in the cluster. You need to run nodetool removenode first, then restart scylla-server on the new node.
Regarding the ASG and node01 issue - I ended up overwriting the user_data on the 1st node as an additional step, and that fixed the issue. I'm still having the issue with removing the down node from the cluster before I can add a new node. It's not a brilliant approach, but it works. It would be better to go with plain EC2 instances instead of an ASG, I think.
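For reference, the remove-then-rejoin sequence described above might look roughly like this - a sketch only, with the Host ID left as a placeholder since the real value comes from your own nodetool status output:
# on any live node, find the Host ID of the node shown as DN
nodetool status
# remove the dead node by its Host ID (placeholder below)
nodetool removenode <host-id-of-the-DN-node>
# on the new node, once /etc/scylla/scylla.yaml points at the right seeds
sudo systemctl restart scylla-server
# confirm the new node shows up as UN
nodetool status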
Question - with replication set to 2 and a cluster with 3 nodes - how many nodes can go down for the cluster to still be operational? I assume just one, right?
3 nodes - 1 down
5 nodes - 2 down
7 nodes - 3 down
etc.
@Felipe_Cardeneti_Mendes: depends on your consistency level. See https://opensource.docs.scylladb.com/stable/cql/consistency-calculator.html
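As a rough worked example (added for clarity, not part of the thread): QUORUM needs floor(RF/2) + 1 replicas, so with RF=2 it needs both replicas of a key. At CL=QUORUM, losing any single node therefore makes some token ranges unavailable, while at CL=ONE one node can be down and the surviving replica still serves requests. The "3 nodes - 1 down, 5 - 2, 7 - 3" pattern applies to the number of replicas (RF) of each key at QUORUM, not to the cluster size, unless RF equals the number of nodes.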