Scylla cluster membership issue after failed change

Have a little problem. I'm trying to learn how to deal with and fix clustering issues in Scylla (currently 5.2.2). I created a cluster and then added 2 nodes from a different datacenter into it, so currently datacenter1 = 3 nodes and datacenter2 = 2 nodes (both cassandra-rackdc.properties files use local=true). I made a configuration error in the cassandra-rackdc.properties file when I tried to add a third node to datacenter2 and set the dc tag to datacenter1 instead. The node wouldn't add, it just stalled. I tried to re-add it, and at first it said that it already existed. After trying to figure this out, I wiped the data and tried again, and then it gave me an error that it couldn't resolve IP addresses for nodes with certain IDs, listing multiple IDs that weren't in the cluster. Now I can't add any nodes. I created a new node and attempted to add it to the cluster, and it just froze. I attempted to reboot and try again, and now it's saying that the node already exists in the cluster. However, it doesn't show up in nodetool status or system.peers.

How can I fix this issue so I can add nodes to the cluster?

Hi,

please read the documentation on handling membership change failures, and follow the instructions there: Handling Cluster Membership Change Failures | ScyllaDB Docs


Thank you for this. I read through it, and at the end it says: "If removenode returns an error like:

nodetool: Scylla API server HTTP POST to URL '/storage_service/remove_node' failed: std::runtime_error (removenode[12e7e05b-d1ae-4978-b6a6-de0066aa80d8]: Host ID 42405b3b-487e-4759-8590-ddb9bdcebdc5 not found in the cluster)
and you're sure that you're providing the correct Host ID, it means that the member was already removed and you don't have to clean up after it."

However, here's the output from my attempts to remove a node.

10.0.137.180
nodetool info
ID : dd530cfb-7f4e-42c3-be6a-ce093d263b96
Gossip active : true
nodetool: Scylla API server HTTP GET to URL '/storage_service/rpc_server' failed: Not found
See 'nodetool help' or 'nodetool help <command>'.

nodetool decommission
nodetool: Scylla API server HTTP POST to URL '/storage_service/decommission' failed: std::runtime_error (local node is not a member of the token ring yet)
See 'nodetool help' or 'nodetool help <command>'.

10.0.130.77 (one of the nodes in the cluster)
nodetool describecluster
Cluster Information:
    Name: Veeps
    Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
    DynamicEndPointSnitch: disabled
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        6e0ec14b-1d4b-305b-9bf9-d420da03eb45: [10.0.137.180]
        7f3ff0e1-96d7-3145-8883-f6125b5522f6: [10.0.130.77, 10.0.137.241]

nodetool removenode dd530cfb-7f4e-42c3-be6a-ce093d263b96
nodetool: Scylla API server HTTP POST to URL '/storage_service/remove_node' failed: std::runtime_error (removenode[bf9d0aed-d300-48e0-ad70-8637d11972d6]: Node dd530cfb-7f4e-42c3-be6a-ce093d263b96 not found in the cluster)
See 'nodetool help' or 'nodetool help <command>'.

I see you marked my answer as the solution. Does the problem still persist, or have you managed to solve it?

I see you tried to run nodetool decommission – on which node, and why? (The error you got is suspicious)

Anyway, if you'd like us to proceed with the investigation, please post the output of:

  • nodetool status
  • select * from system.cluster_status executed on one of the nodes with CQL (e.g. cqlsh)
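For example, on one of the live nodes (the IP is taken from your output above; add -u/-p if authentication is enabled):

nodetool status
cqlsh 10.0.130.77 -e "select * from system.cluster_status;"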

I didn’t mark it as a solution. Someone else did.

Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns  Host ID                               Rack
UN  10.0.137.241  124.76 MB  256     ?     08f06ec2-d35d-444c-afa6-6566b0eada57  rack1
UN  10.0.130.77   123.49 MB  256     ?     0b7f7c40-ba5c-42c1-94f2-207aca6facbe  rack1

 peer           | dc          | host_id                              | load        | owns     | status  | tokens | up
----------------+-------------+--------------------------------------+-------------+----------+---------+--------+-------
 10.250.135.89  | null        | null                                 | 2.82419e+06 | null     | LEFT    |      0 | False
 10.250.141.220 | null        | null                                 | 3.0423e+06  | null     | LEFT    |      0 | False
 10.0.130.77    | datacenter1 | 0b7f7c40-ba5c-42c1-94f2-207aca6facbe | 1.29489e+08 | 0.500783 | NORMAL  |    256 |  True
 10.0.137.241   | datacenter1 | 08f06ec2-d35d-444c-afa6-6566b0eada57 | 1.30818e+08 | 0.499217 | NORMAL  |    256 |  True
 10.0.137.180   | null        | null                                 |             | null     | UNKNOWN |      0 |  True

Ok, so it looks like actually stopping scylla-server on .180 removed it from that list. However, what are your thoughts on the other 2 listed? Those were already offline.

You mean .89 and .220? They are shown in LEFT status. Expected if you removed them.

So now do I understand correctly that you have 2 nodes in the cluster, both in datacenter1/rack1?

Sorry, I forgot to ask before – can you also provide output of select * from system.raft_state?

cassandra@cqlsh> select * from system.raft_state;

group_id | disposition | server_id | can_vote
----------+-------------+-----------+----------

(0 rows)

and yes, 2 nodes, both in datacenter1/rack1.

Hm, I was convinced that you had Raft enabled in your cluster (due to the error you said you were getting, that it cannot resolve IP addresses), but apparently not.

Do both nodes return this result when you connect to cqlsh with them? (Empty system.raft_state table)

What is your conf/scylla.yaml? Do you have the consistent_cluster_management flag set there?
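For example, something like this should show whether the flag is set (the path assumes a default package install; adjust it if your scylla.yaml lives elsewhere):

grep consistent_cluster_management /etc/scylla/scylla.yaml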

Yeah, that's part of where I was confused as well. I have consistent_cluster_management set to true on both nodes, and it has been from day 1. Both nodes return that result. Another thing to mention: the query select value from system.scylla_local where key = 'raft_group0_id'; doesn't return anything either, because that key doesn't exist in scylla_local. That comes from the link you sent me earlier in the post.

Suspicious indeed.

You haven’t done the “manual Raft recovery procedure”, have you?

What is the result of SELECT * FROM system.scylla_local WHERE key = 'group0_upgrade_state'; on each node?
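For example, run it against each node's address (IPs taken from your nodetool status output; add -u/-p if authentication is enabled):

cqlsh 10.0.137.241 -e "SELECT * FROM system.scylla_local WHERE key = 'group0_upgrade_state';"
cqlsh 10.0.130.77 -e "SELECT * FROM system.scylla_local WHERE key = 'group0_upgrade_state';"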

Ok, now I feel a little dumb. I went into recovery mode the night before last while trying to fix this. I have backed out of it, and now this is the result of that query:

 key                  | value
----------------------+---------------------------
 group0_upgrade_state | use_post_raft_procedures

WDYM by "backed out"? Have you finished the recovery procedure? If you did anything from the procedure after the "enter recovery mode" step (like truncating some Raft tables), then you need to finish it. If you only entered recovery mode, but then set group0_upgrade_state back to use_post_raft_procedures without removing any other data etc., then I guess everything should be fine.
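For reference, I'm assuming "entering recovery mode" here means the CQL update from the 5.2 recovery procedure, and "backing out" means setting the key back, roughly:

-- enter recovery mode (run on every node, then restart the nodes)
UPDATE system.scylla_local SET value = 'recovery' WHERE key = 'group0_upgrade_state';
-- back out without doing the rest of the procedure
UPDATE system.scylla_local SET value = 'use_post_raft_procedures' WHERE key = 'group0_upgrade_state';

(Double-check the exact steps against the docs rather than copying these lines verbatim.)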

So if you did anything else besides just entering recovery mode – you should finish the recovery procedure (starting by going into recovery mode again), and then proceed.

Otherwise we should be able to proceed now.

If we proceed, then the next step is to check the results of select * from system.raft_state again.

Yeah, the only thing I did was enter recovery mode. I didn’t truncate any tables.

 group_id                             | disposition | server_id                            | can_vote
--------------------------------------+-------------+--------------------------------------+----------
 5d63f9a0-ede9-11ee-8462-8160e0cdd5d0 | CURRENT     | 08f06ec2-d35d-444c-afa6-6566b0eada57 | True
 5d63f9a0-ede9-11ee-8462-8160e0cdd5d0 | CURRENT     | 0b7f7c40-ba5c-42c1-94f2-207aca6facbe | True

This is now the result of select * from system.raft_state.

So it looks like nodetool status is consistent with the Raft state; there are no "ghost members" anymore.

Try to boot the new node again. Make sure you don't use old work directories from previous boot attempts, though: if you do it on the same machine, clear the old data from the node which failed to boot.
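On a default package install that is roughly the following (double-check data_file_directories/commitlog_directory in your scylla.yaml first; the paths below are just the defaults):

sudo systemctl stop scylla-server
sudo rm -rf /var/lib/scylla/data/* /var/lib/scylla/commitlog/* /var/lib/scylla/hints/* /var/lib/scylla/view_hints/*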

If it again gets stuck on "fail to resolve IP", please save the host ID; we'll have to determine which node that host ID belongs to.
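One way to check which address a host ID maps to is to query system.peers on one of the existing nodes, e.g.:

cqlsh 10.0.130.77 -e "select peer, host_id from system.peers;"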

Thank you! I will try this now.

Ok, I did the following (noting it here for posterity):

  • Removed all the data from 10.0.137.180
  • Ensured cassandra-rackdc.properties was set to datacenter1/rack1
  • Ensured the seeds were correct in scylla.yaml and that all the parameters matched (consistent_cluster_management, etc.)
  • Started Scylla

Now the node has successfully joined datacenter1. I am now going to create 3 new nodes and attempt to join datacenter2 to this cluster.
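For the datacenter2 nodes, each new node's cassandra-rackdc.properties will contain something like this (the rack name is just an example, and prefer_local is the option I referred to as local=true at the start of the thread):

dc=datacenter2
rack=rack1
prefer_local=true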


Just make sure you do the joining sequentially (as described in the 5.2 docs): start one node, wait until it has fully joined and shows as UN in nodetool status, then start the next.