Cannot remove unreachable dead nodes from my cluster

Two nodes in my ScyllaDB cluster (which has 4 nodes in total) went down because of corrupted storage. One of them was the seed node. I did a rolling restart, changing the seed to one of the two remaining available nodes.
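
For reference, this is roughly what the seed change looked like in /etc/scylla/scylla.yaml on each node (the seed_provider block is the standard one; the IP is just the live node I picked as the new seed):

seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # either of the two remaining live nodes works as the seed here
          - seeds: "172.31.75.191"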

But now I’m unable to remove the dead nodes from the cluster. The nodetool removenode command gets stuck indefinitely.

This is my nodetool status:

Datacenter: ap-south
====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens       Owns    Host ID                               Rack
UN  172.31.75.191  103.87 GB  256          ?       26a15884-35ee-4f05-a08b-5fb52f524de6  1a
UN  172.31.75.205  92.92 GB   256          ?       d12af5c3-1cb4-43ae-b605-e0da88d6abcd  1a
DN  172.31.75.244  ?          256          ?       ab5438e7-7729-4d14-9e9e-d84459525543  1a
DN  172.31.75.145  ?          256          ?       eb4e084e-f61b-4ace-9250-c2c52aec1b13  1a

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless

I ran this command:

nodetool removenode --ignore-dead-nodes ab5438e7-7729-4d14-9e9e-d84459525543 eb4e084e-f61b-4ace-9250-c2c52aec1b13

scylla logs are filled with these errors:

Feb 05 16:17:42 ip-172-31-75-205 scylla[9011]:  [shard 0] cdc - Could not update CDC description table with generation (2023/08/30 16:58:04, 6b722d16-8663-48d2-abc9-6ee7a7b7fc29): exceptions::unavailable_exception (Cannot achieve consistency level for cl QUORUM. Requires 1, alive 0). Will try again.
Feb 05 16:18:40 ip-172-31-75-205 scylla[9011]:  [shard 0] service_level_controller - update_from_distributed_data: failed to update configuration for more than  2160 seconds : exceptions::unavailable_exception (Cannot achieve consistency level for cl ONE. Requires 1, alive 0)

I’m also unable to add any new nodes to the cluster, because gossip reports that it can’t add a new node while the status of any existing node is UNKNOWN.

Is there a way I can forcefully remove the dead nodes from my cluster?

I see this was also discussed in #10292, in comments 1, 2, and 3.

I will copy the solution here, for searchability:

Asias:

First, when a node is unreachable, it is much better to run the replace-a-dead-node procedure than to run removenode. When more than one node is down, you can use the ignore_dead_nodes_for_replace option to ignore the other down peer while running the replace. With replace, you can add the 2 nodes back.
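
For searchability, a sketch of what that looks like in /etc/scylla/scylla.yaml on the replacement node, assuming a Scylla version that supports host-ID-based replace (option names as in the replace-a-dead-node docs; the host IDs are the ones from my cluster):

# host ID of the dead node this new node is replacing
replace_node_first_boot: ab5438e7-7729-4d14-9e9e-d84459525543
# other dead peer to ignore while the replace runs (comma-separated list if several)
ignore_dead_nodes_for_replace: eb4e084e-f61b-4ace-9250-c2c52aec1b13

The replace_node_first_boot line should be removed again once the replacement node has finished bootstrapping.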

jarPotato:

I tried the replace-node procedure with the two config params you mentioned, but faced an error that looked like #13865.
But debugging that issue nudged me towards the Raft manual recovery procedure guide, and with it I was able to recover my cluster! (removenode worked once all UN nodes were in recovery mode.)
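
For reference, the step that put each live node into RECOVERY mode looked roughly like this in cqlsh (from my notes while following the manual recovery guide; please check the guide for your version before running anything):

-- run on every live node, then restart scylla on that node
UPDATE system.scylla_local SET value = 'recovery' WHERE key = 'group0_upgrade_state';

Once all live nodes were back up in recovery mode, removenode went through, and I followed the rest of the guide to clear the old Raft state and leave recovery.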

Thank you for helping me out, @asias!
