Originally from the User Slack
@Mustafa_Shakir: I have two dead, unreachable nodes in my ScyllaDB cluster.
I want to run nodetool removenode on them.
This is the command I execute: nodetool removenode --ignore-dead-nodes ab5438e7-7729-4d14-9e9e-d84459525543 eb4e084e-f61b-4ace-9250-c2c52aec1b13
where ab5438e7-7729-4d14-9e9e-d84459525543
and eb4e084e-f61b-4ace-9250-c2c52aec1b13
are dead unreachable nodes.
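For reference, the ScyllaDB nodetool documentation describes --ignore-dead-nodes as taking a comma-separated list of other dead nodes to skip, followed by the single host ID to remove. Read that way, the command above removes eb4e084e-f61b-4ace-9250-c2c52aec1b13 while ignoring ab5438e7-7729-4d14-9e9e-d84459525543, so each dead node would need its own removenode run. A minimal dry-run sketch under that assumption (build_removenode_cmd is a hypothetical helper that only prints the command line; it does not run nodetool):

```shell
# Host IDs of the two dead nodes, taken from the message above.
DEAD_A="ab5438e7-7729-4d14-9e9e-d84459525543"
DEAD_B="eb4e084e-f61b-4ace-9250-c2c52aec1b13"

# build_removenode_cmd TARGET [OTHER_DEAD_NODE...]
# Hypothetical helper: prints (does not execute) a removenode command that
# removes TARGET and lists any other dead nodes, comma-separated, after
# --ignore-dead-nodes. The body runs in a subshell so no variables leak.
build_removenode_cmd() (
  target="$1"
  shift
  # Join the remaining arguments with commas.
  ignore="$(IFS=,; printf '%s' "$*")"
  if [ -n "$ignore" ]; then
    printf 'nodetool removenode --ignore-dead-nodes %s %s\n' "$ignore" "$target"
  else
    printf 'nodetool removenode %s\n' "$target"
  fi
)

# First run: remove node A while node B is still dead.
build_removenode_cmd "$DEAD_A" "$DEAD_B"
# Second run: remove node B once node A is gone.
build_removenode_cmd "$DEAD_B"
```

Whether the second run still needs --ignore-dead-nodes depends on whether the first removal completed, so it is worth checking nodetool status before running it.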
The command gets stuck, and these are the logs I can see:
Feb 05 16:17:42 ip-172-31-75-205 scylla[9011]: [shard 0] cdc - Could not update CDC description table with generation (2023/08/30 16:58:04, 6b722d16-8663-48d2-abc9-6ee7a7b7fc29): exceptions::unavailable_exception (Cannot achieve consistency level for cl QUORUM. Requires 1, alive 0). Will try again.
Feb 05 16:18:40 ip-172-31-75-205 scylla[9011]: [shard 0] service_level_controller - update_from_distributed_data: failed to update configuration for more than 2160 seconds : exceptions::unavailable_exception (Cannot achieve consistency level for cl ONE. Requires 1, alive 0)
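The "Requires 1, alive 0" part of both log lines can be read with the usual quorum arithmetic: QUORUM needs floor(RF / 2) + 1 live replicas, where RF is the replication factor. A minimal sketch of that arithmetic (quorum is a hypothetical helper, and mapping "Requires 1" to an effective RF of 1 for the affected data is an assumption, not something the logs state):

```shell
# quorum RF
# Hypothetical helper: prints the number of live replicas QUORUM needs
# for a given replication factor, i.e. floor(RF / 2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Show the requirement for a few common replication factors.
for rf in 1 2 3 5; do
  echo "RF=$rf -> QUORUM needs $(quorum "$rf") live replica(s)"
done
```

Under that reading, "alive 0" would mean that every replica owning the affected rows sits on a dead node, so neither QUORUM nor ONE can be satisfied while those nodes are still part of the ring.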
This error looks similar to this issue:
https://github.com/scylladb/scylladb/issues/10291
GitHub: service_level_controller - update_from_distributed_data failed to update configuration · Issue #10291 · scylladb/scylladb
In case it helps anyone in the future:
I was able to recover my cluster by following this guide:
https://opensource.docs.scylladb.com/stable/architecture/raft.html#raft-manual-recovery-procedure
Raft Consensus Algorithm in ScyllaDB | ScyllaDB Docs