Originally from the User Slack
@Dominik_Mankowski: Is it normal that during a replace operation (https://opensource.docs.scylladb.com/branch-5.4/operating-scylla/procedures/cluster-management/replace-dead-node.html), the node that is replacing the dead node (the new node has the same IP as the old one) has status UN in nodetool status, even though it hasn’t synced all the data? For example, nodetool status on the new node looks like this (all the other nodes in the cluster also report this node as UN, while I had expected UJ):
root@scylladb-drp-test-p-0:~# nodetool status
Datacenter: az_we_dc1
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load     Tokens  Owns  Host ID                               Rack
UN  172.26.108.8  ?        256     ?     fafffa99-40ab-49f4-9c3e-864cb4fd1ad3  rack1
UN  172.26.108.6  ?        256     ?     7129e789-afc4-4211-ba32-57cf776e4550  rack1
UN  172.26.108.7  ?        256     ?     39d14830-9a6c-42c1-917b-97abbd6edc0a  rack1
UN  172.26.108.4  ?        256     ?     a2018d48-80e7-4b3b-b65e-f16bedb648f2  rack1
UN  172.26.108.5  0 bytes  256     ?     null                                  rack1
Scylla 5.4.9
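For readers following along: the first column in the output above is a two-letter code. According to the legend nodetool prints, the first letter is the status (Up/Down) and the second is the state (Normal/Leaving/Joining/Moving), so UN means Up/Normal and UJ would mean Up/Joining. Below is a minimal Python sketch for pulling those codes out of captured nodetool status text, for example to compare what each node reports. The function name parse_status_codes and the lookup tables are made up for illustration and are not part of any Scylla tooling.

```python
# Letter meanings taken from the legend lines in the nodetool status output.
STATUS = {"U": "Up", "D": "Down"}
STATE = {"N": "Normal", "L": "Leaving", "J": "Joining", "M": "Moving"}


def parse_status_codes(output):
    """Return (address, status, state) for each node row in the given text."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        code = fields[0]
        # Node rows start with a two-letter code such as UN, UJ or DN;
        # header and legend lines do not match this pattern and are skipped.
        if len(code) == 2 and code[0] in STATUS and code[1] in STATE:
            rows.append((fields[1], STATUS[code[0]], STATE[code[1]]))
    return rows


sample = """\
Datacenter: az_we_dc1
-- Address Load Tokens Owns Host ID Rack
UN 172.26.108.4 ? 256 ? a2018d48-80e7-4b3b-b65e-f16bedb648f2 rack1
UN 172.26.108.5 0 bytes 256 ? null rack1
"""

for address, status, state in parse_status_codes(sample):
    print(address, status, state)
```

Only the first two fields of each row are read, so the sketch does not depend on the Load column being one token ("?") or two ("0 bytes").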
@avi: @Kamil_Braun do you know?
@Felipe_Cardeneti_Mendes: It is normal because the node isn’t joining. It was dead (DN), and now you are replacing it with another. So think about it this way: The node has already joined, you are bringing it back up
@Dominik_Mankowski: > It is normal because the node isn’t joining.
This contradicts what the metrics/dashboard showed (the node was reported as Joining)
@Felipe_Cardeneti_Mendes: Well, then that’s a monitoring issue. Reporting it as replacing would probably be more accurate
@Kamil_Braun: in short: don’t use replace-with-same-IP, because it’s dangerous and has a bunch of these stupid quirks
one recent issue we found with replace-with-same-IP: https://github.com/scylladb/scylladb/issues/19975
GitHub: Failure during replace-with-same-IP leaves the node without STATUS application_state (permanently), and token_metadata inconsistent (until restart) (applies to gossiper / “node-ops” based topology changes) · Issue #19975 · scylladb/scylladb
well, if it completes, then it will be fine
but generally, try avoiding it, use replace-with-different-IP instead
@Dominik_Mankowski: @Kamil_Braun thanks for the hint. Would it be OK if we first removed the dead node from the cluster with nodetool removenode (scale in), and then simply added a node back to the cluster with the same IP (scale out)?
@Kamil_Braun: yes, but it will take 2x as much time as replace, because the data streaming phase will have to be done twice
even more, since at the end you should run cleanup (IIRC cleanup is not really necessary if you use replace, but it is if you use remove + add)
@Dominik_Mankowski: got it, thanks
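The remove + add sequence discussed above can be written down as an ordered checklist. This is only a sketch under stated assumptions: the helper name remove_then_add_plan is hypothetical, the Host ID is a placeholder (it has to be the dead node's Host ID as shown by nodetool status before the node is replaced), and running cleanup on every node except the newly added one is an assumption based on the usual add-node procedure, not something stated in the thread. The code only builds the list of steps; it does not run anything against a cluster.

```python
def remove_then_add_plan(dead_host_id, other_nodes):
    """Build an ordered list of (where, action) steps for remove + add."""
    plan = [
        # Streaming pass 1: remaining nodes take over the dead node's ranges.
        ("any live node", f"nodetool removenode {dead_host_id}"),
        # Streaming pass 2: the new node bootstraps and streams its ranges.
        ("new node", "start Scylla and wait until the node is reported as UN"),
    ]
    # Assumption: cleanup runs on each pre-existing node, not on the new one.
    for node in other_nodes:
        plan.append((node, "nodetool cleanup"))
    return plan


# Placeholder Host ID; addresses are the surviving nodes from the thread.
others = ["172.26.108.4", "172.26.108.6", "172.26.108.7", "172.26.108.8"]
for where, action in remove_then_add_plan("<dead-node-host-id>", others):
    print(f"{where}: {action}")
```

Laid out this way, the two streaming passes and the trailing cleanup steps show why this route takes more than twice as long as a single replace.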