I am trying to add a new data center to an existing 3-node ScyllaDB cluster.
The old DC and the new DC are at different locations (connected via IPsec).
The node in the new DC has joined the cluster successfully and shows as UN.
The problem arises while running the nodetool rebuild command on the new node.
It runs smoothly for a while, then gets stuck at a certain point and the process stops abruptly.
What could be causing this, and how can I fix it?
sudo nodetool status displays all nodes as Up and Normal.
Version: 5.4.4-0.20240228.58a1be93b212
Logs from journalctl:
get_row_diff_with_rpc_stream_handler from=192.168.10.11, repair_meta_id=6060: seastar::nested_exception: seastar::rpc::clos>
[shard 10:stre] rpc - client 192.168.10.11:56965: server connection dropped: recv: Connection timed out
[shard 2:stre] rpc - client 192.168.10.11:60587: server connection dropped: recv: Connection timed out
[shard 10:stre] rpc - client 192.168.10.11:63025: server connection dropped: recv: Connection timed out
[shard 2:stre] rpc - client 192.168.10.11:65042: server connection dropped: recv: Connection timed out
[shard 4:stre] repair - Failed to process get_row_diff_with_rpc_stream_handler from=192.168.10.11, repair_meta_id=6027: seastar::nested_exception: seastar::rpc::clos>
[shard 0:stre] repair - Failed to process get_row_diff_with_rpc_stream_handler from=192.168.10.11, repair_meta_id=6008: seastar::nested_exception: seastar::rpc::clos>
[shard 9:stre] repair - Failed to process get_row_diff_with_rpc_stream_handler from=192.168.10.11, repair_meta_id=6009: seastar::nested_exception: seastar::rpc::clos>
[shard 1:stre] repair - Failed to process get_row_diff_with_rpc_stream_handler from=192.168.10.11, repair_meta_id=6002: seastar::nested_exception: seastar::rpc::clos>
[shard 3:stre] repair - Failed to process get_row_diff_with_rpc_stream_handler from=192.168.10.11, repair_meta_id=6035: seastar::nested_exception: seastar::rpc::clos>
[shard 14:stre] repair - Failed to process get_row_diff_with_rpc_stream_handler from=192.168.10.11, repair_meta_id=6025: seastar::nested_exception: seastar::rpc::clos>
[shard 8:stre] rpc - client 192.168.10.11:51413: server connection dropped: recv: Connection timed out
[shard 6:stre] rpc - client 192.168.10.11:49731: server connection dropped: recv: Connection timed out
[shard 9:stre] rpc - client 192.168.10.11:57324: server connection dropped: recv: Connection timed out
[shard 9:stre] repair - Failed to process get_row_diff_with_rpc_stream_handler from=192.168.10.11, repair_meta_id=6037: seastar::nested_exception: seastar::rpc::clos>
[shard 14:stre] rpc - client 192.168.10.11:59324: server connection dropped: recv: Connection timed out
[shard 7:stre] rpc - client 192.168.10.11:49627: server connection dropped: recv: Connection timed out
[shard 13:stre] rpc - client 192.168.10.11:59263: server connection dropped: recv: Connection timed out
[shard 3:stre] repair - Failed to process get_row_diff_with_rpc_stream_handler from=192.168.10.11, repair_meta_id=6052: seastar::nested_exception: seastar::rpc::clos>
[shard 14:stre] rpc - client 192.168.10.11:55484: server connection dropped: recv: Connection reset by peer
[shard 3:stre] rpc - client 192.168.10.11:55953: server connection dropped: recv: Connection reset by peer
[shard 3:stre] rpc - client 192.168.10.11:49698: server connection dropped: recv: Connection reset by peer
[shard 5:stre] rpc - client 192.168.10.11:54500: server connection dropped: recv: Connection reset by peer
[shard 11:stre] rpc - client 192.168.10.11:53981: server connection dropped: recv: Connection reset by peer
[shard 7:stre] rpc - client 192.168.10.11:59392: server connection dropped: recv: Connection reset by peer
[shard 13:stre] rpc - client 192.168.10.11:51928: server connection dropped: recv: Connection reset by peer
[shard 13:stre] rpc - client 192.168.10.11:60148: server connection dropped: recv: Connection reset by peer
[shard 0:stre] rpc - client 192.168.10.11:57195: server connection dropped: recv: Connection reset by peer
[shard 2:stre] rpc - client 192.168.10.11:64142: server connection dropped: recv: Connection reset by peer
[shard 5:stre] rpc - client 192.168.10.11:63395: server connection dropped: recv: Connection reset by peer
[shard 4:stre] rpc - client 192.168.10.11:57259: server connection dropped: recv: Connection reset by peer
[shard 3:stre] rpc - client 192.168.10.11:52008: server connection dropped: recv: Connection reset by peer
[shard 0:stre] gossip - failure_detector_loop: Send echo to node 192.168.10.11, status = failed: seastar::rpc::closed_error (connection is closed)
[shard 0:goss] gossip - Fail to send EchoMessage to 192.168.10.11: seastar::rpc::closed_error (connection is closed)
[shard 0:stre] gossip - failure_detector_loop: Send echo to node 192.168.10.11, status = failed: seastar::rpc::timeout_error (rpc call timed out)
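
To get a rough picture of how the connections are failing, the dropped-connection lines above can be tallied per peer and per reason. This is only a minimal sketch: the function name tally_rpc_drops and the regular expression are my own assumptions, and the pattern covers only the "rpc - client ... server connection dropped" lines in the format shown in this excerpt.

```python
import re
from collections import Counter

# Matches lines of the form seen in the excerpt, e.g.
#   [shard 10:stre] rpc - client 192.168.10.11:56965: server connection dropped: recv: Connection timed out
# The pattern is an assumption based only on those lines.
RPC_DROP = re.compile(
    r"\[shard\s+(?P<shard>\d+):\w+\]\s+rpc - client (?P<peer>[\d.]+):\d+: "
    r"server connection dropped: recv: (?P<reason>.+)$"
)


def tally_rpc_drops(lines):
    """Count dropped RPC connections per (peer address, reason)."""
    counts = Counter()
    for line in lines:
        match = RPC_DROP.search(line.strip())
        if match:
            counts[(match.group("peer"), match.group("reason"))] += 1
    return counts


# Two lines copied from the log excerpt, used here as sample input.
sample = [
    "[shard 10:stre] rpc - client 192.168.10.11:56965: server connection dropped: recv: Connection timed out",
    "[shard 14:stre] rpc - client 192.168.10.11:55484: server connection dropped: recv: Connection reset by peer",
]

for (peer, reason), count in tally_rpc_drops(sample).most_common():
    print(f"{peer}\t{reason}\t{count}")
```

The same function can be fed the full journalctl output line by line instead of the two sample lines. Lines that do not match the pattern (the repair and gossip messages) are simply skipped.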