Hi,
Is there any way to add a new node to the cluster even when one of its configured seeds isn’t reachable but the rest are? e.g. a typo…
Let’s try it out…
% cat unreachable-seeds.yml
version: "3"
services:
  scylla-node1:
    container_name: node1
    image: scylladb/scylla:5.2.10
    restart: always
    command: --seeds=node1 --smp 1 --memory 1G --overprovisioned 1 --api-address 0.0.0.0 --endpoint-snitch GossipingPropertyFileSnitch
    networks:
      - web
    volumes:
      - ./cassandra-rackdc.properties:/etc/scylla/cassandra-rackdc.properties
    healthcheck:
      test: ["CMD-SHELL", "sh -c $(curl -s -X GET --header 'Accept: application/json' 'http://localhost:10000/storage_service/native_transport')"]
      interval: 30s
      timeout: 10s
      retries: 5
  scylla-node2:
    container_name: node2
    image: scylladb/scylla:5.2.10
    restart: always
    command: --seeds=node1 --smp 1 --memory 1G --overprovisioned 1 --api-address 0.0.0.0 --endpoint-snitch GossipingPropertyFileSnitch
    networks:
      - web
    volumes:
      - ./cassandra-rackdc.properties:/etc/scylla/cassandra-rackdc.properties
    healthcheck:
      test: ["CMD-SHELL", "sh -c $(curl -s -X GET --header 'Accept: application/json' 'http://localhost:10000/storage_service/native_transport')"]
      interval: 30s
      timeout: 10s
      retries: 5
    depends_on:
      scylla-node1:
        condition: service_healthy
  scylla-node3:
    container_name: node3
    image: scylladb/scylla:5.2.10
    restart: always
    command: --seeds=172.17.0.155,172.22.0.156,node2,172.21.0.50,172.21.0.1 --smp 1 --memory 1G --overprovisioned 1 --api-address 0.0.0.0 --endpoint-snitch GossipingPropertyFileSnitch
    networks:
      - web
    volumes:
      - ./cassandra-rackdc.properties:/etc/scylla/cassandra-rackdc.properties
    healthcheck:
      test: ["CMD-SHELL", "sh -c $(curl -s -X GET --header 'Accept: application/json' 'http://localhost:10000/storage_service/native_transport')"]
      interval: 30s
      timeout: 10s
      retries: 5
    depends_on:
      scylla-node2:
        condition: service_healthy
networks:
  web:
    driver: bridge
In the above configuration, scylla-node3 has 4 invalid seeds: two on other, unreachable subnets (172.17.0.155 and 172.22.0.156), one unreachable address on the local subnet (172.21.0.50), and one that is the local network’s gateway (172.21.0.1); node2 is the only valid entry. Note we haven’t specified an invalid FQDN here, as that would fail the DNS lookup.
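As an aside, the difference between the two failure modes can be illustrated with a small hypothetical Python helper (not part of Scylla): an unresolvable hostname fails immediately at the DNS lookup, while an IP that resolves (or is already numeric) only fails later, when a connection is attempted. Port 7000 is assumed here as the inter-node port.

```python
import socket

def classify_seed(seed, port=7000, timeout=1.0):
    """Classify a seed as 'unresolvable', 'unreachable', or 'reachable'."""
    try:
        # DNS resolution happens first; an invalid FQDN fails right here.
        addr = socket.getaddrinfo(seed, port)[0][4][0]
    except socket.gaierror:
        return "unresolvable"
    # A numeric or resolvable address only fails when we try to connect.
    with socket.socket() as s:
        s.settimeout(timeout)
        try:
            s.connect((addr, port))
            return "reachable"
        except OSError:
            return "unreachable"
```

So a seed list mixing bad IPs and good hostnames can still be tried entry by entry, whereas a single bad hostname trips the lookup up front.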
% docker compose -f unreachable-seeds.yml up -d
[+] Running 2/3
⠿ Container node1 Healthy 32.0s
⠿ Container node2 Healthy 152.7s
⠿ Container node3 Started 153.0s
Then:
% docker exec -it node3 nodetool status
Datacenter: dc
==============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 172.21.0.4 624 KB 256 ? 1c0dff39-e5f2-4960-98c4-e1838f6cf673 r1
UN 172.21.0.3 ? 256 ? 67088527-d288-4ee3-8e0c-7f1af9258669 r1
UN 172.21.0.2 260 KB 256 ? acaf2b17-c66a-4034-a083-644a775a931a r1
Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless
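If you want to verify the join programmatically (e.g. in a CI check), a quick sketch is to count the rows flagged Up/Normal (UN) in the `nodetool status` output. This is a hypothetical helper, not a nodetool feature:

```python
def up_normal_count(status_output: str) -> int:
    """Count nodes reported as Up/Normal ('UN') in `nodetool status` output."""
    return sum(1 for line in status_output.splitlines()
               if line.strip().startswith("UN "))

# Feed it the captured stdout of:
#   docker exec node3 nodetool status
# and assert the expected cluster size, e.g. up_normal_count(out) == 3
```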
Hope it helps!
That’s great! When I originally tried it, I failed on a DNS lookup, which wasn’t obvious from the logs.
Thanks!
I will quickly note, for anyone visiting this topic in the future, that this is still a problem in the context of the first node of the cluster.
See this issue.