Issues running with IPv6

I’m trying to run ScyllaDB on fly.io and I’m running into issues around IPv6.

I have the following custom Dockerfile:

FROM scylladb/scylla:5.2

ADD ./cassandra-rackdc.properties.dc1 /etc/scylla/cassandra-rackdc.properties

ENTRYPOINT /docker-entrypoint.py --seeds "${SEEDS}" --smp "${SMP}" --memory "${MEM}" --overprovisioned "${OVERPROV}" --api-address "0.0.0.0" --listen-address "${ADV_ADDR}" --rpc-address "${ADV_ADDR}" --alternator-address "${ADV_ADDR}"

And I am using the following request to create machines via the Machines API:

POST https://api.machines.dev/v1/apps/scylla-db/machines/4d891407f20787
Content-Type: application/json
Authorization: Bearer xxxx

{
  "name": "scylla-1",
  "config": {
    "image": "registry.fly.io/scylla-db:5.2v6",
    "size": "performance-2x",
    "env": {
      "SEEDS": "32874e1dc11785.vm.scylla-db.internal,4d891407f20787.vm.scylla-db.internal",
      "SMP": "2",
      "MEM": "2G",
      "OVERPROV": "1",
      "API_ADDR": "0.0.0.0",
      "ADV_ADDR": "4d891407f20787.vm.scylla-db.internal"
    },
    "services": [
      {
        "ports": [
          {
            "port": 9160
          }
        ],
        "protocol": "tcp",
        "internal_port": 9160
      },
      {
        "ports": [
          {
            "port": 9042
          }
        ],
        "protocol": "tcp",
        "internal_port": 9042
      },
      {
        "ports": [
          {
            "port": 10000
          }
        ],
        "protocol": "tcp",
        "internal_port": 10000
      },
      {
        "ports": [
          {
            "port": 9142
          }
        ],
        "protocol": "tcp",
        "internal_port": 9142
      }
    ]
  }
}

Finally, when I run netstat -l on the machine, I see:

root@32874e1dc11785:/# netstat -l
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 localhost:9001          0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp6       0      0 [::]:33047              [::]:*                  LISTEN
tcp6       0      0 localhost:7199          [::]:*                  LISTEN
tcp6       0      0 [::]:9100               [::]:*                  LISTEN
tcp6       0      0 fly-local-6pn:22        [::]:*                  LISTEN
udp        0      0 0.0.0.0:39212           0.0.0.0:*
Active UNIX domain sockets (only servers)
Proto RefCnt Flags       Type       State         I-Node   Path

nodetool has issues as well:

root@32874e1dc11785:/# nodetool status
nodetool: Unable to connect to Scylla API server: java.net.ConnectException: Connection refused (Connection refused)
See 'nodetool help' or 'nodetool help <command>'.
root@32874e1dc11785:/# nodetool -h "::1" status
nodetool: Failed to connect to '::1:7199' - ConnectException: 'Connection refused (Connection refused)'.
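
Since nodetool goes through the JMX bridge on 7199, which in turn talks to Scylla’s REST API on port 10000, the API can be probed directly to take nodetool and JMX out of the picture. A hypothetical check (assuming curl is available in the image; this is the same URL the housekeeping script fails on in the traceback below):

curl http://localhost:10000/storage_service/scylla_release_version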

Related logs:

2023-05-20T19:51:50Z app[32874e1dc11785] ewr [info]Starting init (commit: 15f6405)...
2023-05-20T19:51:50Z app[32874e1dc11785] ewr [info]Preparing to run: `/bin/sh -c /docker-entrypoint.py --seeds "${SEEDS}" --smp "${SMP}" --memory "${MEM}" --overprovisioned "${OVERPROV}" --api-address "0.0.0.0" --listen-address "${ADV_ADDR}" --rpc-address "${ADV_ADDR}" --alternator-address "${ADV_ADDR}"` as root
2023-05-20T19:51:50Z app[32874e1dc11785] ewr [info]2023/05/20 19:51:50 listening on [fdaa:1:e6d1:a7b:15f:f165:d3d4:2]:22 (DNS: [fdaa::3]:53)
2023-05-20T19:51:51Z app[32874e1dc11785] ewr [info]running: (['/opt/scylladb/scripts/scylla_dev_mode_setup', '--developer-mode', '1'],)
2023-05-20T19:51:51Z app[32874e1dc11785] ewr [info]running: (['/opt/scylladb/scripts/scylla_io_setup'],)
2023-05-20T19:51:51Z app[32874e1dc11785] ewr [info]2023-05-20 19:51:51,667 CRIT Supervisor is running as root.  Privileges were not dropped because no user is specified in the config file.  If you intend to run as root, you can set user=root in the config file to avoid this message.
2023-05-20T19:51:51Z app[32874e1dc11785] ewr [info]2023-05-20 19:51:51,667 INFO Included extra file "/etc/supervisord.conf.d/rsyslog.conf" during parsing
2023-05-20T19:51:51Z app[32874e1dc11785] ewr [info]2023-05-20 19:51:51,667 INFO Included extra file "/etc/supervisord.conf.d/scylla-housekeeping.conf" during parsing
2023-05-20T19:51:51Z app[32874e1dc11785] ewr [info]2023-05-20 19:51:51,667 INFO Included extra file "/etc/supervisord.conf.d/scylla-jmx.conf" during parsing
2023-05-20T19:51:51Z app[32874e1dc11785] ewr [info]2023-05-20 19:51:51,667 INFO Included extra file "/etc/supervisord.conf.d/scylla-node-exporter.conf" during parsing
2023-05-20T19:51:51Z app[32874e1dc11785] ewr [info]2023-05-20 19:51:51,668 INFO Included extra file "/etc/supervisord.conf.d/scylla-server.conf" during parsing
2023-05-20T19:51:51Z app[32874e1dc11785] ewr [info]2023-05-20 19:51:51,668 INFO Included extra file "/etc/supervisord.conf.d/sshd-server.conf" during parsing
2023-05-20T19:51:51Z app[32874e1dc11785] ewr [info]2023-05-20 19:51:51,672 INFO RPC interface 'supervisor' initialized
2023-05-20T19:51:51Z app[32874e1dc11785] ewr [info]2023-05-20 19:51:51,673 CRIT Server 'inet_http_server' running without any HTTP authentication checking
2023-05-20T19:51:51Z app[32874e1dc11785] ewr [info]2023-05-20 19:51:51,673 INFO supervisord started with pid 549
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]2023-05-20 19:51:52,676 INFO spawned: 'rsyslog' with pid 551
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]2023-05-20 19:51:52,678 INFO spawned: 'scylla' with pid 552
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]2023-05-20 19:51:52,680 INFO spawned: 'scylla-housekeeping' with pid 553
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]2023-05-20 19:51:52,682 INFO spawned: 'scylla-jmx' with pid 555
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]2023-05-20 19:51:52,684 INFO spawned: 'scylla-node-exporter' with pid 556
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]2023-05-20 19:51:52,686 INFO spawned: 'sshd' with pid 558
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.773Z caller=node_exporter.go:182 level=info msg="Starting node_exporter" version="(version=1.4.0, branch=HEAD, revision=7da1321761b3b8dfc9e496e1a60e6a476fec6018)"
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.774Z caller=node_exporter.go:183 level=info msg="Build context" build_context="(go=go1.19.1, user=root@83d90983e89c, date=20220926-12:32:56)"
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.774Z caller=node_exporter.go:185 level=warn msg="Node Exporter is running as root user. This exporter is designed to run as unprivileged user, root is not required."
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.775Z caller=filesystem_common.go:111 level=info collector=filesystem msg="Parsed flag --collector.filesystem.mount-points-exclude" flag=^/(dev|proc|run/credentials/.+|sys|var/lib/docker/.+|var/lib/containers/storage/.+)($|/)
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.775Z caller=filesystem_common.go:113 level=info collector=filesystem msg="Parsed flag --collector.filesystem.fs-types-exclude" flag=^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.775Z caller=diskstats_common.go:100 level=info collector=diskstats msg="Parsed flag --collector.diskstats.device-exclude" flag=^(ram|loop|fd|(h|s|v|xv)d[a-z]|nvme\d+n\d+p)\d+$
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=diskstats_linux.go:264 level=error collector=diskstats msg="Failed to open directory, disabling udev device properties" path=/run/udev/data
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:108 level=info msg="Enabled collectors"
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=arp
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=bcache
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=bonding
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=btrfs
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=conntrack
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=cpu
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=cpufreq
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=diskstats
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=dmi
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=edac
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=entropy
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=fibrechannel
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=filefd
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=filesystem
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=hwmon
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=infiniband
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=interrupts
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=ipvs
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=loadavg
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=mdadm
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=meminfo
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=netclass
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=netdev
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=netstat
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=nfs
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=nfsd
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=nvme
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=os
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=powersupplyclass
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=pressure
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=rapl
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=schedstat
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=selinux
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=sockstat
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=softnet
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=stat
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=tapestats
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=textfile
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=thermal_zone
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=time
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=timex
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=udp_queues
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=uname
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=vmstat
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=xfs
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:115 level=info collector=zfs
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.776Z caller=node_exporter.go:199 level=info msg="Listening on" address=:9100
2023-05-20T19:51:52Z app[32874e1dc11785] ewr [info]ts=2023-05-20T19:51:52.777Z caller=tls_config.go:195 level=info msg="TLS is disabled." http2=false
2023-05-20T19:51:53Z app[32874e1dc11785] ewr [info]Scylla version 5.2.1-0.20230508.f1c45553bc29 with build-id 88ac66b1719cc7c5b7e982aa34ba5dc95909b84a starting ...
2023-05-20T19:51:53Z app[32874e1dc11785] ewr [info]command used: "/usr/bin/scylla --log-to-syslog 0 --log-to-stdout 1 --default-log-level info --network-stack posix --developer-mode=1 --memory 2G --smp 2 --overprovisioned --listen-address 32874e1dc11785.vm.scylla-db.internal --rpc-address 32874e1dc11785.vm.scylla-db.internal --seed-provider-parameters seeds=32874e1dc11785.vm.scylla-db.internal,4d891407f20787.vm.scylla-db.internal --api-address 0.0.0.0 --alternator-address 32874e1dc11785.vm.scylla-db.internal --blocked-reactor-notify-ms 999999999"
2023-05-20T19:51:53Z app[32874e1dc11785] ewr [info]parsed command line options: [log-to-syslog, (positional) 0, log-to-stdout, (positional) 1, default-log-level, (positional) info, network-stack, (positional) posix, developer-mode: 1, memory, (positional) 2G, smp, (positional) 2, overprovisioned, listen-address: 32874e1dc11785.vm.scylla-db.internal, rpc-address: 32874e1dc11785.vm.scylla-db.internal, seed-provider-parameters: seeds=32874e1dc11785.vm.scylla-db.internal,4d891407f20787.vm.scylla-db.internal, api-address: 0.0.0.0, alternator-address: 32874e1dc11785.vm.scylla-db.internal, blocked-reactor-notify-ms, (positional) 999999999]
2023-05-20T19:51:53Z app[32874e1dc11785] ewr [info]WARN  2023-05-20 19:51:53,147 seastar - Requested AIO slots too large, please increase request capacity in /proc/sys/fs/aio-max-nr. available:65536 requested:102052
2023-05-20T19:51:53Z app[32874e1dc11785] ewr [info]WARN  2023-05-20 19:51:53,148 seastar - max-networking-io-control-blocks adjusted from 50000 to 31742, since AIO slots are unavailable
2023-05-20T19:51:53Z app[32874e1dc11785] ewr [info]INFO  2023-05-20 19:51:53,148 seastar - Reactor backend: linux-aio
2023-05-20T19:51:53Z app[32874e1dc11785] ewr [info]INFO  2023-05-20 19:51:53,173 [shard 0] seastar - Created fair group io-queue-0, capacity rate 2147483:2147483, limit 12582912, rate 16777216 (factor 1), threshold 2000
2023-05-20T19:51:53Z app[32874e1dc11785] ewr [info]INFO  2023-05-20 19:51:53,175 [shard 0] seastar - IO queue uses 0.75ms latency goal for device 0
2023-05-20T19:51:53Z app[32874e1dc11785] ewr [info]INFO  2023-05-20 19:51:53,175 [shard 0] seastar - Created io group dev(0), length limit 4194304:4194304, rate 2147483647:2147483647
2023-05-20T19:51:53Z app[32874e1dc11785] ewr [info]INFO  2023-05-20 19:51:53,175 [shard 0] seastar - Created io queue dev(0) capacities: 512:2000:2000 1024:3000:3000 2048:5000:5000 4096:9000:9000 8192:17000:17000 16384:33000:33000 32768:65000:65000 65536:129000:129000 131072:257000:257000
2023-05-20T19:51:53Z app[32874e1dc11785] ewr [info]INFO  2023-05-20 19:51:53,178 [shard 0] seastar - updated: blocked-reactor-notify-ms=1000000
2023-05-20T19:51:53Z app[32874e1dc11785] ewr [info]INFO  2023-05-20 19:51:53,179 [shard 1] seastar - updated: blocked-reactor-notify-ms=1000000
2023-05-20T19:51:53Z app[32874e1dc11785] ewr [info]INFO  2023-05-20 19:51:53,183 [shard 0] init - installing SIGHUP handler
2023-05-20T19:51:53Z app[32874e1dc11785] ewr [info]INFO  2023-05-20 19:51:53,194 [shard 0] init - Scylla version 5.2.1-0.20230508.f1c45553bc29 with build-id 88ac66b1719cc7c5b7e982aa34ba5dc95909b84a starting ...
2023-05-20T19:51:53Z app[32874e1dc11785] ewr [info]WARN  2023-05-20 19:51:53,194 [shard 0] init - Only 998 MiB per shard; this is below the recommended minimum of 1 GiB/shard; continuing since running in developer mode
2023-05-20T19:51:53Z app[32874e1dc11785] ewr [info]WARN  2023-05-20 19:51:53,194 [shard 0] init - I/O Scheduler is not properly configured! This is a non-supported setup, and performance is expected to be unpredictably bad.
2023-05-20T19:51:53Z app[32874e1dc11785] ewr [info] Reason found: none of --max-io-requests, --io-properties and --io-properties-file are set.
2023-05-20T19:51:53Z app[32874e1dc11785] ewr [info]To properly configure the I/O Scheduler, run the scylla_io_setup utility shipped with Scylla.
2023-05-20T19:51:53Z app[32874e1dc11785] ewr [info]Connecting to http://localhost:10000
2023-05-20T19:51:53Z app[32874e1dc11785] ewr [info]Starting the JMX server
2023-05-20T19:51:53Z app[32874e1dc11785] ewr [info]JMX is enabled to receive remote connections on port: 7199
2023-05-20T19:51:54Z app[32874e1dc11785] ewr [info]2023-05-20 19:51:54,642 INFO success: rsyslog entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-05-20T19:51:54Z app[32874e1dc11785] ewr [info]2023-05-20 19:51:54,642 INFO success: scylla entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-05-20T19:51:54Z app[32874e1dc11785] ewr [info]2023-05-20 19:51:54,642 INFO success: scylla-housekeeping entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-05-20T19:51:54Z app[32874e1dc11785] ewr [info]2023-05-20 19:51:54,642 INFO success: scylla-jmx entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-05-20T19:51:54Z app[32874e1dc11785] ewr [info]2023-05-20 19:51:54,642 INFO success: scylla-node-exporter entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-05-20T19:51:54Z app[32874e1dc11785] ewr [info]2023-05-20 19:51:54,642 INFO success: sshd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-05-20T19:51:58Z app[32874e1dc11785] ewr [info]Traceback (most recent call last):
2023-05-20T19:51:58Z app[32874e1dc11785] ewr [info]  File "/opt/scylladb/scripts/libexec/scylla-housekeeping", line 196, in <module>
2023-05-20T19:51:58Z app[32874e1dc11785] ewr [info]    args.func(args)
2023-05-20T19:51:58Z app[32874e1dc11785] ewr [info]  File "/opt/scylladb/scripts/libexec/scylla-housekeeping", line 122, in check_version
2023-05-20T19:51:58Z app[32874e1dc11785] ewr [info]    current_version = sanitize_version(get_api('/storage_service/scylla_release_version'))
2023-05-20T19:51:58Z app[32874e1dc11785] ewr [info]                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-05-20T19:51:58Z app[32874e1dc11785] ewr [info]  File "/opt/scylladb/scripts/libexec/scylla-housekeeping", line 80, in get_api
2023-05-20T19:51:58Z app[32874e1dc11785] ewr [info]    return get_json_from_url("http://" + api_address + path)
2023-05-20T19:51:58Z app[32874e1dc11785] ewr [info]           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-05-20T19:51:58Z app[32874e1dc11785] ewr [info]  File "/opt/scylladb/scripts/libexec/scylla-housekeeping", line 75, in get_json_from_url
2023-05-20T19:51:58Z app[32874e1dc11785] ewr [info]    raise RuntimeError(f'Failed to get "{path}" due to the following error: {retval}')
2023-05-20T19:51:58Z app[32874e1dc11785] ewr [info]RuntimeError: Failed to get "http://localhost:10000/storage_service/scylla_release_version" due to the following error: <urlopen error [Errno 111] Connection refused>

EDIT:

One specific line of concern is this startup failure:

ERROR 2023-05-21 11:37:09,295 [shard 0] init - Startup failed: std::_Nested_exception<std::runtime_error> (Couldn't resolve listen_address): std::system_error (error C-Ares:12, 32874e1dc11785.vm.scylla-db.internal: Timeout)

I have added a scylla.yaml with enable_ipv6_dns_lookup: true and an ADD ./scylla.yaml /etc/scylla/scylla.yaml line to my Docker build, but I still see this issue. The address resolves fine if I SSH into the node.
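
For reference, the relevant scylla.yaml addition is just this one line (a minimal excerpt; everything else is left at the image defaults):

# scylla.yaml excerpt: allow hostnames to resolve to AAAA (IPv6) records
enable_ipv6_dns_lookup: true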

I also notice in the netstat output that neither port 9042 nor port 10000 is listening.

Updated Dockerfile:

FROM scylladb/scylla:5.2

ADD ./cassandra-rackdc.properties.dc1 /etc/scylla/cassandra-rackdc.properties
ADD ./scylla.yaml /etc/scylla/scylla.yaml

ENTRYPOINT /docker-entrypoint.py --seeds "${SEEDS}" --smp "${SMP}" --memory "${MEM}" --overprovisioned "${OVERPROV}" --api-address "${API_ADDR}" --listen-address "${API_ADDR}" --rpc-address "${API_ADDR}" --alternator-address "${API_ADDR}" --broadcast-address "${ADV_ADDR}" --broadcast-rpc-address "${ADV_ADDR}"

I believe I solved the issue!!!

The issue was that Scylla tries to resolve the 32874e1dc11785.vm.scylla-db.internal and 4d891407f20787.vm.scylla-db.internal records as IPv4 (A records), while fly.io’s .internal names only have IPv6 (AAAA) records. When I put the IPv6 addresses in directly, it worked just fine for one node (scylla-1).
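
In principle the IPv6 literals don’t need to be hardcoded: the libc resolver does perform AAAA lookups even though Scylla’s C-Ares resolver apparently only asks for A records. A rough wrapper-script sketch (hypothetical, not what I’m running verbatim; assumes getent is available in the image, and SEEDS would need the same treatment):

#!/bin/sh
# Resolve the machine's .internal name to its 6PN IPv6 literal via libc
# (getent ahostsv6 does AAAA lookups), then hand the literal to the
# entrypoint so Scylla never has to resolve the hostname itself.
ADV_IP="$(getent ahostsv6 "${ADV_ADDR}" | awk 'NR==1 {print $1}')"
exec /docker-entrypoint.py \
  --seeds "${SEEDS}" --smp "${SMP}" --memory "${MEM}" \
  --overprovisioned "${OVERPROV}" --api-address "${API_ADDR}" \
  --listen-address "${ADV_IP}" --rpc-address "${ADV_IP}" \
  --alternator-address "${ADV_IP}" \
  --broadcast-address "${ADV_IP}" --broadcast-rpc-address "${ADV_IP}"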

However, scylla-0 gets stuck in a loop, repeating these logs:

2023-05-24T02:03:42Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:03:42,364 [shard 0] compaction - [Compact system.local 35297e30-f9d7-11ed-918a-eac84b12b519] Compacted 2 sstables to [/var/lib/scylla/data/system/local-7ad54392bcdd35a684174e047860b377/me-14-big-Data.db:level=0]. 81kB to 40kB (~50% of original) in 6ms = 6MB/s. ~256 total partitions merged to 1.
2023-05-24T02:04:19Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:19,296 [shard 0] gossip - No gossip backlog; proceeding
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,330 [shard 0] init - Shutting down group 0 service
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,330 [shard 0] init - Shutting down group 0 service was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,330 [shard 0] init - Shutting down storage service notifications
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,330 [shard 0] init - Shutting down storage service notifications was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,330 [shard 0] init - Shutting down system distributed keyspace
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,330 [shard 0] init - Shutting down system distributed keyspace was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,330 [shard 0] init - Shutting down migration manager notifications
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,331 [shard 0] init - Shutting down migration manager notifications was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,331 [shard 0] init - Shutting down storage service notifications
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,331 [shard 0] init - Shutting down storage service notifications was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,331 [shard 0] init - Shutting down sstables loader API
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,331 [shard 0] init - Shutting down sstables loader API was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,331 [shard 0] init - Shutting down sstables loader
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,331 [shard 0] init - Shutting down sstables loader was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,331 [shard 0] init - Shutting down repair API
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,331 [shard 0] init - Shutting down repair API was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,331 [shard 0] init - Shutting down messaging service API
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,331 [shard 0] init - Shutting down messaging service API was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,331 [shard 0] init - Shutting down storage service messaging
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,332 [shard 0] init - Shutting down storage service messaging was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,332 [shard 0] init - Shutting down cdc log service
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,332 [shard 0] init - Shutting down cdc log service was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,332 [shard 0] init - Shutting down CDC Generation Management service
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,333 [shard 0] init - Shutting down CDC Generation Management service was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,333 [shard 0] init - Shutting down repair service
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,333 [shard 0] task_manager - Stoppping module repair
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,333 [shard 0] task_manager - Unregistered module repair
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,336 [shard 1] task_manager - Stoppping module repair
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,336 [shard 1] task_manager - Unregistered module repair
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,337 [shard 0] init - Shutting down repair service was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,337 [shard 0] init - Shutting down drain storage proxy
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,337 [shard 0] hints_manager - Asked to stop
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,337 [shard 0] hints_manager - Stopped
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,337 [shard 0] hints_manager - Asked to stop
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,337 [shard 0] hints_manager - Stopped
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,337 [shard 1] hints_manager - Asked to stop
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,337 [shard 1] hints_manager - Stopped
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,337 [shard 1] hints_manager - Asked to stop
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,337 [shard 1] hints_manager - Stopped
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,337 [shard 0] init - Shutting down drain storage proxy was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,337 [shard 0] init - Shutting down stream manager api
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,338 [shard 0] init - Shutting down stream manager api was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,338 [shard 0] init - Shutting down stream manager
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,338 [shard 0] init - Shutting down stream manager was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,338 [shard 0] init - Shutting down storage proxy RPC verbs
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,338 [shard 0] init - Shutting down storage proxy RPC verbs was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,338 [shard 0] init - Shutting down snitch API
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,339 [shard 0] init - Shutting down snitch API was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,339 [shard 0] init - Shutting down sstables format selector
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,339 [shard 0] init - Shutting down sstables format selector was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,339 [shard 0] init - Shutting down Raft
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,339 [shard 0] init - Shutting down Raft was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,339 [shard 0] init - Shutting down migration manager
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,339 [shard 0] migration_manager - stopping migration service
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,339 [shard 1] migration_manager - stopping migration service
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,339 [shard 0] init - Shutting down migration manager was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,339 [shard 0] init - Shutting down forward service
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,339 [shard 0] init - Shutting down forward service was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,339 [shard 0] init - Shutting down database
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,374 [shard 1] large_data - Waiting for 0 background handlers
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,374 [shard 1] database - Shutting down commitlog
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,374 [shard 0] large_data - Waiting for 0 background handlers
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,374 [shard 0] database - Shutting down commitlog
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,375 [shard 1] database - Shutting down commitlog complete
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,375 [shard 0] database - Shutting down commitlog complete
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,375 [shard 0] database - Shutting down system dirty memory manager
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,375 [shard 1] database - Shutting down system dirty memory manager
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,375 [shard 0] database - Shutting down dirty memory manager
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,375 [shard 1] database - Shutting down dirty memory manager
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,375 [shard 0] database - Shutting down memtable controller
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,375 [shard 1] database - Shutting down memtable controller
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,375 [shard 1] database - Closing user sstables manager
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,375 [shard 0] database - Closing user sstables manager
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,375 [shard 0] database - Closing system sstables manager
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,375 [shard 1] database - Closing system sstables manager
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,375 [shard 0] database - Stopping querier cache
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,375 [shard 1] database - Stopping querier cache
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,375 [shard 0] database - Stopping concurrency semaphores
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,375 [shard 1] database - Stopping concurrency semaphores
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,375 [shard 1] database - Joining memtable update action
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,375 [shard 0] database - Joining memtable update action
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,375 [shard 0] init - Shutting down database was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,375 [shard 0] compaction_manager - Asked to stop
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,376 [shard 0] compaction_manager - Stopped
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,376 [shard 1] compaction_manager - Asked to stop
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,376 [shard 1] compaction_manager - Stopped
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,376 [shard 0] init - Shutting down task_manager
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,376 [shard 0] init - Shutting down task_manager was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,376 [shard 0] init - Shutting down service_memory_limiter
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,376 [shard 0] init - Shutting down service_memory_limiter was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,376 [shard 0] init - Shutting down sst_dir_semaphore
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,376 [shard 0] init - Shutting down sst_dir_semaphore was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,376 [shard 0] init - Shutting down storage_service
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,376 [shard 0] storage_service - Stopped node_ops_abort_thread
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,376 [shard 1] storage_service - Stopped node_ops_abort_thread
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,376 [shard 0] init - Shutting down storage_service was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,376 [shard 0] init - Shutting down direct_failure_detector
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,376 [shard 0] init - Shutting down direct_failure_detector was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,376 [shard 0] init - Shutting down fd_pinger
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,376 [shard 0] init - Shutting down fd_pinger was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,376 [shard 0] init - Shutting down raft_address_map
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,376 [shard 0] init - Shutting down raft_address_map was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,376 [shard 0] init - Shutting down gossiper
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,376 [shard 0] gossip - My status = UNKNOWN
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]WARN  2023-05-24 02:04:24,376 [shard 0] gossip - No local state or state is in silent shutdown, not announcing shutdown
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,376 [shard 0] gossip - Disable and wait for gossip loop started
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,749 [shard 0] gossip - Gossip is now stopped
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,749 [shard 0] init - Shutting down gossiper was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,749 [shard 0] init - Shutting down messaging service
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,749 [shard 0] messaging_service - Stopping nontls server
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,749 [shard 0] messaging_service - Stopping tls server
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,749 [shard 0] messaging_service - Stopping tls server - Done
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,749 [shard 0] messaging_service - Stopping client for address: fdaa:1:e6d1:a7b:15e:46c7:cdde:2:0
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,749 [shard 0] messaging_service - Stopping client for address: fdaa:1:e6d1:a7b:15e:46c7:cdde:2:0 - Done
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,749 [shard 0] messaging_service - Stopping nontls server - Done
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,749 [shard 1] messaging_service - Stopping nontls server
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,749 [shard 1] messaging_service - Stopping tls server
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,749 [shard 1] messaging_service - Stopping tls server - Done
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,749 [shard 1] messaging_service - Stopping nontls server - Done
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,749 [shard 0] init - Shutting down messaging service was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,749 [shard 0] init - Shutting down service level controller
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,749 [shard 0] service_level_controller - update_from_distributed_data: configuration polling loop aborted
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,749 [shard 0] init - Shutting down service level controller was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,749 [shard 0] init - Shutting down API server
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,750 [shard 0] init - Shutting down API server was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,750 [shard 0] init - Shutting down tracing instance
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,750 [shard 0] init - Shutting down tracing instance was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,750 [shard 0] init - Shutting down migration manager notifier
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,750 [shard 0] init - Shutting down migration manager notifier was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,750 [shard 0] init - Shutting down prometheus API server
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,750 [shard 0] init - Shutting down prometheus API server was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,751 [shard 0] init - Shutting down sighup
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]INFO  2023-05-24 02:04:24,751 [shard 0] init - Shutting down sighup was successful
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]ERROR 2023-05-24 02:04:24,751 [shard 0] init - Startup failed: std::runtime_error (Failed to learn about other nodes' tokens during bootstrap. Make sure that:
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info] - the node can contact other nodes in the cluster,
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info] - the `ring_delay` parameter is large enough (the 30s default should be enough for small-to-middle-sized clusters),
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info] - a node with this IP didn't recently leave the cluster. If it did, wait for some time first (the IP is quarantined),
2023-05-24T02:04:24Z app[32874e1dc11785] ewr [info]and retry the bootstrap.)

It seems like scylla-0 gets stuck right after compacting (weird, since there is no data on the node), while the scylla-1 node is able to get past this point:

2023-05-24T02:03:44Z app[4d891407f20787] ewr [info]INFO  2023-05-24 02:03:44,258 [shard 0] compaction - [Compact system_schema.dropped_columns 364a5780-f9d7-11ed-8272-d8525d9b0553] Compacted 10 sstables to [/var/lib/scylla/data/system_schema/dropped_columns-5e7583b5f3f43af19a39b7e1d6f5f11f/me-22-big-Data.db:level=0]. 409kB to 40kB (~10% of original) in 6ms = 6MB/s. ~1280 total partitions merged to 6.
2023-05-24T02:03:56Z app[4d891407f20787] ewr [info]INFO  2023-05-24 02:03:56,183 [shard 0] gossip - No gossip backlog; proceeding
2023-05-24T02:03:56Z app[4d891407f20787] ewr [info]INFO  2023-05-24 02:03:56,183 [shard 0] init - allow replaying hints
2023-05-24T02:03:56Z app[4d891407f20787] ewr [info]INFO  2023-05-24 02:03:56,184 [shard 0] init - Launching generate_mv_updates for non system tables
2023-05-24T02:03:56Z app[4d891407f20787] ewr [info]INFO  2023-05-24 02:03:56,184 [shard 0] init - starting the view builder
2023-05-24T02:03:56Z app[4d891407f20787] ewr [info]INFO  2023-05-24 02:03:56,186 [shard 0] init - starting native transport
2023-05-24T02:03:56Z app[4d891407f20787] ewr [info]INFO  2023-05-24 02:03:56,187 [shard 0] cql_server_controller - Starting listening for CQL clients on 0.0.0.0:9042 (unencrypted, non-shard-aware)
2023-05-24T02:03:56Z app[4d891407f20787] ewr [info]INFO  2023-05-24 02:03:56,187 [shard 0] cql_server_controller - Starting listening for CQL clients on 0.0.0.0:19042 (unencrypted, shard-aware)
2023-05-24T02:03:56Z app[4d891407f20787] ewr [info]INFO  2023-05-24 02:03:56,187 [shard 0] init - serving
2023-05-24T02:03:56Z app[4d891407f20787] ewr [info]INFO  2023-05-24 02:03:56,188 [shard 0] init - Scylla version 5.2.1-0.20230508.f1c45553bc29 initialization completed.

This is especially strange, as these two nodes are configured identically.

Here is more detailed log output (I tried to clean up the prefixes as best I could): https://pastebin.com/raw/Se8fz13Y

It seems like the nodes diverge after the "gossip - No gossip backlog; proceeding" line: the scylla-1 node starts serving, whereas scylla-0 shuts down.

I see '--listen-address 0.0.0.0 --rpc-address 0.0.0.0', which looks like a mixture of IPv4 and IPv6?

Seems like that was it! Just needed a second pair of eyes, tysm!
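
For anyone else who hits this: the takeaway is that every address Scylla binds or advertises has to be IPv6 on fly.io’s private network. A sketch of what the machine env ends up looking like (the 6PN literals here are just the ones visible in the logs above and are machine-specific; "::" is the IPv6 wildcard replacing the IPv4-only 0.0.0.0 — my reading of the fix, so treat it as an assumption):

    "env": {
      "SEEDS": "fdaa:1:e6d1:a7b:15f:f165:d3d4:2,fdaa:1:e6d1:a7b:15e:46c7:cdde:2",
      "SMP": "2",
      "MEM": "2G",
      "OVERPROV": "1",
      "API_ADDR": "::",
      "ADV_ADDR": "fdaa:1:e6d1:a7b:15e:46c7:cdde:2"
    }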
