I just started upgrading a 3-node cluster from Scylla 5.1 to 5.2, but the first upgraded node does not start up. Scylla seems to hang right after the upgrade:
-- A start job for unit scylla-server.service has begun execution.
-- The job identifier is 145767327.
Aug 23 22:39:49 pcdev-1 scylla[2834247]: Scylla version 5.2.6-0.20230730.58acf071bf28 with build-id 17961be569f8503b27ff284a8de1e00a9d83811e starting …
Aug 23 22:39:49 pcdev-1 scylla[2834247]: command used: "/usr/bin/scylla --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --memory 10G --reserve-memory 52G --overprovisioned --kernel-page-cache 1 --unsafe-bypass-fsync 1 --io-properties-file=/etc/scylla.d/io_properties.yaml --developer-mode=1 --cpuset 0-3 --smp 4"
Aug 23 22:39:49 pcdev-1 scylla[2834247]: parsed command line options: [log-to-syslog, (positional) 1, log-to-stdout, (positional) 0, default-log-level, (positional) info, network-stack, (positional) posix, memory, (positional) 10G, reserve-memory, (positional) 52G, overprovisioned, kernel-page-cache, (positional) 1, unsafe-bypass-fsync, (positional) 1, io-properties-file: /etc/scylla.d/io_properties.yaml, developer-mode: 1, cpuset, (positional) 0-3, smp, (positional) 4]
Aug 23 22:39:49 pcdev-1 scylla[2834247]: seastar - Reactor backend: linux-aio
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] seastar - Creation of perf_event based stall detector failed, falling back to posix timer: std::system_error (error system:13, perf_event_open() failed: Permission denied)
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] seastar - Created fair group io-queue-66305, capacity rate 192:50000, limit 23649164, rate 16777216 (factor 1), threshold 11227761
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] seastar - IO queue uses 1.41ms latency goal for device 66305
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] seastar - Created io group dev(66305), length limit 131072:65536, rate 192000:50000000
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] seastar - Created io queue dev(66305) capacities: 512:11227761:16830904 1024:11270710:16884590 2048:11356610:16991964 4096:11528408:17206712 8192:11872006:17636210 16384:12559200:18495202 32768:13933590:20213190 65536:16682369:23649164 131072:22179928:X
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] seastar - Created fair group io-queue-0, capacity rate 2147483:2147483, limit 12582912, rate 16777216 (factor 1), threshold 2000
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] seastar - IO queue uses 0.75ms latency goal for device 0
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] seastar - Created io group dev(0), length limit 4194304:4194304, rate 2147483647:2147483647
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] seastar - Created io queue dev(0) capacities: 512:2000:2000 1024:3000:3000 2048:5000:5000 4096:9000:9000 8192:17000:17000 16384:33000:33000 32768:65000:65000 65536:129000:129000 131072:257000:257000
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 1] seastar - Creation of perf_event based stall detector failed, falling back to posix timer: std::system_error (error system:13, perf_event_open() failed: Permission denied)
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 3] seastar - Creation of perf_event based stall detector failed, falling back to posix timer: std::system_error (error system:13, perf_event_open() failed: Permission denied)
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 2] seastar - Creation of perf_event based stall detector failed, falling back to posix timer: std::system_error (error system:13, perf_event_open() failed: Permission denied)
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] seastar - updated: blocked-reactor-notify-ms=1000000
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 1] seastar - updated: blocked-reactor-notify-ms=1000000
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 3] seastar - updated: blocked-reactor-notify-ms=1000000
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 2] seastar - updated: blocked-reactor-notify-ms=1000000
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - Unknown option : max_size_of_hints_in_progress
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - installing SIGHUP handler
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - Scylla version 5.2.6-0.20230730.58acf071bf28 with build-id 17961be569f8503b27ff284a8de1e00a9d83811e starting …
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - starting prometheus API server
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - creating snitch
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - starting tokens manager
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - starting effective_replication_map factory
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - starting migration manager notifier
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - starting lifecycle notifier
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - creating tracing
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - starting API server
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - Scylla API server listening on 0.0.0.0:10000 …
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] service_level_controller - update_from_distributed_data: starting configuration polling loop
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - starting system keyspace
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - starting gossiper
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - seeds={192.168.178.101, 192.168.178.102, 192.168.178.103}, listen_address=192.168.178.103, broadcast_address=192.168.178.103
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - starting Raft address map
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - starting direct failure detector pinger service
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - starting direct failure detector service
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - initializing storage service
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] storage_service - Started node_ops_abort_thread
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 1] storage_service - Started node_ops_abort_thread
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 2] storage_service - Started node_ops_abort_thread
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 3] storage_service - Started node_ops_abort_thread
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - starting per-shard database core
Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - creating and verifying directories
CPU and disk are idle. The other two nodes are still on 5.1 and still up.
Any ideas what's wrong, or what I could check to identify the problem?
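In case it helps with narrowing this down, here is a minimal sketch for pulling the last startup step out of the journal, assuming the lines keep the syslog layout shown above (timestamp, host, `scylla[pid]:`, optional `[shard N]`, then `module - message`). The regex and the `last_step` helper are my own illustration, not part of any Scylla tooling.

```python
import re

# Matches lines such as:
#   Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - starting gossiper
# The "[shard N]" prefix is optional; lines without "module - message" are skipped.
LOG_LINE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:]+)\s+\S+\s+scylla\[\d+\]:\s+"
    r"(?:\[shard\s+(?P<shard>\d+)\]\s+)?"
    r"(?P<module>\S+)\s+-\s+(?P<msg>.*)$"
)


def last_step(lines, module="init"):
    """Return (timestamp, message) of the last line logged by `module`, or None."""
    last = None
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m and m.group("module") == module:
            last = (m.group("ts"), m.group("msg"))
    return last


# Two lines copied from the log above.
sample = [
    "Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - starting per-shard database core",
    "Aug 23 22:39:49 pcdev-1 scylla[2834247]: [shard 0] init - creating and verifying directories",
]
print(last_step(sample))
```

For the log above this points at "creating and verifying directories" as the last `init` step before the node goes quiet, which is where I would look first.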