Getting logalloc::bad_alloc on 3 node cluster

shock7221 · March 19, 2025, 10:00pm

Installation details
ScyllaDB version: 6.1.2-0.20240915.b60f9ef4c223
Cluster size: 3
OS: Ubuntu 22.04.5 LTS

2 of the 3 nodes are down with the following error. Any idea how I could recover these 2 nodes? They keep restarting.

Mar 19 21:57:47 scylla-node-01 scylla[6060]:  [shard 12:main] table - failed to write sstable /var/lib/scylla/data/<keyspace>/logs-44fb69b0ca2611efb3b9235c7d5cb03f/me-3gop_1p0b_0rv342wofndsn54rc1-big-Data.db: logalloc::bad_alloc (failed to refill emergency reserve of 30 (have 26 free segments))
Mar 19 21:57:47 scylla-node-01 scylla[6060]:  [shard 12:main] table - Memtable flush failed due to: logalloc::bad_alloc (failed to refill emergency reserve of 30 (have 26 free segments)). Will retry in 10000ms
Mar 19 21:57:47 scylla-node-01 scylla[6060]:  [shard 12:main] table - failed to write sstable /var/lib/scylla/data/<keyspace>/log_details-44619ab0ca2611efb3b9235c7d5cb03f/me-3gop_1p0b_5jbxc2wofndsn54rc1-big-Data.db: logalloc::bad_alloc (failed to refill emergency reserve of 30 (have 26 free segments))
Mar 19 21:57:47 scylla-node-01 scylla[6060]:  [shard 12:main] table - Memtable flush failed due to: logalloc::bad_alloc (failed to refill emergency reserve of 30 (have 26 free segments)). Will retry in 10000ms
Mar 19 21:57:49 scylla-node-01 scylla[6060]:  [shard 12:main] table - failed to w

Botond_Denes · March 26, 2025, 12:02pm

When you restart the nodes, they replay the commitlog, which restores the content of the memtables. Flushing those memtables is what seems to be causing this error.
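To restore your nodes, stop the node, move aside the content of the commitlog directory (by default /var/lib/scylla/commitlog/; check commitlog_directory in scylla.yaml if you changed it), then start the node.
Note: this will result in the loss of the writes contained in those commitlog files. To limit the loss, you can move aside only some of the files and start the node. If it starts successfully, stop it, put back some of the files you moved aside, and start it again. Repeat until all commitlog files have been replayed.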

In any case, make sure to repair the cluster after doing this.
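
The move-aside-and-restore loop described above can be sketched in Python roughly as follows. This is only an illustration, not an official ScyllaDB tool: the helper names `move_aside` and `restore_batch` are made up for this example, the directories you pass in are your own (the commitlog path depends on your scylla.yaml), and it assumes the node is stopped while files are being moved. It moves files back rather than copying them, so the aside directory always shows what has not been replayed yet.

```python
import shutil
from pathlib import Path


def move_aside(commitlog_dir, aside_dir):
    """Move every file out of commitlog_dir into aside_dir.

    Hypothetical helper for illustration; run it only while the node is stopped.
    Returns the moved file names in sorted order.
    """
    src, dst = Path(commitlog_dir), Path(aside_dir)
    dst.mkdir(parents=True, exist_ok=True)
    moved = []
    for f in sorted(p for p in src.iterdir() if p.is_file()):
        shutil.move(str(f), str(dst / f.name))
        moved.append(f.name)
    return moved


def restore_batch(aside_dir, commitlog_dir, count):
    """Move up to `count` files from aside_dir back into commitlog_dir.

    Hypothetical helper for illustration; stop the node first, then start it
    again afterwards so the restored files get replayed.
    Returns the restored file names in sorted order.
    """
    src, dst = Path(aside_dir), Path(commitlog_dir)
    batch = sorted(p for p in src.iterdir() if p.is_file())[:count]
    for f in batch:
        shutil.move(str(f), str(dst / f.name))
    return [f.name for f in batch]
```

A possible workflow: stop the node, call `move_aside(commitlog_dir, aside_dir)`, start the node, and confirm it stays up. Then stop it, call `restore_batch(aside_dir, commitlog_dir, 5)`, start it again, and repeat until the aside directory is empty. If the node fails after a particular batch, that batch contains the files to leave out. Make sure the restored files are still readable by the user the Scylla service runs as.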

Does your schema have collections, by any chance?
