Originally from the [User Slack](https://scylladb-users.slack.com/)
@Matheus_Salvia: some rows in my table seem to be inaccessible:
Error from server: code=1300 [Replica(s) failed to execute read] message="Operation failed for sessions.sessions_by_version - received 0 responses and 1 failures from 3 CL=ALL." info={'consistency': 'ALL', 'required_responses': 3, 'received_responses': 0, 'failures': 1}
Trying to repair the affected range immediately fails with:
Repair tablet for keyspace=sessions status=failed: std::runtime_error (repair[d29c1d5f-0b9d-43c7-8175-8829d4866c7a]: 1 out of 1 ranges failed, keyspace=sessions, tables=["sessions_by_version"], repair_reason=repair, nodes_down_during_repair={}, aborted_by_user=false, failed_because=seastar::nested_exception: std::runtime_error (Failed to repair for keyspace=sessions, cf=sessions_by_version, range=(3999196469105000447,4003700068732370943]) (while cleaning up after std::bad_alloc (std::bad_alloc)))
seems like a bug. I’m on 6.1.0. Any ideas?
@Felipe_Cardeneti_Mendes: bad_alloc is often heavy memory pressure. Hard to say it's a bug from just these logs. Large partition maybe? This is probably the failure cause (it should be printed to the coordinator node's stderr when you receive a server error via CQL)
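(For readers following along: one quick way to check Felipe's large-partition theory is the built-in system.large_cells companion table system.large_partitions, which Scylla populates when a written partition crosses the configured large-data warning threshold. A minimal sketch, assuming default thresholds:)

```cql
-- Rows appear here when an sstable write produces a partition larger
-- than the configured warning threshold; empty output means no
-- offenders have been recorded.
SELECT * FROM system.large_partitions;
```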
@Matheus_Salvia: a large partition is a possibility, this table can have some problematic ones.
did a rolling restart and it seems like it’s working now. if it happens again I’ll search for the log in the coordinator like you said
nvm, this table is tame: Compacted partition maximum bytes: 8409007
@Felipe_Cardeneti_Mendes: Collections by any chance? Either way, best to open an issue.
@Matheus_Salvia: yeah it does have arrays
@Felipe_Cardeneti_Mendes: that could be a reason. Check if there’s anything absurd on system.large_cells
@Matheus_Salvia: sessions | sessions_by_version | me-3gj6_1erf_0xfpc2mbwlidwyx848-big-Data.db | 4186749
is that absurd by scylla standards?
4 megs
@Felipe_Cardeneti_Mendes: there should be a column collection_elements with the number of elements in that collection. Collections should be reasonably small and, preferably, infrequently updated.
4 MB is reasonable (not great).
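(For reference, the check Felipe suggests looks roughly like this; a minimal sketch assuming a recent Scylla version, where column names can vary slightly between releases:)

```cql
-- Each row is a cell that exceeded the large-cell threshold;
-- collection_elements reports how many items the collection held.
SELECT keyspace_name, table_name, column_name, cell_size, collection_elements
FROM system.large_cells;
```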
@Matheus_Salvia: max elements around 10k
@Felipe_Cardeneti_Mendes: Yeah, that’s a lot.
@Matheus_Salvia: and I don’t think we do updates on these
what’s a rule-of-thumb number I should keep under?
@Felipe_Cardeneti_Mendes: > Collections are meant for storing/denormalizing a relatively small amount of data. They work well for things like “the phone numbers of a given user”, “labels applied to an email”, etc. But when items are expected to grow unbounded (“all messages sent by a user”, “events registered by a sensor”…), then collections are not appropriate, and a specific table (with clustering columns) should be used.
https://opensource.docs.scylladb.com/stable/cql/types.html#noteworthy-characteristics
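(To make that advice concrete, here is a purely illustrative sketch of the pattern the docs describe; the table and column names are invented, not Matheus's actual schema:)

```cql
-- Anti-pattern: an unbounded collection. The whole list is a single
-- cell, so every read/write materializes it, and oversized cells are
-- exactly what fragments memory.
CREATE TABLE sessions_bad (
    session_id uuid PRIMARY KEY,
    events     list<text>          -- grows without bound
);

-- Preferred: one clustering row per element. Rows can be written,
-- read, and paged individually, with no oversized cells.
CREATE TABLE session_events (
    session_id uuid,
    event_seq  int,
    event      text,
    PRIMARY KEY (session_id, event_seq)
);
```

With the second layout a client pages through rows (e.g. SELECT event FROM session_events WHERE session_id = ? LIMIT 100) instead of deserializing one 10k-element cell on every access.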
@Matheus_Salvia: so like a couple hundred?
@Felipe_Cardeneti_Mendes: yeah…
@avi: Set up metrics, look at LSA and non-LSA memory usage
@Matheus_Salvia: [attaches a memory-usage graph] the dip in the middle is the restart. what should I look for here?
lsa - Standard allocator failure, increasing head-room in section 0x607006bac680 to 1048576 [B]; trace: 0x5e8296e 0x5e82f80 0x5e83288 0x1ffc22c 0x1f9dcc2 0x1f5af0c 0x1f5b32a 0x20cd1fb 0x1fba7af 0x1fb8c7b 0x204f594 0x205c2fa 0x205baa2 0x2059984 0x1ba568b 0x598b05f 0x598c5ca 0x598d7b7 0x59b0c40 0x594da8a /opt/scylladb/libreloc/libc.so.6+0x97506 /opt/scylladb/libreloc/libc.so.6+0x11b40b
am I just OOM and need bigger nodes?
WARN 2024-09-18 22:13:08,903 [shard 2:strm] seastar_memory - oversized allocation: 1048576 bytes. This is non-fatal, but could lead to latency and/or fragmentation issues. Please report: at 0x5e8296e 0x5e82f80 0x5e83288 0x59352b2 0x5938084 0x20bcd96 0x13d424a 0x598b05f 0x598c5ca 0x598d7b7 0x59b0c40 0x594da8a /opt/scylladb/libreloc/libc.so.6+0x97506 /opt/scylladb/libreloc/libc.so.6+0x11b40b
@avi: Please decode the trace via https://backtrace.scylladb.com
Bigger nodes won’t help, this is a memory fragmentation problem
(non-LSA memory is low, which means overall memory consumption is fine)
(LSA memory = memtable + cache)
oh, the 10k element collections definitely are bad here
@Matheus_Salvia: INFO 2024-09-19 19:11:20,531 [shard 5:comp] compaction - [Compact sessions.sessions_by_version f3091cb0-76ba-11ef-ab67-5340763003bd] Compacting of 2 sstables interrupted due to: std::bad_alloc (std::bad_alloc), at 0x5e8296e 0x5e82f80 0x5e83288 0x24b7d9a 0x24b615e 0x5d5d3d6
INFO 2024-09-19 19:25:48,793 [shard 5:stmt] lsa - Standard allocator failure, increasing head-room in section 0x604003b94680 to 8388608 [B];
INFO 2024-09-19 19:25:49,839 [shard 5:stmt] lsa - Standard allocator failure, increasing head-room in section 0x604003b94680 to 16777216 [B];
here's the backtrace I got using the provided decoder
WARN 2024-09-19 19:37:18,864 [shard 0:strm] seastar_memory - oversized allocation: 8388608 bytes. This is non-fatal, but could lead to latency and/or fragmentation issues
these were all the traces I could find
these are from 6.1.1. we upgraded the cluster yesterday to check whether the problem had already been fixed
@avi: It seems related to those large collections