Last week in scylladb.git master (issue #279; 2025-05-11)

This short report brings to light some interesting commits to scylladb.git master from the last week. Commits in the 8ffe4b0308..092a88c9b9 range are covered.

There were 83 non-merge commits from 19 authors in that period. Some notable commits:


Alternator, ScyllaDB’s implementation of the DynamoDB API, now automatically retries table schema changes that can fail due to contention.
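The general idea of retrying on contention can be sketched as follows. This is a minimal illustration, not Alternator's actual code: the exception type `SchemaContentionError`, the helper `retry_on_contention`, and the backoff parameters are all hypothetical names and values chosen for the example.

```python
import random
import time


class SchemaContentionError(Exception):
    """Hypothetical error: a concurrent schema change won the race."""


def retry_on_contention(operation, max_attempts=10, base_delay=0.01, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is exhausted.

    Only contention errors are retried; any other exception propagates
    immediately. The backoff policy here is an assumption for illustration.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except SchemaContentionError:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            # Exponential backoff with full jitter, so that competing
            # writers are less likely to collide again on the next try.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(random.uniform(0, delay))
```

A caller would wrap the schema-changing request in a function and pass it in, so the client only sees an error once the retries are exhausted.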

ScyllaDB uses a data structure called partitioned_sstable_set to rapidly find relevant sstables for run-based compaction strategies (Leveled Compaction Strategy and Incremental Compaction Strategy). This data structure now avoids quadratic space complexity in extreme situations, which in the past could cause excessive memory use. This is achieved by noting whether the sstables are actually organized in runs and, if not, using a different data structure.
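To illustrate the "check for runs, otherwise fall back" idea, here is a simplified sketch. It is not ScyllaDB's implementation: it models a single run, represents each sstable as a hypothetical `(first_token, last_token)` tuple, and the class name `SSTableIndex` is invented for the example. When the ranges are disjoint, a sorted lookup is enough; when they overlap, the sketch falls back to a flat scan that needs only linear space.

```python
from bisect import bisect_right


class SSTableIndex:
    """Toy index over sstables given as (first_token, last_token) tuples."""

    def __init__(self, sstables):
        self.sstables = sorted(sstables)
        # A run is a set of sstables with pairwise disjoint token ranges.
        self.disjoint = all(
            prev[1] < cur[0]
            for prev, cur in zip(self.sstables, self.sstables[1:])
        )
        self.starts = [first for first, _ in self.sstables]

    def select(self, token):
        """Return the sstables whose token range contains `token`."""
        if self.disjoint:
            # Disjoint ranges: at most one sstable can match, found by
            # binary search on the range start.
            i = bisect_right(self.starts, token) - 1
            if i >= 0 and self.sstables[i][1] >= token:
                return [self.sstables[i]]
            return []
        # Overlapping ranges: fall back to a linear scan over a flat
        # list, which uses O(n) space regardless of overlap.
        return [s for s in self.sstables if s[0] <= token <= s[1]]
```

The fallback trades lookup speed for bounded memory, which is the reasonable choice when the sstables do not form runs and a per-interval index would have to store many sstables in every interval.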

ScyllaDB warns on large allocations in excess of 1MB as they can cause high latency and thrash the cache. The warning threshold is now reduced to 128kB to flush out smaller violations.

The ScyllaDB build normally runs in a container that provides all the build dependencies. The container now supports running nested containers, so some of the build dependencies can be container images themselves.

Raft group 0, which manages metadata (topology and schema), restricts the number of voters to improve throughput. It is now careful not to change the set of voters too much at once, which reduces leader re-elections and the minor disruptions they cause.

A crash related to referencing a freed statistics data structure for memtables was fixed.

The version number in the master branch was changed to 2025.3, indicating the start of the 2025.2 release stabilization cycle.

A possible use-after-free during schema changes related to the sstable_set type was fixed.

Sstable compression can use dictionaries to improve the compression ratios. We now distribute the dictionaries across all shards in a node, but we are also careful to have each NUMA node own a copy of the dictionary, to minimize performance loss due to cross-NUMA-node memory accesses.


See you in the next issue of last week in scylladb.git master!