The ScyllaDB team announces ScyllaDB 2025.1.2, a bug-fix, production-ready patch release of the ScyllaDB 2025.1 LTS release.
Cluster-Level Repair for Tablets
This version includes a new nodetool command, nodetool cluster, for cluster-wide operations.
The first cluster-level command is nodetool cluster repair for Tablets.
The command uses a new admin REST API: /storage_service/tablets/repair.
Unlike nodetool repair, which runs at the node level, cluster-level repair synchronizes all data on all nodes in the cluster, for Tablets-based keyspaces only. If you are using both vNode-based and Tablets-based keyspaces, make sure to run both commands, as in the example below.
Scylla Manager version 3.5 and later automatically uses both commands when required.
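For example, a minimal sketch covering both keyspace types (the nodetool commands and REST path are those named above; the HTTP method, port 10000, and host names are assumptions based on standard ScyllaDB defaults):

    # Cluster-wide repair of all Tablets-based keyspaces (one invocation per cluster)
    nodetool cluster repair

    # vNode-based keyspaces still need the classic node-level repair, run on each node
    nodetool repair

    # The cluster-level command is served by the new admin REST endpoint; a direct
    # call might look like this (HTTP method and port are assumptions)
    curl -X POST "http://localhost:10000/storage_service/tablets/repair"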
2025.1.2 also includes multiple bug fixes (below).
Related Links
- Get ScyllaDB 2025.1
- Upgrade from ScyllaDB Enterprise 2024.x to ScyllaDB 2025.1
- Upgrade from ScyllaDB Open Source 6.2 to ScyllaDB Enterprise 2025.1.x
- Submit a ticket
The following issues are fixed in this release:
CQL
- Enforce Tablets-only cluster: a new config option, tablets_mode_for_new_keyspaces=enforced, allows administrators to enforce Tablets-only keyspaces at the cluster level (see the sketch after this list).
This enforcement will be used in the upcoming ScyllaDB X Cloud clusters.
- Altering a keyspace to set a DC’s RF to 0, followed by decommissioning that DC, causes subsequent failures #22688
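A minimal sketch of the enforcement (the option name comes from this release; its placement in scylla.yaml, the keyspace-level tablets toggle shown, and the exact rejection behavior under enforcement are assumptions):

    # scylla.yaml (assumed placement for this option)
    tablets_mode_for_new_keyspaces: enforced

    # With enforcement on, a keyspace created with tablets disabled is expected
    # to be rejected, e.g.:
    # CREATE KEYSPACE ks WITH replication =
    #     {'class': 'NetworkTopologyStrategy', 'replication_factor': 3}
    #     AND tablets = {'enabled': false};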
Correctness
- compacting_reader: a decorated key passed by reference to the compactor is moved. The compactor might later use this moved-from key to obtain tombstone GC information, resulting in incorrect tombstone GC decisions and possibly data resurrection. #23291
- Partitions are (temporarily) missing when combining range scans with SELECT DISTINCT or PER PARTITION LIMIT, which can result in records being omitted from the table if they trigger read repair. #20084
Performance
- Multiple filesystems on a single disk are not treated well by the Seastar I/O scheduler #23820. If a single disk (or RAID array) is shared between mountpoints (as separate partitions or LVM volumes), the Seastar I/O scheduler does not recognize this and treats each mountpoint as an independent disk. Since the actual bandwidth is shared, this results in over-estimating the disk’s capabilities and therefore increased latency.
Stability
- A rare race condition may cause an assertion failure: `_u.st == state::future’ failed (reader concurrency semaphore related) #22919
- A read-repair issue may lead to use-after-move and a potential exit, for example when running a test with read repair and trace-level logging #21907 #21714 #23512 #23513
- Exception during node shutdown, for example when replacing a node with the same IP #23325 #23305 #21815
- Tablets: truncating or dropping a table after tablet migration might cause an assert and unexpected exit #18059
- Tablets: failure to complete splitting of a table after removing a materialized view during tablet splitting, causing an infinite split retry loop #21859
- Since ScyllaDB 6.0, cache and memtable cells between 13 KiB and 128 KiB are allocated in the standard allocator rather than inside LSA segments, which can result in out-of-memory issues #22941 #22389 #23781
- Access to a disengaged optional when rewriting the bloom filter of a new sstable #23484
- Coredump when ScyllaDB starts after the audit log was enabled #22973
- Coredump during the RefuseConnectionWithBlockScyllaPortsOnBannedNode nemesis #23348
- A possible race condition in the task manager may cause an error, for example during repair #22316
- Tablets: finalize tablet splits earlier. If there is a large load-balancing backlog, split finalization may be delayed arbitrarily long, leaving overly large tablets #21762
- Tablets: handle_tablet_migration: do not continue if a global metadata barrier is executed #22792
- Segfault when dumping semaphore diagnostics on SIGQUIT #22756
- Tablets: tablet allocation on table creation overloads nodes with fewer shards #23378
- Tablets: rare race condition when aborting while adding a node to the cluster #23222
- Rare integer overflow in abstract_read_executor::execute when running cross-DC repair #23314
- Stopping a node while it is restarting failed with seastar::gate_closed_exception (gate closed) #23153
- Tablet split may fail with assertion `_promise’ failed #22715
- Possible out-of-space issue when adding a new DC. Previously, rebuilding a tablet streamed data from all replicas, which created a lot of redundancy and wasted bandwidth and CPU resources. This is now fixed by splitting the streaming stage of tablet rebuild into two phases: first stream the tablet’s data from only one replica, then repair the tablet.
Alternator
- Alternator streams page size is unlimited #23534. The GetRecords operation has a “Limit” parameter controlling how many records to return; DynamoDB caps this parameter at 1000, but Alternator did not enforce any upper limit (see the example below).
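For illustration, a request that exercises the cap via the AWS CLI (the Alternator endpoint URL, port, and shard iterator are placeholders; the Limit semantics follow the DynamoDB GetRecords API):

    # Ask for up to 1000 records per page; 1000 is DynamoDB's documented maximum
    # for GetRecords, which Alternator now enforces as well
    aws dynamodbstreams get-records \
        --endpoint-url "http://scylla-node:8000" \
        --shard-iterator "$SHARD_ITERATOR" \
        --limit 1000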
Tracing
- gms::gossip_digest’s formatter does not print cluster_id, partitioner, or group0_id #23142
- server/transport: tracing omits user-supplied timestamps for prepared statements #23173
- streaming: wrong reason reported for streaming failure #22834
Tooling and API