The ScyllaDB team announces ScyllaDB 2025.1.2, a bug-fix, production-ready patch release of the ScyllaDB 2025.1 LTS release.
Cluster-Level Repair for Tablets
This version includes a new nodetool command, nodetool cluster, for cluster-wide operations.
The first cluster-level command is nodetool cluster repair for Tablets.
The command uses a new admin REST API: /storage_service/tablets/repair.
Unlike nodetool repair, which runs at the node level, cluster-level repair synchronizes all data on all nodes in the cluster, for Tablets-based keyspaces only. If you are using both vNode-based and Tablets-based keyspaces, make sure to run both commands, as in the example below.
Scylla Manager version 3.5 and later automatically uses both commands when required.
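For example, a minimal sketch covering both keyspace types (the nodetool commands and REST path are those named above; the HTTP method, port 10000, and host names are assumptions based on standard ScyllaDB defaults):

    # Cluster-wide repair of all Tablets-based keyspaces (one invocation per cluster)
    nodetool cluster repair

    # vNode-based keyspaces still need the classic node-level repair, run on each node
    nodetool repair

    # The cluster-level command is served by the new admin REST endpoint; a direct
    # call might look like this (HTTP method and port are assumptions)
    curl -X POST "http://localhost:10000/storage_service/tablets/repair"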
2025.1.2 also includes multiple bug fixes (below).
Related Links
- Get ScyllaDB 2025.1
- Upgrade from ScyllaDB Enterprise 2024.x to ScyllaDB 2025.1
- Upgrade from ScyllaDB Open Source 6.2 to ScyllaDB Enterprise 2025.1.x
- Submit a ticket
The following issues are fixed in this release:
CQL
- Enforce Tablets-only cluster: a new config option, tablets_mode_for_new_keyspaces=enforced, allows administrators to enforce Tablets-only keyspaces at the cluster level (see the sketch after this list).
This enforcement will be used in the upcoming ScyllaDB X Cloud clusters.
- Altering a keyspace to set a DC’s RF to 0, followed by decommissioning that DC, causes subsequent failures #22688
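A minimal sketch of the enforcement (the option name comes from this release; its placement in scylla.yaml, the keyspace-level tablets toggle shown, and the exact rejection behavior under enforcement are assumptions):

    # scylla.yaml (assumed placement for this option)
    tablets_mode_for_new_keyspaces: enforced

    # With enforcement on, a keyspace created with tablets disabled is expected
    # to be rejected, e.g.:
    # CREATE KEYSPACE ks WITH replication =
    #     {'class': 'NetworkTopologyStrategy', 'replication_factor': 3}
    #     AND tablets = {'enabled': false};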
Correctness
- compacting_reader: a decorated key passed by reference to the compactor is moved. The compactor might later use this moved-from key to obtain tombstone GC information, resulting in incorrect tombstone GC decisions and possibly data resurrection. #23291
- Partitions are (temporarily) missing when combining range scans with SELECT DISTINCT or PER PARTITION LIMIT, which can result in records being omitted from the table if they trigger read repair. #20084
Performance
- Multiple filesystems on a single disk are not treated well by the Seastar I/O scheduler #23820. If a single disk (or RAID array) is shared between mountpoints (as separate partitions or LVM volumes), the Seastar I/O scheduler does not recognize this and treats each mountpoint as an independent disk. Since the actual bandwidth is shared, this results in over-estimating the disk’s capabilities and therefore increased latency.
Stability
- A rare race condition may cause an assertion failure: `_u.st == state::future’ failed (reader concurrency semaphore related) #22919
- A read-repair issue may lead to use-after-move and a potential exit, for example when running a test with read repair and trace-level logging #21907 #21714 #23512 #23513
- Exception during node shutdown, for example when replacing a node with the same IP #23325 #23305 #21815
- Tablets: truncating or dropping a table after tablet migration might cause an assert and unexpected exit #18059
- Tablets: failure to complete splitting of a table after removing a materialized view during tablet splitting, causing an infinite split retry loop #21859
- Since ScyllaDB 6.0, cache and memtable cells between 13 KiB and 128 KiB are allocated in the standard allocator rather than inside LSA segments, which can result in out-of-memory issues #22941 #22389 #23781
- Access to a disengaged optional when rewriting the bloom filter of a new sstable #23484
- Coredump when ScyllaDB starts after the audit log was enabled #22973
- Coredump during the RefuseConnectionWithBlockScyllaPortsOnBannedNode nemesis #23348
- A possible race condition in the task manager may cause an error, for example during repair #22316
- Tablets: finalize tablet splits earlier. If there is a large load-balancing backlog, split finalization may be delayed arbitrarily long, leaving overly large tablets #21762
- Tablets: handle_tablet_migration: do not continue if a global metadata barrier is executed #22792
- Segfault when dumping semaphore diagnostics on SIGQUIT #22756
- Tablets: tablet allocation on table creation overloads nodes with fewer shards #23378
- Tablets: rare race condition when aborting while adding a node to the cluster #23222
- Rare integer overflow in abstract_read_executor::execute when running cross-DC repair #23314
- Stopping a node while it is restarting failed with seastar::gate_closed_exception (gate closed) #23153
- Tablet split may fail with assertion `_promise’ failed #22715
- Possible out-of-space issue when adding a new DC. Previously, rebuilding a tablet streamed data from all replicas, which created a lot of redundancy and wasted bandwidth and CPU resources. This is now fixed by splitting the streaming stage of tablet rebuild into two phases: first stream the tablet’s data from only one replica, then repair the tablet.
Alternator
- Alternator streams page size is unlimited #23534. The GetRecords operation has a “Limit” parameter controlling how many records to return; DynamoDB caps this parameter at 1000, but Alternator did not enforce any upper limit (see the example below).
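For illustration, a request that exercises the cap via the AWS CLI (the Alternator endpoint URL, port, and shard iterator are placeholders; the Limit semantics follow the DynamoDB GetRecords API):

    # Ask for up to 1000 records per page; 1000 is DynamoDB's documented maximum
    # for GetRecords, which Alternator now enforces as well
    aws dynamodbstreams get-records \
        --endpoint-url "http://scylla-node:8000" \
        --shard-iterator "$SHARD_ITERATOR" \
        --limit 1000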
Tracing
- gms::gossip_digest’s formatter does not print cluster_id, partitioner, or group0_id #23142
- server/transport: tracing omits user-supplied timestamps for prepared statements #23173
- streaming: wrong reason reported for streaming failure #22834
Tooling and API