[RELEASE] Scylla 6.2 RC1

The ScyllaDB team is pleased to announce ScyllaDB Open Source 6.2 RC1, the first Release Candidate for the ScyllaDB Open Source 6.2 minor release.

ScyllaDB 6.2 introduces many Tablets improvements, new zero-token nodes (Arbiter), Alternator RBAC support, and many bug fixes and stability improvements.

We encourage you to run ScyllaDB 6.2 release candidates on your test environments; this will help ensure that an upgrade to ScyllaDB 6.2 General Availability will proceed smoothly with your workload.

Only the last two minor releases of the ScyllaDB Open Source project are supported. Once ScyllaDB Open Source 6.2 is officially released, only ScyllaDB Open Source 6.2 and ScyllaDB 6.1 will be supported, and ScyllaDB 6.0 will be retired.

High Availability - Arbiter

There is now support for zero-token nodes. Such nodes do not replicate any data, but can participate in query coordination, and in Raft quorum voting.

One can use this to create an Arbiter: a tiebreaker node, with no data, that helps maintain quorum in a symmetrical two-datacenter cluster. If one of the datacenters fails, the Arbiter, deployed in a third datacenter, keeps the quorum alive. Since the Arbiter has zero tokens, it does not replicate user data and does not incur the associated network and storage costs. #15360
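
The quorum arithmetic behind the Arbiter can be sketched in a few lines. This is only an illustration of majority counting, not ScyllaDB code; the node counts and the helper names `majority` and `has_quorum` are hypothetical.

```python
def majority(voters: int) -> int:
    """Smallest number of voters that forms a strict majority."""
    return voters // 2 + 1


def has_quorum(alive: int, total: int) -> bool:
    """True if the surviving voters can still form a majority."""
    return alive >= majority(total)


# Hypothetical layout: 3 data nodes in each of two datacenters.
dc1, dc2, arbiter = 3, 3, 1

# Without an Arbiter, losing one DC leaves 3 of 6 voters: no majority.
print(has_quorum(alive=dc2, total=dc1 + dc2))                      # False

# With a zero-token Arbiter in a third DC, 4 of 7 voters survive.
print(has_quorum(alive=dc2 + arbiter, total=dc1 + dc2 + arbiter))  # True
```

With an even split, neither datacenter alone holds a majority; the extra voter in a third location is what breaks the tie.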

Alternator RBAC

Authorization: Alternator now supports Role-Based Access Control (RBAC). Roles and permissions are managed via CQL. #5047

More updates

Tablets

  • Performance: The tablet load balancer now tries to ensure not only that tablets are distributed evenly among nodes and shards, but also that the tablets of any particular table are evenly distributed. This prevents a hot table that is unevenly distributed from causing hot nodes or hot shards. #16824
  • Stability: The tablet allocator will now refrain from allocating tablets on a table created concurrently with decommission #20032
  • Performance: When tablet metadata (system.tablets table) changes, we now reload only the changed rows. #15294
  • A bug when ALTERing a nonexistent keyspace with tablets enabled was fixed. #19576
  • Stability: A race condition between tablet repair and tablet split (the latter happens when a table grows) has been fixed. Fixes #19378 #19416.
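
To see why per-table balance matters, consider a toy placement in which every node holds the same number of tablets overall, yet one table lives entirely on one node. The sketch below is not the balancer's algorithm; the table names, node names, and helper functions are hypothetical.

```python
from collections import Counter

# Hypothetical placement: one (table, node) pair per tablet.
skewed = [("hot", "n1")] * 4 + [("cold", "n2")] * 4
even = [(t, n) for t in ("hot", "cold") for n in ("n1", "n2")] * 2


def per_node(placement):
    """Total tablets per node, ignoring which table they belong to."""
    return Counter(node for _, node in placement)


def table_imbalance(placement, table, nodes):
    """Difference between the most and least loaded node for one table."""
    counts = Counter(placement)
    loads = [counts[(table, n)] for n in nodes]
    return max(loads) - min(loads)


nodes = ["n1", "n2"]
print(per_node(skewed) == per_node(even))     # True: same totals per node
print(table_imbalance(skewed, "hot", nodes))  # 4: all "hot" tablets on n1
print(table_imbalance(even, "hot", nodes))    # 0: "hot" is spread evenly
```

Both placements look identical when only node totals are compared, which is why a balancer that ignores table identity can leave a hot table concentrated on one node.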

Tracing

  • ScyllaDB now collects cell-level statistics in addition to row and tombstone statistics for result pages. These statistics are exposed in a trace message. #18996

Stability

  • The sstable primary index reader will now respond to service shutdown requests. This matters, for example, when the bloom filter of a large sstable is being rebuilt while the service is shut down. #19453
  • A race between table drop and a counter column update was fixed. #19948
  • A fatal error during cache update, seen under a write workload in an elasticity test, was fixed. The root cause was a race between split compaction and tablet migration. #19873
  • A regression in processing limits for the GROUP BY clause was fixed #17237 #5361 #5362
  • Raft uses log truncation to limit memory consumption. A mismatch between in-memory log truncation and on-disk log truncation was fixed. #16817 #20080
  • An issue where the topology coordinator and a replacing node stopped seeing each other after entering the transition state was fixed. #19025
  • A node will now ignore DNS name resolution errors of seeds when restarting, as those seed names could be referring to nodes that were removed. #14945
  • Commitlog is now able to store entries larger than half a commitlog segment. This limitation caused problems with large clusters, as cluster metadata could exceed this limit. Large entries are now fragmented and split over multiple segments. #19472
  • A bug in computing whether to flush all memtables was fixed. #20301
  • A memory leak in the Paxos implementation was fixed #20602
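
The commitlog change above (entries larger than half a segment are now fragmented) can be pictured with a small sketch. This is not the on-disk format; the segment size, the per-fragment cap, and the function names are hypothetical and chosen only to show the split-and-reassemble idea.

```python
def fragment_entry(entry: bytes, max_fragment: int) -> list[bytes]:
    """Split an entry into consecutive chunks of at most max_fragment bytes."""
    if max_fragment <= 0:
        raise ValueError("max_fragment must be positive")
    return [entry[i:i + max_fragment] for i in range(0, len(entry), max_fragment)]


def reassemble(fragments: list[bytes]) -> bytes:
    """Concatenate fragments back into the original entry."""
    return b"".join(fragments)


SEGMENT_SIZE = 32                  # hypothetical, tiny for illustration
MAX_FRAGMENT = SEGMENT_SIZE // 2   # assumed cap: half a segment per fragment

entry = bytes(range(40))           # larger than half a segment
fragments = fragment_entry(entry, MAX_FRAGMENT)
print([len(f) for f in fragments])  # [16, 16, 8]
assert reassemble(fragments) == entry
```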

Admin

  • A new REST API, system/highest_supported_sstable_version, returns the highest sstable format version supported across the cluster. #19772
  • The internal ‘cluster feature’ mechanism now supports suppressing features, enabling simulation of upgrades. This should catch version upgrade problems earlier. #20034
  • Integrated backup and restore has been merged. New nodetool backup and restore commands (and corresponding REST API endpoints) copy a snapshot to and from an S3-compatible endpoint. This is a work in progress that aims to replace the current external (Manager Agent) backup with backup and restore managed by ScyllaDB core. #19890 #20305
  • A new nodetool tasks command can be used to view and manage maintenance tasks running on the node. #19201
  • ScyllaDB will now tune the number of allowed open file descriptors (LimitNOFILE) for very large nodes, reducing the chance of a “Too many open files” error. #20443
  • Tools: Scrub/validate compactions will now verify checksums for uncompressed sstables. #20207
  • Compaction CLEANUP jobs now run under the maintenance/streaming scheduling group. #20582
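
As a rough sketch, the new system/highest_supported_sstable_version endpoint could be queried from Python as follows. Only the path comes from the notes above; the host, port 10000 (the usual REST API port), the use of a GET request, and a JSON-encoded response are assumptions to verify against your deployment.

```python
import json
import urllib.request


def endpoint_url(host: str = "localhost", port: int = 10000) -> str:
    """Build the URL for the endpoint named in the release notes."""
    return f"http://{host}:{port}/system/highest_supported_sstable_version"


def highest_supported_sstable_version(host="localhost", port=10000, timeout=5.0):
    """Query a node. Assumes a GET request and a JSON-encoded response."""
    with urllib.request.urlopen(endpoint_url(host, port), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


print(endpoint_url())
```

Calling `highest_supported_sstable_version()` requires a running node with the REST API reachable at the given address.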

Alternator

  • Performance: Alternator uses JSON to communicate with the client. Previously, sending very large JSON values was adjusted to avoid stalls. The destruction of these large JSON values now also avoids stalls #19968
  • Monitoring: Alternator adds metrics for batch latency and size.
  • Performance: Alternator, ScyllaDB’s implementation of the DynamoDB API, has more efficient reverse queries now, reducing the gap from CQL. #20191
  • Authentication: Alternator will now reject authentication attempts from roles that do not have the LOGIN attribute. #19735

Performance

  • In order to make sstables durable, the directory where they are placed must be flushed after they are sealed. This is now done without re-opening the directory each time, saving some cycles. #19624
  • Commitlog segments older than 24 hours will now flush corresponding memtables regardless of memory pressure. This allows more timely garbage collection of tombstones. #15971
  • ScyllaDB tracks internal maintenance work, as well as work requested by the user (for example, repair), as tasks. New virtual tasks allow ScyllaDB to track multi-node operations. #16374
  • Hinted Handoff, which writes to local storage, now uses the commitlog scheduling group. #18654
  • The driver for S3 access is now optimized for throughput #20074. Direct S3 access is experimental in this release.
  • Reversed queries (WITH CLUSTERING ORDER BY) are already quite efficient in ScyllaDB, yet the internal RPC protocol between nodes was kept unaware of reversed queries in order to maintain compatibility; result sets were un-reversed before being sent over the wire, then re-reversed. ScyllaDB now supports an alternate protocol in which these wasteful transformations are avoided. #12557
  • When communicating with older versions of ScyllaDB, the server uses a schema digest to see whether there is a schema mismatch or not. This is now less likely to stall when processing large schemas. #18173
  • Major compaction now supports a new option to check only existing sstables during tombstone garbage collection; this can increase the effectiveness of garbage collection for partitions that are updated frequently. In other words, major compaction can check only the compacted sstables for the purpose of tombstone garbage collection. #19728
  • The heuristics for purging tombstones during compaction were improved, leading to less tombstone accumulation. #20424 #20423
  • The system may sometimes drop the bloom filter of some sstables to save memory, and then reload it when memory is available. We no longer reload bloom filters for sstables that are queued for deletion. #19722

CQL

  • Service levels are used to group and classify sessions. Service level names beginning with $ are now reserved. #20122
  • A CQL filtering bug that occurred when a regular column was filtered but no regular columns were selected was fixed. #10357
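
Since service level names beginning with $ are now reserved, a client-side guard can catch such names before a service level is created. The sketch below is a hypothetical helper, not part of any driver or of ScyllaDB itself.

```python
def is_reserved_service_level_name(name: str) -> bool:
    """Names starting with '$' are reserved, per the note above."""
    return name.startswith("$")


def check_user_service_level_name(name: str) -> str:
    """Return the name unchanged, or raise if it is empty or reserved."""
    if not name:
        raise ValueError("service level name must not be empty")
    if is_reserved_service_level_name(name):
        raise ValueError(f"service level name {name!r} is reserved")
    return name


print(check_user_service_level_name("analytics"))   # analytics
print(is_reserved_service_level_name("$internal"))  # True
```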

Materialized view

  • Performance: Materialized view updates destined for a node that has left the cluster are now dropped. #19439
  • Performance: When a materialized view’s primary key has the same columns as the base table primary key, we now optimize deletions by deleting an entire partition when possible. #8199
  • A write to a base table will now be rejected by the coordinator when one or more of the replicas has a full view update backlog. This reduces inconsistencies in materialized views. #17426
  • The system_distributed.view_build_status table was moved to the system keyspace and is now managed by Raft in a strongly consistent way. #15329

Packaging

  • The jmx submodule was removed from the source tree. With nodetool now talking directly to the REST API, it is no longer necessary. JMX is still available as a separate package.

Config

  • When Service Level parameters, like timeouts, are modified, connections are adjusted in real time. #12923
  • commitlog_use_fragmented_entries - Whether or not to allow commitlog entries to fragment across segments, allowing for larger entry sizes. Default: True.
  • cql_duplicate_bind_variable_names_refer_to_same_variable - a bind variable that appears twice in a CQL query refers to a single variable (if false, no name matching is performed). Default: True. #15559
  • Option reversed_reads_auto_bypass_cache was deprecated. It is no longer needed, as reversed reads are now mature.
  • commitlog_max_data_lifetime_in_seconds - Controls how long data remains in the commitlog before the system tries to evict it to sstables, regardless of usage pressure (0 = disabled). Default: 24 * 60 * 60 (1 day). #15971
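
The effect of cql_duplicate_bind_variable_names_refer_to_same_variable can be pictured with a toy marker counter. This is not how ScyllaDB parses CQL: the regex is a simplification that ignores string literals and comments, and the table and function names are hypothetical. The last line simply spells out the default of commitlog_max_data_lifetime_in_seconds.

```python
import re

# Named bind markers look like :name in CQL (simplified pattern).
BIND_MARKER = re.compile(r":([A-Za-z_][A-Za-z0-9_]*)")


def bind_slots(query: str, same_name_same_variable: bool) -> list[str]:
    """List the bind variables a client would have to supply."""
    names = BIND_MARKER.findall(query)
    if not same_name_same_variable:
        return names  # no name matching: every occurrence is its own variable
    unique = []
    for name in names:
        if name not in unique:
            unique.append(name)
    return unique


query = "SELECT * FROM ks.t WHERE pk = :x AND ck = :x AND v = :y ALLOW FILTERING"
print(bind_slots(query, True))   # ['x', 'y']
print(bind_slots(query, False))  # ['x', 'x', 'y']

# Default of commitlog_max_data_lifetime_in_seconds, in seconds.
print(24 * 60 * 60)              # 86400
```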

Monitoring

Scylla Monitoring Stack 4.8.1 and later support the ScyllaDB 6.2 release.

See the upgrade docs for metrics updates in ScyllaDB 6.2.