ScyllaDB Enterprise Release 2024.2.0

The ScyllaDB team is pleased to announce ScyllaDB Enterprise 2024.2.0, a production-ready ScyllaDB Enterprise Feature (Short Term Support) release.

With 2024.2 out, 2023.1 LTS and 2024.1 LTS are still supported.

More information on ScyllaDB Long Term Support (LTS) policy is available here.

The ScyllaDB Enterprise 2024.2 release is based on ScyllaDB Open Source 6.0 and includes all the features available in 6.0, such as:

  • Tablets, a dynamic way to distribute data across nodes that significantly improves scalability
  • Strongly consistent topology, Auth, and Service Level updates

Note that tablets will not be the default when creating a new Keyspace in ScyllaDB Enterprise (see below).

In addition, 2024.2 includes enterprise-only features: improved network compression (see below), file-based streaming for tablets, which improves ScyllaDB’s elasticity even further, and a new FIPS-enabled Docker image.

ScyllaDB Enterprise customers are encouraged to upgrade to ScyllaDB Enterprise 2024.2, and are welcome to contact our Support Team with questions.

Tablets Limited Availability

This release enables Tablets, a new data distribution algorithm that is a better alternative to the legacy vNodes approach inherited from Apache Cassandra.

While the vNodes approach statically distributes all tables across all nodes and shards based on the token ring, the Tablets approach dynamically distributes each table to a subset of nodes and shards based on its size. In the future, distribution will use CPU, OPS, and other information to further optimize the distribution.

In particular, Tablets provide the following:

  • Faster scaling and topology changes. New nodes can start serving reads and writes as soon as the first Tablet is migrated. Together with Strongly Consistent Topology Updates (below), this also allows users to add multiple nodes simultaneously.
  • Automatic support for mixed clusters with different core counts. Manual vNode updates are not required.
  • More efficient operations on small tables, since such tables are placed on a small subset of nodes and shards.

Note that you can run a cluster with Tablets disabled for some Keyspaces and enabled for others for as long as you wish. The scaling improvement will be partial, and limited to Keyspaces with Tablets enabled.

Limited Availability

Currently, Tablets are ideal for new clusters that you frequently scale out or in and that have one main large table and potentially many tiny ones. ScyllaDB Support can help you determine if Tablets in release 2024.2 are a good solution for your use case.

In particular, Keyspaces with Tablets enabled do not yet support the following features:

  • CDC
  • LWT
  • Counters
  • Alternator (Amazon DynamoDB API) - see below.

Alternator supports Tablets under the following conditions:

  • Alternator Write Isolation is set to forbid or unsafe, which do not use LWT (not always or only_rmw_uses_lwt).
  • The initial_tablets tag is used.

Read more about Tablets here.

Using Tablets

Tablets are disabled by default for new Keyspaces. To use Tablets, create a new Keyspace with tablets = { 'enabled': true }.

For example:

CREATE KEYSPACE my_keyspace
WITH replication = { 'class': 'NetworkTopologyStrategy', 'replication_factor': 3 }
AND tablets = { 'enabled': true };

All tables created in this Keyspace will use Tablets by default.

You can set the initial number of Tablets per table using the “initial” parameter:

CREATE KEYSPACE ... WITH TABLETS = { 'enabled': true, 'initial': 1 }

(See CREATE KEYSPACE docs)
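
If you generate schema statements from scripts, the two options above can be assembled programmatically. The following is a minimal sketch, not an official tool: the helper name create_keyspace_cql and its parameters are hypothetical, and it only builds the statement text shown in this section without connecting to a cluster.

```python
from typing import Optional


def create_keyspace_cql(
    name: str,
    replication_factor: int,
    initial_tablets: Optional[int] = None,
) -> str:
    """Build a CREATE KEYSPACE statement with Tablets enabled.

    Hypothetical helper for illustration. The keyspace name is inserted
    as-is, so only pass trusted identifiers.
    """
    tablets_options = "'enabled': true"
    if initial_tablets is not None:
        # Optional initial number of Tablets per table.
        tablets_options += f", 'initial': {initial_tablets}"
    return (
        f"CREATE KEYSPACE {name} "
        f"WITH replication = {{ 'class': 'NetworkTopologyStrategy', "
        f"'replication_factor': {replication_factor} }} "
        f"AND tablets = {{ {tablets_options} }};"
    )


if __name__ == "__main__":
    print(create_keyspace_cql("my_keyspace", 3))
    print(create_keyspace_cql("my_keyspace", 3, initial_tablets=1))
```

The resulting string can be pasted into CQLSh or passed to any CQL driver session.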

Note: you cannot ALTER an existing Keyspace to switch between Tablets-based and vNode-based tables. We will remove this restriction in upcoming releases.

Note: Tablets are the default in ScyllaDB Open Source 6.0. When upgrading from ScyllaDB Open Source 6.0 to ScyllaDB Enterprise 2024.2, Keyspaces with Tablets will continue to work with Tablets.

You can use DESCRIBE to check if a keyspace or tables are using Tablets.

Procedures

With Tablets, the Replication Factor (RF) cannot be updated to a value higher than the number of nodes per Data Center (DC). This feature protects the Admin from setting an impossible-to-support RF. This affects the following operations:

Node Decommission / Remove

Starting from 2024.2, you cannot decommission or remove a node if the resulting number of nodes would be smaller than the largest non-zero replication factor (for any Keyspace) in this DC.

For example, in a cluster with 1 DC, 5 nodes, and a Keyspace with RF=5, the decommission request will fail.

The Replication Factor (RF) of Keyspaces must be less than or equal to the number of available nodes per Data Center (DC)

Once a Tablets-enabled Keyspace has tables, you cannot ALTER its Replication Factor to be greater than the number of available nodes per DC.

If you create a Keyspace with an RF greater than the number of available nodes per DC, you won’t be able to create tables in it until you fix the RF or add more nodes.
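
The two rules above reduce to simple comparisons between the node count in a DC and the Keyspace replication factors. The following sketch restates them in code for a single DC; the function names and the dictionary of per-Keyspace RFs are hypothetical and only mirror the rules as described in this section, not ScyllaDB's internal checks.

```python
from typing import Mapping


def can_remove_node(nodes_in_dc: int, keyspace_rfs: Mapping[str, int]) -> bool:
    """Return True if removing one node keeps the DC at or above the
    largest non-zero replication factor of any keyspace."""
    non_zero_rfs = [rf for rf in keyspace_rfs.values() if rf > 0]
    largest_rf = max(non_zero_rfs, default=0)
    return nodes_in_dc - 1 >= largest_rf


def rf_is_allowed(new_rf: int, nodes_in_dc: int) -> bool:
    """Return True if the requested RF does not exceed the number of
    available nodes in the DC."""
    return new_rf <= nodes_in_dc


if __name__ == "__main__":
    # The example from the text: 1 DC, 5 nodes, a keyspace with RF=5.
    print(can_remove_node(5, {"my_keyspace": 5}))
    # The same keyspace in a 6-node DC.
    print(can_remove_node(6, {"my_keyspace": 5}))
    # Raising RF to 4 in a 3-node DC.
    print(rf_is_allowed(4, 3))
```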

Monitor Tablets

To monitor Tablets in real time, upgrade ScyllaDB Monitoring Stack to release 4.7 and use the new dynamic Tablet panels.

Driver Support

The following drivers support Tablets:

  • Java driver 4.x, from 4.18.0.2
  • Java driver 3.x, from 3.11.5.2
  • Python driver, from 3.26.6
  • Gocql driver, from 1.13.0
  • Rust driver, from 0.13.0

Legacy ScyllaDB and Apache Cassandra drivers will continue to work with ScyllaDB but will be less efficient when working with tablet-based Keyspaces.
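
When auditing client applications, the minimum versions listed above can be checked with a plain version comparison. This is a small sketch under stated assumptions: the driver keys ("java4", "java3", and so on) and the helper names are hypothetical, and versions are assumed to be dot-separated integers as in the list above.

```python
from typing import Tuple

# Minimum tablet-aware driver versions, copied from the list above.
# The dictionary keys are illustrative labels, not official names.
MIN_TABLET_VERSIONS = {
    "java4": (4, 18, 0, 2),
    "java3": (3, 11, 5, 2),
    "python": (3, 26, 6),
    "gocql": (1, 13, 0),
    "rust": (0, 13, 0),
}


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a string such as '4.18.0.2' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def supports_tablets(driver: str, version: str) -> bool:
    """Return True if the driver version is at or above the listed minimum."""
    return parse_version(version) >= MIN_TABLET_VERSIONS[driver]


if __name__ == "__main__":
    for driver, version in [("python", "3.26.6"), ("rust", "0.12.9")]:
        print(driver, version, supports_tablets(driver, version))
```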

File-based streaming for Tablets

File-based streaming is a ScyllaDB Enterprise-only feature that optimizes tablet migration.

In ScyllaDB Open Source, migrating tablets is performed by streaming mutation fragments, which involves deserializing SSTable files into mutation fragments and re-serializing them back into SSTables on the other node. In ScyllaDB Enterprise, migrating tablets is performed by streaming entire SSTables, which does not require (de)serializing or processing mutation fragments. As a result, less data is streamed over the network, and less CPU is consumed, especially for data models that contain small cells.

File-based streaming is used for tablet migration in all keyspaces created with tablets enabled.

More in Docs.

Strongly Consistent Topology Updates

With Raft-managed topology enabled, all topology operations are internally sequenced consistently. A centralized coordination process ensures that topology metadata is synchronized across the nodes on each step of a topology change procedure.

This makes topology updates fast and safe, as the cluster administrator can trigger many topology operations concurrently, and the coordination process will safely drive all of them to completion. For example, multiple nodes can be bootstrapped concurrently, which couldn’t be done with the previous gossip-based topology.

Strongly Consistent Topology Updates is now the default for new clusters, and should be enabled after upgrade for existing clusters.

Strongly Consistent Auth Updates

System-auth-2 is a reimplementation of the Authentication and Authorization systems in a strongly consistent way on top of the Raft sub-system.

This means that Role-Based Access Control (RBAC) commands like create role or grant permission are safe to run in parallel without a risk of getting out of sync with themselves and other metadata operations, like schema changes.

As a result, there is no need to update system_auth RF or run repair when adding a DataCenter.

Strongly Consistent Service Levels

Service Levels allow you to define attributes like timeout per workload.

Service levels are now strongly consistent using Raft, like Schema, Topology and Auth.

#17926

Improved network compression for inter-node RPC

This release adds new Enterprise-only RPC compression improvements for node-to-node communication:

  • Using zstd instead of lz4
  • Using a shared dictionary, re-trained periodically on the traffic, instead of message-by-message compression.

Below is a comparison of compression algorithms on different types of data.

Note that dictionary based compression can be used with either lz4 or zstd.

Actual compression is very much workload-dependent and can vary between use cases.
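
To illustrate why a shared dictionary helps with small messages, here is a minimal sketch that uses Python's standard zlib module as a stand-in. It is not ScyllaDB's implementation, which uses zstd or lz4 with a dictionary trained on live traffic. The message shape, the naive "training" by concatenating earlier messages, and all function names are assumptions for illustration only.

```python
import json
import zlib


def compress_plain(message: bytes) -> bytes:
    """Compress one message on its own (message-by-message compression)."""
    return zlib.compress(message)


def compress_with_dict(message: bytes, dictionary: bytes) -> bytes:
    """Compress one message using a dictionary shared by both peers."""
    compressor = zlib.compressobj(zdict=dictionary)
    return compressor.compress(message) + compressor.flush()


def decompress_with_dict(blob: bytes, dictionary: bytes) -> bytes:
    """Decompress a message produced by compress_with_dict."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(blob) + decompressor.flush()


# Hypothetical small RPC-like messages that share most of their structure.
messages = [
    json.dumps(
        {
            "keyspace": "my_keyspace",
            "table": "events",
            "partition_key": i,
            "consistency": "QUORUM",
        }
    ).encode()
    for i in range(100)
]

# Stand-in for dictionary training: reuse a sample of earlier traffic.
shared_dictionary = b"".join(messages[:10])

plain_total = sum(len(compress_plain(m)) for m in messages[10:])
dict_total = sum(len(compress_with_dict(m, shared_dictionary)) for m in messages[10:])

if __name__ == "__main__":
    print("message-by-message bytes:", plain_total)
    print("shared-dictionary bytes: ", dict_total)
```

Each short message has little internal repetition, so compressing it alone saves little, while the dictionary lets the compressor reference structure seen in earlier traffic. As the note above says, real gains depend on the workload.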

Describe Schema with Internals

Until this release, CQL DESCRIBE SCHEMA was not sufficient to do a full schema restore from backup. For example, it lacked information about dropped columns.

In 6.0, the DESC SCHEMA WITH INTERNALS command provides more information, streamlining the restore process.

#16482

Alternator RBAC

Authorization: Alternator supports Role-Based Access Control (RBAC). Control is done via CQL. #5047

Native Nodetool

The nodetool utility provides simple command-line interface operations and attributes.

ScyllaDB inherited the Java based nodetool from Apache Cassandra. In this release, the Java implementation was replaced with a backward-compatible native nodetool.

The native nodetool works much faster. Unlike the Java version, the native nodetool is part of the ScyllaDB repo, and allows easier and faster updates.

Removing the JMX Server

With the Native Nodetool (above), the JMX server has become redundant and will no longer be part of the default ScyllaDB Installation or image.

If you are using the JMX server directly, not via nodetool, you can either:

Related issues: #15588 #18566 #18472

As part of moving to native tooling and away from Java tools, we will deprecate SSTableloader in future versions of ScyllaDB Enterprise.

You can use Load and Stream to upload SSTables directly to ScyllaDB, either from Apache Cassandra or from other ScyllaDB clusters. We are also deprecating the Java version of nodetool, which was replaced by a compatible native version (see above).

Maintenance Mode

Maintenance mode is a new mode in which the node does not communicate with clients or other nodes and only listens to the local maintenance socket and the REST API. It can be used to fix damaged nodes – for example, by using nodetool compact or nodetool scrub. In maintenance mode, ScyllaDB skips loading tablet metadata if it is corrupted to allow an administrator to fix it.

#5489

Maintenance Socket

The Maintenance Socket provides a new way to interact with ScyllaDB from within the node it runs on. It is mainly for debugging. You can use CQLSh with the Maintenance Socket as described in the Maintenance Socket docs. #16172

Deployment

  • Ubuntu 24.04 is now supported.

  • RHEL / CentOS 7 support is deprecated.

  • Amazon Linux 2 is deprecated and replaced with Amazon Linux 2023.

  • Debian 10 support is deprecated.

  • The setup utility now works with disks that do not have UUIDs, such as those in some virtualized environments #13803

  • The scylladb-kernel-conf package tunes the Linux kernel scheduler via sysfs to improve latency. These tunings were lost in Linux 5.13+ due to kernel changes. They are now restored. #16077

  • Docker: fixed an issue where CQLSh could not connect to Scylla 5.4 without providing the host IP #16329

  • Updated Rust packages #15772

    • “rustix”
    • “chrono”

  • On Ubuntu, the installer now handles conflicts between a system process updating apt metadata and the installer itself. #16537

See ScyllaDB Enterprise Release 2024.2.0 - more improvements