[RELEASE] ScyllaDB 5.4.0 - Part 1

The ScyllaDB team is pleased to announce the release of ScyllaDB Open Source 5.4.0, a production-ready release of our open-source NoSQL database.

ScyllaDB 5.4 introduces Repair Based Node Operations (RBNO) for all operations, an experimental consistent topology update, an experimental S3 backend, and many more improvements and bug fixes.

Consistent schema management using Raft will be enabled automatically on upgrade (more below).

ScyllaDB is now supported on RHEL/Rocky 9, while RHEL/CentOS 7 support is deprecated.

Support for RHEL/Rocky 8, Ubuntu 20.04 and 22.04, and Debian 10 and 11 continues.

Find the ScyllaDB Open Source 5.4 repository for your Linux distribution here. ScyllaDB 5.4 Docker is also available.

Only the last two minor releases of the ScyllaDB Open Source project are supported. From now on, only ScyllaDB Open Source 5.4 and ScyllaDB 5.2 will be supported, and ScyllaDB 5.1 will be retired.

(note there is no 5.3 release)

Repair Based Node Operations (RBNO)

Repair Based Node Operations were introduced as an experimental feature in ScyllaDB Open Source 4.0. They use repair to stream data for node operations such as replace, bootstrap, and others.

In 5.4, RBNO is enabled by default for all operations: remove node, rebuild, bootstrap, and decommission. The replace node operation was already enabled by default.

See the Repair Based Node Operations (RBNO) docs and the “Faster, Safer Node Operations with Repair vs Streaming” blog by Asias He.

Node Level Metrics

Most ScyllaDB metrics are reported per shard and per node, but not per table. We now export some per-table metrics. These are exported once per node, not per shard, to reduce the number of metrics. #2198
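To illustrate why exporting per-table metrics once per node keeps the metric count down, here is a minimal Python sketch. It is not ScyllaDB code: the function name and the sample data are hypothetical, and it assumes the per-node value is simply the sum of the per-shard values, which is reasonable for counters but is an assumption here.

```python
from collections import defaultdict


def per_node_table_metrics(per_shard_samples):
    """Collapse {(shard, table): value} samples into {table: value}.

    Assumption: the per-node value is the sum over all shards.
    """
    totals = defaultdict(int)
    for (_shard, table), value in per_shard_samples.items():
        totals[table] += value
    return dict(totals)


# Hypothetical samples: two shards, two tables.
samples = {
    (0, "ks.t1"): 5,
    (1, "ks.t1"): 7,
    (0, "ks.t2"): 1,
    (1, "ks.t2"): 0,
}
node_level = per_node_table_metrics(samples)
```

With S shards and T tables, a per-shard export would produce S × T series per metric, while the per-node export produces only T.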

Guardrails

Guardrails is a framework to protect ScyllaDB users and admins from common mistakes and pitfalls. In this release ScyllaDB includes a new guardrail on the replication factor. It is now possible to specify the minimum replication factor for new keyspaces via a new configuration item #8891.

This matches the equivalent functionality in Apache Cassandra. #CASSANDRA-14557

The new RF guardrails include the following configuration:

  • minimum_replication_factor_fail_threshold. Default: -1 (disabled).
  • minimum_replication_factor_warn_threshold. Default: 3.
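The interaction of the two thresholds can be sketched in Python as follows. This is an illustration only, not ScyllaDB's implementation: the function and enum names are hypothetical, and the sketch assumes that a negative threshold disables the check and that a replication factor strictly below a threshold triggers it.

```python
from enum import Enum


class Verdict(Enum):
    OK = "ok"
    WARN = "warn"
    FAIL = "fail"


def check_min_rf(rf, warn_threshold=3, fail_threshold=-1):
    """Classify a keyspace replication factor against the minimum-RF guardrails.

    Assumptions: a negative threshold means "disabled", and the check
    triggers when rf is strictly below the threshold.
    """
    if fail_threshold >= 0 and rf < fail_threshold:
        return Verdict.FAIL  # keyspace creation would be rejected
    if warn_threshold >= 0 and rf < warn_threshold:
        return Verdict.WARN  # keyspace is created, with a warning
    return Verdict.OK
```

Under these assumptions and the defaults listed above, a keyspace with a replication factor of 1 would be created with a warning, and nothing would be rejected until the fail threshold is set to a non-negative value.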

More Guardrails are expected in upcoming releases.

Security

  • It is now possible to use TLS certificates to authenticate and authorize a user to ScyllaDB. The system can be configured to derive the user role from the client certificate and derive the user's permissions from that role. #10099 See the certificate-authentication docs for more.
  • It is now possible to specify the initial superuser name and (salted) password in the scylla.yaml config or on the command line. Note that these config values become redundant as soon as the auth tables are initialized. See the two new config parameters, auth_superuser_name and auth_superuser_salted_password, below.

Deprecated and removed features

  • As part of the CQLSH refactoring (see above), Python 2 is no longer required by ScyllaDB.
  • DateTieredCompactionStrategy was removed. Users should move to TimeWindowCompactionStrategy. It is already illegal to create new DateTieredCompactionStrategy tables. If you are still using DTCS, migrate to TWCS before upgrading!

Deployment and install

  • ScyllaDB is officially supported on Rocky / RHEL 9.
  • RHEL / CentOS 7 support is deprecated.
  • ScyllaDB installation will now tune the OS core dump service to allow a longer time to dump cores. This is necessary since ScyllaDB allocates all memory and therefore takes longer to dump core if an error is encountered. #5430
  • The installer now wipes filesystem signatures from the individual disks making up a RAID array, preventing problems with reuse of disks. #13737
  • We now tune the Linux kernel’s caching of inodes (in-memory structure representing file metadata) to favor evicting inodes quickly. This aims to reduce kernel memory fragmentation when there are large numbers of sstables, as most files comprising an sstable aren’t accessed after the process starts.
  • The bundled Prometheus node_exporter has been updated to version 1.6.1.

UDF / UDA - Preview

Wasm-based User Defined Functions (UDFs) and User Defined Aggregates (UDAs) were introduced as experimental in ScyllaDB 5.1, and we are now promoting them to Preview.

The CQL syntax is compatible with Apache Cassandra. Examples:

CREATE FUNCTION sample ( arg int ) …;

CREATE FUNCTION sample ( arg text ) …;

See UDF/UDA documentation for more information.

Preview features aren’t production-ready, but are made available on a “preview” basis so that users can get early access and provide feedback. Unlike Experimental features, we are committed to the backward compatibility of a preview feature API.

Below are improvements and bug fixes in UDF/UDA.

  • The check for whether a User Defined Function (UDF) is in use by a User Defined Aggregate (UDA) before dropping it has been improved.
  • An issue when a User Defined Function (UDF) that is used in a User Defined Aggregate (UDA) is updated has been fixed. The UDA now reflects the changes in the modified UDF immediately.
  • An issue when using the counter data type in a WebAssembly UDF has been corrected.
  • Automatically parallelized aggregation queries on columns of COUNTER type have been fixed. #12939
  • When defining a user-defined function, one has to specify the language in the CREATE FUNCTION statement. The language name for WebAssembly functions was renamed from “xwasm” to “wasm” in preparation for moving it out of experimental status.
  • User-defined functions can now have permission control via CQL GRANT statements. #5572
  • When a WebAssembly user-defined function is compiled, it is now compiled in a separate thread in order to avoid stalling the reactor and inducing high latency. #12904
  • The WebAssembly documentation is now user-visible, as part of making it ready for production.
  • WebAssembly just-in-time compilation moved to a separate thread (see above). The thread now blocks all signals, preventing incorrect signal handling when it was randomly picked to handle a signal. #13228
  • User-defined functions gained permission support (see above). We now additionally check that a function is not a builtin before applying permissions.
  • WebAssembly functions are translated to machine code in a separate thread to reduce impact on the running system. The thread now runs with lower priority to reduce impact further.
  • The build system now automatically compiles Rust and C++ user-defined functions into WebAssembly. Previously, the compiled wasm text was inlined into individual tests.
  • User defined function permissions are now dropped when the keyspace containing them is dropped. #13820
  • Permissions are now checked for a user-defined aggregate (UDA) that uses user-defined functions (UDFs). #13818
  • We now disallow the CREATE permission on a named user defined function (UDF). It is meaningless since a named function must already have been created. #13822
  • The permissions granted when a keyspace or function is created have been adjusted. #13747
  • The WebAssembly runtime used to evaluate user defined functions has been updated to address vulnerabilities. #13807

New nodetool implementation - Experimental

The scylla executable can now act as nodetool by executing “scylla nodetool ”.

Over time, the new scylla nodetool will completely replace the legacy Java nodetool. The new implementation is fully backward compatible with the legacy nodetool.

The following commands are currently implemented:

  • compact
  • disablebackup
  • disablebinary
  • disablegossip
  • enablebackup
  • enablebinary
  • enablegossip
  • gettraceprobability
  • help
  • settraceprobability
  • statusbackup
  • statusbinary
  • statusgossip
  • version

For the latest status of the nodetool replacement, see #15588.

Object Storage Support - Experimental

You can store an entire ScyllaDB keyspace on Amazon S3 or a compatible object store.

Enable the feature by:

  1. Enabling the keyspace-storage-options experimental feature:

--experimental-features=keyspace-storage-options

  2. Setting up the S3 endpoint parameter and credentials
  3. Creating a keyspace with the S3 storage option:

CREATE KEYSPACE with STORAGE = { 'type': 'S3', 'endpoint': '$endpoint_name', 'bucket': '$bucket' }

See Keyspace Storage Options docs.
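As a rough sketch of what a complete statement might look like, the Python helper below assembles a CREATE KEYSPACE statement with the S3 storage option. The helper names, the keyspace name, and the replication clause (NetworkTopologyStrategy with a replication_factor of 3) are assumptions added for illustration and do not come from this announcement; check the Keyspace Storage Options docs for the exact syntax.

```python
import re


def cql_string(value):
    """Render a CQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def s3_keyspace_statement(keyspace, endpoint, bucket, replication_factor=3):
    """Build a CREATE KEYSPACE statement using the S3 storage option.

    Hypothetical helper: the replication clause is an assumption,
    not taken from the release notes.
    """
    # Accept only simple unquoted identifiers to avoid injecting CQL.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", keyspace):
        raise ValueError("unsupported keyspace name: %r" % keyspace)
    replication = (
        "{ 'class': 'NetworkTopologyStrategy', 'replication_factor': %d }"
        % replication_factor
    )
    storage = "{ 'type': 'S3', 'endpoint': %s, 'bucket': %s }" % (
        cql_string(endpoint),
        cql_string(bucket),
    )
    return "CREATE KEYSPACE %s WITH REPLICATION = %s AND STORAGE = %s;" % (
        keyspace,
        replication,
        storage,
    )


# Hypothetical endpoint and bucket names.
statement = s3_keyspace_statement("ks_on_s3", "s3.example.com", "my-bucket")
```

The resulting string can then be executed with cqlsh or any CQL driver against a node that has the experimental feature enabled.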

Note: at this stage, object storage is not ready for production and should be used for testing only.

Note 2: there is no ALTER support for the STORAGE parameter.


Much more in Part 2.