[RELEASE] ScyllaDB Enterprise 2024.1.5

The ScyllaDB team announces ScyllaDB Enterprise 2024.1.5, a production-ready bug-fix patch release for the ScyllaDB Enterprise 2024.1 LTS release.

Improvements in this release:

  • Alternator performance improvements.

    Various optimizations improved Alternator throughput by up to 35%, depending on the workload.

  • This release adds Workload Prioritization to Alternator, ScyllaDB’s Amazon DynamoDB-compatible API. To use Workload Prioritization, enable alternator_enforce_authorization in the configuration. See the Workload Prioritization documentation for details.

  • Repair: Introduce small table optimization #15974 #16011

    On a cluster with multiple datacenters and high latency between them, repairing small tables, such as system_auth, can take an unreasonably long time.

    The root cause is token-range-based repair, which is useful for large tables but generates huge overhead when repairing small tables. This optimization fixes the issue by avoiding token-range-based repair for small tables.
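To see why per-range repair hurts small tables, consider a rough back-of-the-envelope model. The sketch below is an illustration only, not ScyllaDB's repair algorithm: it assumes each repaired unit costs a fixed number of cross-datacenter round trips, and the range count, round-trip count, and latency are all hypothetical values.

```python
def range_based_repair_seconds(num_ranges: int, round_trips: int, rtt_seconds: float) -> float:
    """Lower bound on repair time when every token range is repaired separately."""
    return num_ranges * round_trips * rtt_seconds


def single_pass_repair_seconds(round_trips: int, rtt_seconds: float) -> float:
    """Lower bound on repair time when the whole small table is repaired in one pass."""
    return round_trips * rtt_seconds


# Hypothetical numbers, chosen only for illustration.
NUM_RANGES = 256 * 6   # assumed: 256 token ranges on each of 6 nodes
ROUND_TRIPS = 4        # assumed: cross-DC message exchanges per repaired unit
RTT_SECONDS = 0.2      # assumed: 200 ms latency between datacenters

per_range = range_based_repair_seconds(NUM_RANGES, ROUND_TRIPS, RTT_SECONDS)
single_pass = single_pass_repair_seconds(ROUND_TRIPS, RTT_SECONDS)
print(f"range-based: {per_range:.1f} s, single pass: {single_pass:.1f} s")
```

In this model the range-based cost depends only on the number of ranges and the latency, not on how much data the table holds, which is why a nearly empty table can still take a long time to repair.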


The following issues are fixed in this release (with an open-source reference, if available):

Alternator

  • Alternator: converting large items to text representation will stall #18806

Tracing

  • Security: A new log message prints out the Audit status for the cluster.
  • Tracing: The ability to determine which shard owns a partition listed in system.large_partition by analyzing the corresponding SSTable ID broke with the move to UUID-based SSTable numbering #18381. Instead, use scylla sstable shard-of to find the shards that own the specified SSTables. #16343 #18440
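As a minimal sketch of scripting the shard lookup, the helper below only assembles the scylla sstable shard-of command line named above for a list of SSTable files. The function name and the example path are hypothetical, and any options the tool may require are deliberately left out; check the tool's own help output before running it.

```python
import shlex
from typing import List, Sequence


def shard_of_command(sstable_paths: Sequence[str]) -> List[str]:
    """Build the argument list for `scylla sstable shard-of` (hypothetical helper)."""
    if not sstable_paths:
        raise ValueError("at least one SSTable path is required")
    # Only the subcommand named in the release note is used; no flags are assumed.
    return ["scylla", "sstable", "shard-of", *sstable_paths]


# Hypothetical path, for illustration only.
cmd = shard_of_command(["/var/lib/scylla/data/ks/tbl/example-Data.db"])
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run on a node where the scylla binary is installed.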

Monitoring

  • Monitoring: Scylla 2024.1 enables per-table metrics by default at the per-instance level. Some of these metrics, such as cache_hit_rate, latency, and live_disk_space, were not reported and were therefore missing from the Monitoring Table Dashboard. #18642
  • Alternator: Alternator latency metrics now use summaries. Alternator used to report multiple histograms for operation latencies; it now reports a histogram per node and a summary per shard. #12230. To show this Alternator metric, use Scylla Monitoring 4.7.2 or later.
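The difference between a per-node histogram and a per-shard summary can be sketched in a few lines. This is a conceptual illustration under stated assumptions, not ScyllaDB's metrics code: the bucket bounds, the sample latencies, and the nearest-rank quantile method are all hypothetical choices.

```python
import math
from bisect import bisect_left

BUCKET_BOUNDS_MS = [1, 2, 5, 10, 50]  # assumed bucket upper bounds, in milliseconds


def bucket_counts(samples, bounds):
    """Count samples per bucket; the last slot holds values above every bound."""
    counts = [0] * (len(bounds) + 1)
    for value in samples:
        counts[bisect_left(bounds, value)] += 1
    return counts


def quantile(samples, q):
    """Nearest-rank quantile of a non-empty sample list."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


# Hypothetical latency samples (ms) for two shards on one node.
shard_latencies_ms = {0: [0.8, 1.5, 3.0, 4.0], 1: [0.9, 7.0, 20.0, 60.0]}

# Summary per shard: a couple of precomputed quantiles each.
summaries = {
    shard: {"p50": quantile(samples, 0.5), "p95": quantile(samples, 0.95)}
    for shard, samples in shard_latencies_ms.items()
}

# Histogram per node: bucket counts are additive, so shards can be summed.
per_shard = [bucket_counts(s, BUCKET_BOUNDS_MS) for s in shard_latencies_ms.values()]
node_histogram = [sum(column) for column in zip(*per_shard)]

print(summaries)
print(node_histogram)
```

Bucket counts can be added across shards, while quantiles cannot, so keeping the full histogram only at node level and a compact summary per shard reduces the number of reported series without losing the aggregated view.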

Performance

  • Performance: In tables with many small partitions (or many partition tombstones), sstable index pages can contain many entries. They are now destroyed gently to avoid stalls. #17605
  • repair: add control for repair percentage for partition count estimation #18615. The new parameter is repair_partition_count_estimation_ratio. The default 10% has not changed.

Stability

  • Stability: clustering_range intersection() can cause an infinite loop #18688
  • Stability: direct failure detector: make ping timeout configurable and increase the default, from 300 to 600 ms #16607. Failure detector is an internal inter-node mechanism.
  • Stability: fromJson() or INSERT JSON fails to set a map<timeuuid, int> #18477
  • Stability: handle_paxos_accept() fails to record a trace message when done with handling #18725
  • Stability: mutation_fragment_stream_validating_filter doesn’t respect validation_level::none #18662. The issue might cause false-positive validation errors during repair/streaming, leading to aborting the operation.
  • Stability: Scylla crash when reading from mutation fragments with token() filtering #18637
  • Stability: utils::chunked_vector fill constructor is exception unsafe #18635
  • Stability: Wrong exception is printed in build step exception handling #18423

Bloom Filter

  • Performance: ScyllaDB estimates a compaction’s partition count in order to size the Bloom filter correctly. The estimate is now more accurate for garbage-collection SSTables. #18283
  • bloom-filter: default value for components_memory_reclaim_threshold is too strict #18607. Updated from 0.1 to 0.2 in the fix.
  • Stability: ScyllaDB drops Bloom filters when they use up too much space. It will now reload them when space is available again (for example due to compaction). #18186
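The link between the partition-count estimate and Bloom filter quality can be illustrated with the textbook Bloom filter formulas. This is a sketch under assumptions, not ScyllaDB's sizing code: the target false-positive rate and the partition counts are hypothetical, and the real implementation may size its filters differently.

```python
import math


def bloom_bits(expected_items: int, target_fp_rate: float) -> int:
    """Textbook optimal bit count: m = -n * ln(p) / (ln 2)^2."""
    return math.ceil(-expected_items * math.log(target_fp_rate) / (math.log(2) ** 2))


def optimal_hash_count(bits: int, expected_items: int) -> int:
    """Textbook optimal number of hash functions: k = (m / n) * ln 2."""
    return max(1, round(bits / expected_items * math.log(2)))


def false_positive_rate(bits: int, actual_items: int, hashes: int) -> float:
    """Approximate false-positive rate: (1 - e^(-k * n / m))^k."""
    return (1.0 - math.exp(-hashes * actual_items / bits)) ** hashes


# Hypothetical values, for illustration only.
ESTIMATED_PARTITIONS = 1_000_000
TARGET_FP_RATE = 0.01

bits = bloom_bits(ESTIMATED_PARTITIONS, TARGET_FP_RATE)
hashes = optimal_hash_count(bits, ESTIMATED_PARTITIONS)

fp_accurate = false_positive_rate(bits, ESTIMATED_PARTITIONS, hashes)
fp_underestimated = false_positive_rate(bits, 2 * ESTIMATED_PARTITIONS, hashes)

print(f"bits={bits}, hashes={hashes}")
print(f"estimate correct: {fp_accurate:.4f}, true count twice the estimate: {fp_underestimated:.4f}")
```

If the estimate is too low, the filter is undersized and its false-positive rate climbs far above the target; if it is too high, memory is wasted, which is the pressure the reclaim threshold above responds to.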

Other fixed issues

  • Image: While dropping openssh-server from the ScyllaDB image, other system packages used by ScyllaDB scripts were also dropped. #17787
  • Config: default task_ttl_in_seconds is 0, but scylla.yaml changes the value to 10. #16714
  • Tools: Update tools/cqlsh submodule to v6.0.17 #18652
  • CQL: The algorithm for picking an index for a request is not always deterministic #7969