[RELEASE] ScyllaDB Enterprise 2023.1.3

The ScyllaDB team announces ScyllaDB Enterprise 2023.1.3, a production-ready bug-fix patch release for the ScyllaDB Enterprise 2023.1 LTS release.

The 2023.1.3 patch release includes multiple minor bug fixes.

You are encouraged to upgrade to it in coordination with the ScyllaDB Support team.

The following issues are fixed in this release (with an open-source reference, if available):

  • CQL: the toJson function used the wrong format for the timestamp data type. The JSON format is now Apache Cassandra compatible. #14518, #7997
  • CQL: Reject LIMIT without any value #14705
  • CQL: fromJson() failed to set a map<ascii, int> value because the key was not parsed correctly from its JSON representation. #7949
  • Correctness: a rare combination of reconciliation (read repair) with reverse queries and range tombstones could cause queries to return incorrect data #10598
  • Correctness: Row cache updates do not provide strong exception safety guarantees. In a very rare case, when using CL=1, the cache might return a stale value #15576
  • ADMIN REST API: GET /storage_service/native_transport should only return true after the server’s been started #12376
  • Offline Installer: failed with “implausibly old time stamp” error #13415
  • Install: off-line (air gapped) install on OEL7.6 failed
  • Stability: ICS cross-tier tombstone compaction can be delayed indefinitely and doesn’t respect ‘tombstone_compaction_interval’
  • Stability: a Seastar bug can cause an aborted connection to loop, consuming 100% CPU. Among others, the issue can manifest in LDAP and nodetool drain. #12774, #7753
  • Stability: assigning position_in_partition is not exception safe, can lead to incorrect data during memory stress #15822
  • Stability: Exceptional future ignored in Task manager: seastar::gate_closed_exception #15211
  • Stability: failure detector apis need to call gossiper on shard 0 #15816
  • Stability: ScyllaDB contains two classes of tables, system and user, and uses separate memory pools for their memtables. This avoids a deadlock when a user memtable is being flushed, and needs to allocate memtable space for a system table as part of the flush process. We now automatically designate all system tables as using the system memtable pool. #14529
  • Stability: migration_manager: schema version correctness depends on order of feature enabling #16004
  • Stability: nodetool enablebinary starts the CQL server in the streaming group, instead of statement group #15485
  • Stability: nodetool resetlocalschema should recalculate per-table schema digest #15380. With this change one can fix issues in Schema digest, like #4485, without a rolling restart.
  • Stability: Overloading scylla with materialized view writes can lead to deadlock #15844
  • Stability: A rare crash when a SERVICE LEVEL is dropped #15534
  • Install: scylla_post_install.sh: “[ $RHEL ]” does not work for RHEL, it only detects CentOS #16040
  • Stability: A regression causing a crash on table drop #15097
  • Stability: reader_concurrency_semaphore: execution loop can stall #13540
  • Stability: a race condition when encrypting the commit log on disk might cause an unexpected exit.
  • Stability: Async functions yield while traversing the gossiper endpoint state map #13899
  • Stability: compound_view::explode() uses bad format string #14577
  • Stability: The “forward” service is responsible for execution of automatically parallelized aggregation queries. forward_service retries may block shutdown. #12604
  • Stability: hints: send_one_hint: file_send_gate holder is released too early #15110
  • Stability: maybe_fix_legacy_secondary_index_mv_schema() could reference dangling pointer #13720
  • Stability: row_cache::row_cache() isn’t exception-safe #15632
  • Stability: RPC wrappers, pass arguments by const ref #12504
  • Stability: A regression in IPv6 address formatting, which caused nodetool problems, such as breaking when there is an Alternator GSI in the database #16153, or causing a node to be stuck with “?U” status and a Host ID of “null” #16039
  • Stability: the sstable parser is now able to detect more types of corruption involving premature end-of-file with compressed sstables. #13599
  • Stability: the USE statement might throw an exception instead of returning an exceptional future, for example if the keyspace doesn’t exist #14449
  • Tooling: Old version of node_exporter (1.6.1), updated to 1.7.0 #16085
  • Tooling: update Java tooling, including new versions for
  • Tooling: tools/scylla-sstable: dump column_desc as an object instead of list of string, make it more readable #15036
  • Performance: Off-strategy compaction improves compaction for run-based compaction strategies reducing temporary storage requirements. #14992.
  • Performance: Optimization for avoiding bloom filter during compaction was reverted by #14091
  • Performance: Repair-based node operations (RBNO) write SSTables with useless filters as a result of not feeding correct key estimation #15748. In 2023.1 only the replace-node operation uses RBNO by default.
  • Performance: High latency in repair as part of RBNO bootstrap #15505. In 5.2, RBNO is used by default for the replace-node operation. The issue is a regression since 2023.1.0.
  • Performance: Repairing a cluster after a restore causes severe reactor stalls throughout the cluster (due to expensive logging within do_repair_ranges() without yield) #14330
  • UX: Misformatted printout of column name in LWT error message #13657
  • UX: remove unnecessary warning: “sstable - Could not remove table directory …/snapshots” … Directory not empty #13538
  • Monitoring: metric name issue: scylla_database_reads_memory_consumption shows free memory for the system semaphore #13810
  • EaR: A few fixes to cluster wide Encryption at Rest added in 2023.1.2.
  • Build: Docker image: upgrade all 3rd-party packages on creation #16222
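
The ADMIN REST API fix listed above (#12376) means GET /storage_service/native_transport can be used as a readiness check for the CQL server. Below is a minimal Python sketch of such a check, not an official tool. It assumes the REST API is reachable at localhost:10000 and that the endpoint returns a JSON boolean. The helper names (fetch_native_transport, wait_for_native_transport) and the retry parameters are hypothetical and chosen only for illustration.

```python
import json
import time
import urllib.request

# Assumption: the ScyllaDB REST API listens on localhost:10000.
API_URL = "http://localhost:10000/storage_service/native_transport"


def fetch_native_transport(url=API_URL, timeout=5.0):
    """Return the endpoint's parsed JSON body (assumed to be a boolean)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_native_transport(fetch=fetch_native_transport, attempts=30,
                              delay=1.0, sleep=time.sleep):
    """Poll until the endpoint reports true; return False if it never does."""
    for attempt in range(attempts):
        try:
            if fetch() is True:
                return True
        except (OSError, ValueError):
            # Connection refused, timeout, or a non-JSON body:
            # treat the node as not ready yet and retry.
            pass
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling wait_for_native_transport() with the defaults polls once per second for up to 30 attempts. The fetch and sleep parameters are injectable so the loop can be exercised without a running node.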