Last week in scylladb.git master (issue #185; 2023-06-25)

This short report brings to light some interesting commits to scylladb.git master from the last week. Commits in the b7627085cb…be5b61b870 range are covered.

There were 162 non-merge commits from 20 authors in that period. Some notable commits:

The scylla sstable tool now supports the scrub operation, enabling offline (and off-node) scrubbing of sstables.

Recently, the way schema changes are applied to data in the row cache was changed: entries are now upgraded at row granularity instead of partition granularity, to prevent stalls when large partitions are cached. One place could still use an outdated schema, which could cause a crash. This is now fixed.

The S3 storage back-end now limits the number of open connections, in order to avoid S3 dropping connections on overload.
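The idea of bounding open connections can be sketched with a counting semaphore. This is only an illustration in Python, not ScyllaDB's implementation; the LimitedClient class, the fake_get helper, and the limit of 4 are hypothetical names and values chosen for the example.

```python
import asyncio


class LimitedClient:
    """Hypothetical client that allows at most max_connections requests at once."""

    def __init__(self, max_connections: int):
        self._slots = asyncio.Semaphore(max_connections)
        self.in_flight = 0   # requests currently holding a slot
        self.peak = 0        # highest concurrency observed

    async def request(self, work):
        # Wait for a free slot instead of opening yet another connection.
        async with self._slots:
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
            try:
                return await work()
            finally:
                self.in_flight -= 1


async def demo(n_requests: int = 20, max_connections: int = 4):
    client = LimitedClient(max_connections)

    async def fake_get():
        # Stand-in for a real S3 request.
        await asyncio.sleep(0.01)
        return "ok"

    results = await asyncio.gather(
        *(client.request(fake_get) for _ in range(n_requests))
    )
    return client.peak, results
```

Running asyncio.run(demo()) issues 20 requests, but no more than 4 are in flight at any moment; the rest queue on the semaphore rather than adding load on the remote side.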

An edge case where querying a datacenter that has replication factor equal to zero could lead to a crash has been fixed.

The documentation URL has been changed to

When performing the last-write-wins rule comparison, if the timestamp of the two versions being compared was equal, ScyllaDB first compared the cell value and then the expiration time (TTL). This is compatible with earlier versions of Cassandra. However, this could cause a NULL value to appear if the cell was overwritten with the same timestamp but a different TTL. The algorithm was changed to compare the cell value last, and check all the other metadata first, resulting in fewer surprising results. It is also compatible with current Cassandra versions.
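The change in comparison order can be sketched as follows. This is a simplified Python model, not ScyllaDB's code: the Cell type is hypothetical, tombstones are left out, and the specific tie-break on expiration (an expiring cell outranks a non-expiring one, then the later expiration wins) is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    timestamp: int            # write timestamp
    value: bytes
    expiry: Optional[int]     # absolute expiration time; None = never expires


def _expiry_key(cell: Cell) -> tuple:
    # Assumption for this sketch: expiring cells outrank non-expiring ones,
    # and among expiring cells the later expiration wins.
    return (0, 0) if cell.expiry is None else (1, cell.expiry)


def old_winner(a: Cell, b: Cell) -> Cell:
    # Old order: timestamp, then value, then expiration.
    return max(a, b, key=lambda c: (c.timestamp, c.value, _expiry_key(c)))


def new_winner(a: Cell, b: Cell) -> Cell:
    # New order: timestamp, then expiration metadata, and the value last.
    return max(a, b, key=lambda c: (c.timestamp, _expiry_key(c), c.value))
```

For two writes with the same timestamp, Cell(10, b"b", None) and Cell(10, b"a", 100), old_winner picks the first because its value compares higher, while new_winner picks the second because the expiration metadata is consulted before the value.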

Bugs preventing a node from starting when using the new raft-based topology mechanism have been fixed.

Tablets are a new, experimental replication model in ScyllaDB, contrasting with vnodes. Tablets are now restricted to a single shard, unlike vnodes which span all shards on a node. This simplifies how tablets are stored in sstables and how tablets can be migrated to other nodes.

SSTable compression can be configured with a chunk size, with larger chunks trading less efficient I/O and higher latency for higher compression ratios. The chunk size is now capped at 128 kB, to avoid running out of memory.
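To make the trade-off concrete, here is a rough Python sketch. It assumes that reading one small row means decompressing the whole chunk that contains it, that 1 kB is 1024 bytes, and that an oversized setting is rejected rather than silently clamped; the function names are hypothetical.

```python
MAX_CHUNK_KB = 128  # the cap described above


def check_chunk_length_kb(chunk_kb: int) -> int:
    """Validate a chunk size in kB (rejecting oversized values is an assumption)."""
    if chunk_kb <= 0:
        raise ValueError("chunk size must be positive")
    if chunk_kb > MAX_CHUNK_KB:
        raise ValueError(f"chunk size {chunk_kb} kB exceeds {MAX_CHUNK_KB} kB")
    return chunk_kb


def read_amplification(chunk_kb: int, row_bytes: int) -> float:
    """Bytes decompressed per byte wanted when a single row is read."""
    return (check_chunk_length_kb(chunk_kb) * 1024) / row_bytes
```

With a 512-byte row, a 4 kB chunk gives read_amplification(4, 512) == 8.0, while a 128 kB chunk gives 256.0, which illustrates why larger chunks cost more I/O and latency per small read.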

When a node is decommissioned or forcibly removed, Raft will now ban it from communicating with the cluster, to prevent the removed node from affecting the cluster.

ScyllaDB can automatically parallelize certain aggregation queries. The mechanism, however, had a bug when aggregating columns with case-sensitive names. This is now fixed.

A crash that occurred when DESCRIBE FUNCTION or DESCRIBE AGGREGATE was used on the wrong function type has been fixed.

See you in the next issue of last week in scylladb.git master!