ScyllaDB and Large Partitions

Are large partitions still an issue, and if so, how can I deal with them?
What is the maximal partition size?

Large partitions are an anti-pattern in Scylla - one should aim to have data spread evenly among the partitions in the cluster.
The presence of large partitions creates issues such as higher latency on the shard that owns them (because it becomes a hot spot for data) or oversized allocation warnings/errors in the logs.
In any case, you can use Scylla's system.large_partitions table (doc) to help you track them, along with the log lines that report their occurrence.
Check this Scylla University lesson covering this topic.
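
As a rough illustration of tracking them, here is a minimal Python sketch that reads system.large_partitions and returns the biggest entries first. It assumes a session object with an execute(query) method whose rows allow attribute access, like the one you get from the Python cassandra-driver; the helper name largest_partitions and the limit parameter are made up for this example. The system keyspace is local to each node, so you would likely need to run this against every node.

```python
# Hedged sketch: list the largest partitions recorded on one node.
# Assumption: `session` exposes execute(query) and yields rows with
# attribute access (e.g. a cassandra-driver Session).

QUERY = (
    "SELECT keyspace_name, table_name, partition_key, partition_size "
    "FROM system.large_partitions"
)


def largest_partitions(session, limit=10):
    """Return up to `limit` rows, largest partition_size first."""
    rows = list(session.execute(QUERY))
    # Sort on the client side; the table holds few rows in practice.
    rows.sort(key=lambda row: row.partition_size, reverse=True)
    return rows[:limit]
```

With cassandra-driver, a session can be obtained with something like Cluster(["127.0.0.1"]).connect(), where the contact address is a placeholder for one of your nodes.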

As for how to avoid them in the first place, plan your data model so that data ends up distributed across many partitions, i.e. choose a partition key with high cardinality. Check this blog post for some ideas around that.
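
To make the cardinality point concrete, here is a small Python sketch of one common approach: adding a time bucket to the partition key. This is only an illustration and not necessarily what the linked blog post recommends; the sensor_id key, the per-day bucket, and the function names are all hypothetical.

```python
# Hedged sketch: compare how many rows land in each partition when the
# partition key is the sensor id alone vs. sensor id plus a day bucket.
from collections import Counter
from datetime import datetime, timedelta


def plain_key(sensor_id, ts):
    """Partition by sensor only: every reading goes to one partition."""
    return (sensor_id,)


def bucketed_key(sensor_id, ts):
    """Partition by sensor and calendar day (hypothetical bucket size)."""
    return (sensor_id, ts.strftime("%Y-%m-%d"))


def rows_per_partition(events, key_fn):
    """Count rows per partition key for the given key function."""
    return Counter(key_fn(sensor_id, ts) for sensor_id, ts in events)


# One reading per hour for three days from a single sensor.
start = datetime(2024, 1, 1)
events = [("sensor-1", start + timedelta(hours=h)) for h in range(72)]

print(rows_per_partition(events, plain_key))     # one partition with 72 rows
print(rows_per_partition(events, bucketed_key))  # three partitions, 24 rows each
```

The bucket size is a trade-off: smaller buckets keep partitions small, but queries spanning a long time range then have to read more partitions.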
