Originally from the User Slack
@Yukkuri: Hi, in case there is no natural information available to use as a partition key, am I understanding correctly that hashing clustering keys and dividing the hash space into artificial partition keys is the right approach?
When write loads can vary drastically from one application instance to another (from one row every few weeks to hundreds of rows per second), dynamic hash space partitioning may also be required. In my use case those insert rates can also gradually grow for any application instance over time (but probably never substantially decrease; they just hold a level for a while).
My current approach is to hold some partitioning generation counters, and whenever load on an application instance exceeds a certain threshold, introduce a new generation with more hash space partitions: copy all the data so it is redistributed according to the new partitioning scheme, send all write operations issued while redistribution is in progress to both the old and new generations, and once redistribution is complete, range-delete the old generation partitions.
Basically the primary key is `((domain, generation, hash_bucket), apid)`, where `domain` (the origin of those `apid`s) can be derived from the `apid`, `apid` is hashed into a 64-bit value `H`, and `hash_bucket` is determined using `H / (H_max / 2^generation)`.
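A minimal sketch of that bucket computation (Python; the hash function here is just an illustrative stand-in for whatever 64-bit hash is actually used):

```python
import hashlib

H_MAX = 2 ** 64  # size of the 64-bit hash space


def apid_hash(apid: str) -> int:
    """Hash an apid into a 64-bit value H (here: top 8 bytes of SHA-256, as an example)."""
    return int.from_bytes(hashlib.sha256(apid.encode()).digest()[:8], "big")


def hash_bucket(apid: str, generation: int) -> int:
    """Map H onto one of 2**generation equal slices of the hash space: H / (H_max / 2**generation)."""
    return apid_hash(apid) // (H_MAX // (2 ** generation))


# Generation 0 has a single bucket covering the whole hash space;
# each further generation doubles the number of buckets.
print(hash_bucket("https://example.com/object/foobar", 0))  # always 0
print(hash_bucket("https://example.com/object/foobar", 1))  # 0 or 1
```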
My questions are:
• Is periodically querying `system.large_partitions` on every node a reliable/reasonable way to detect partitions growing beyond the current generation's bounds, or would it be preferable to maintain my own metric of hash bucket fill per `domain` per `generation`?
• Is there anything to be aware of, maybe other approaches or potential improvements?
@Felipe_Cardeneti_Mendes: yeap, `large_partitions` is quite effective. Note it is NOT immediate however. `large_partitions` are populated as part of compactions. `large_rows` and other `large_*` tables may be relevant for you.
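For reference, a minimal sketch of the periodic check being discussed, using the Python driver (the `system.large_partitions` column names and the threshold are assumptions, not verified against a particular ScyllaDB version; the table is node-local, so the check would run against every node):

```python
from cassandra.cluster import Cluster

# Connect to one node; system.large_partitions is populated per node
# (during compactions), so in practice this check is repeated for every node.
cluster = Cluster(["127.0.0.1"])
session = cluster.connect("system")

SIZE_THRESHOLD = 100 * 1024 * 1024  # e.g. flag partitions above 100 MiB (assumed policy)

# Assumed columns: keyspace_name, table_name, partition_key, partition_size.
rows = session.execute(
    "SELECT keyspace_name, table_name, partition_key, partition_size "
    "FROM large_partitions"
)

for row in rows:
    if row.partition_size > SIZE_THRESHOLD:
        # A partition above the threshold is a signal to bump the generation
        # for the affected domain and start redistribution.
        print(row.keyspace_name, row.table_name, row.partition_key, row.partition_size)
```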
I think the only aspect I don’t understand is why you would copy data around from one place to another. Why don’t you simply keep track of your generations/buckets under a separate table?
@Yukkuri: > yeap, `large_partitions` is quite effective. Note it is NOT immediate however. `large_partitions` are populated as part of compactions. `large_rows` and other `large_*` tables may be relevant for you.
Thanks.
> Why don’t you simply keep track of your generations/buckets under a separate table?
That would require keeping track of every `apid` from past generations there, which could end up being subject to its own partitioning pressure; there is no way to know an `apid`'s generation without a full scan of that accounting table.
Say there is apid `https://example.com/object/foobar`; under the current generation 0, which allows only one hash bucket (the whole hash space), its `hash_bucket` is 0.
Now if we increase the generation to 1, allowing 2 partitions of the hash space, its `hash_bucket` may end up being either 0 or 1 depending on the chosen hash function, despite having the exact same hash value. To account for this, we must copy the row `(('example.com', 0, 0), 'https://example.com/object/foobar')` to `(('example.com', 1, $new_hash_bucket), 'https://example.com/object/foobar')` so we can continue relying on hash space partitioning to unambiguously map `apid` hash values to partitioning buckets. That requires an extra dance around current/pending generations to keep both in sync while migration is in progress, but otherwise it seems to work.
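A rough sketch of that copy-and-dual-write dance, assuming a hypothetical table `objects_by_bucket (domain, generation, hash_bucket, apid)` and reusing `hash_bucket()` from the earlier sketch; none of these names come from the actual schema:

```python
from typing import Optional

from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])
session = cluster.connect("myks")  # hypothetical keyspace


def write_apid(domain: str, apid: str, current_gen: int, pending_gen: Optional[int]) -> None:
    """While redistribution is in progress, write to both the current and pending generations."""
    for gen in (g for g in (current_gen, pending_gen) if g is not None):
        session.execute(
            "INSERT INTO objects_by_bucket (domain, generation, hash_bucket, apid) "
            "VALUES (%s, %s, %s, %s)",
            (domain, gen, hash_bucket(apid, gen), apid),
        )


def migrate_bucket(domain: str, old_gen: int, old_bucket: int, new_gen: int) -> None:
    """Copy every apid from one old-generation bucket into its new-generation bucket."""
    rows = session.execute(
        "SELECT apid FROM objects_by_bucket "
        "WHERE domain = %s AND generation = %s AND hash_bucket = %s",
        (domain, old_gen, old_bucket),
    )
    for row in rows:
        session.execute(
            "INSERT INTO objects_by_bucket (domain, generation, hash_bucket, apid) "
            "VALUES (%s, %s, %s, %s)",
            (domain, new_gen, hash_bucket(row.apid, new_gen), row.apid),
        )
    # Once every bucket of old_gen is copied and reads have switched to new_gen,
    # each old partition can be removed, e.g.:
    #   DELETE FROM objects_by_bucket WHERE domain = ? AND generation = ? AND hash_bucket = ?
```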
Also one small change from what I described initially: it turned out to be beneficial to store the combined `generation`+`hash_bucket` value in a single column, since that also allows IN queries. In my case both values fit nicely into a 128-bit uuid.
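One way such packing could look; the bit layout here (generation in the high 64 bits, bucket in the low 64 bits) is only an assumption, the point is just that one 128-bit column makes IN queries convenient:

```python
import uuid


def bucket_id(generation: int, hash_bucket: int) -> uuid.UUID:
    """Pack generation and hash_bucket into one 128-bit value (layout is illustrative)."""
    return uuid.UUID(int=(generation << 64) | hash_bucket)


def unpack(bucket: uuid.UUID) -> tuple[int, int]:
    """Recover (generation, hash_bucket) from the packed value."""
    return bucket.int >> 64, bucket.int & ((1 << 64) - 1)


# With a single bucket column, all buckets of one generation can be listed
# and used in a single ... WHERE domain = ? AND bucket IN (...) query:
gen = 2
buckets = [bucket_id(gen, b) for b in range(2 ** gen)]
print(buckets)
```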
@Felipe_Cardeneti_Mendes: LGTM