What is the maximum number of records that a ScyllaDB table can hold?

Hello, guys!

If I put 1 trillion records in one table, with each record in a separate partition, is that a good schema design, and what problems will there be?
In other words, in a large-scale cluster, do we need to control the number of partitions for a single table?
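
For concreteness, this is the shape of schema I have in mind (the keyspace, table, and column names are just made up for the example):

```
-- Every row gets its own partition: the partition key is the whole primary key.
CREATE TABLE ks.events (
    event_id uuid,
    payload  text,
    PRIMARY KEY (event_id)
);
```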

When I have a lot of small partitions, does that mean read efficiency will be poor, for example because of problems building the Bloom filters? How should I tune table-level parameters, such as the compaction strategy, for good performance?
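
To make the question concrete, this is the kind of table-level knob I mean (the table name and value are just examples, not something I have settled on); the compaction settings would be tuned in the same way:

```
-- Example of a table-level option: is this the kind of thing worth tuning?
ALTER TABLE ks.events WITH bloom_filter_fp_chance = 0.01;
```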

ScyllaDB loves many partitions; I would even say that many small partitions are the sweet spot for ScyllaDB. This is mostly because having many partitions means that data is well distributed among the nodes and shards in the cluster. Uneven data distribution leads to hot shards and hot nodes, which slow the cluster down.

On the other hand, having partitions that are tiny can have some side effects. You may find that not as many of them fit in cache as you would expect, because there is a constant per-partition size overhead for partitions stored in memory; if that overhead is comparable to the size of a row itself, roughly half as many rows fit in cache as the raw data size alone would suggest. View update generation during repair, for tables that have materialized views or secondary indexes attached, can also get quite slow with many tiny partitions.

All that said, you should not see any problems with storing a huge number of small partitions in ScyllaDB. As I said above, this is the sweet spot for ScyllaDB, and there is no limit, theoretical or hard, on how many partitions you can have.

Choosing the compaction strategy is more a question of what workloads you have than of how your data is organized.
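
For example (purely illustrative, and assuming the standard strategy names), append-mostly time-based data is usually a match for TimeWindowCompactionStrategy, while overwrite-heavy, read-latency-sensitive data is usually a match for LeveledCompactionStrategy:

```
-- Illustrative only; pick based on the workload, not on partition count.
ALTER TABLE ks.metrics
    WITH compaction = {'class': 'TimeWindowCompactionStrategy',
                       'compaction_window_unit': 'DAYS',
                       'compaction_window_size': 1};

ALTER TABLE ks.profiles
    WITH compaction = {'class': 'LeveledCompactionStrategy'};
```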

Do we have any parameters to control this per-partition overhead?

No, this is just the overhead of the C++ objects involved in storing and organizing partitions. Nothing we can do about that.