I have been tasked with coming up with “some number” for how much memory it takes to store one record in Scylla (plus any related overhead). We had commands in Redis Cache that produced these metrics. Is that even possible in Scylla? I have scyllatop running, and also the monitoring stack.
One way to accomplish this is to create a million “average” records, run “nodetool flush” to make sure they get flushed to disk (the flush also merges them into the cache), then look at the cache metrics and divide the cache memory usage by the number of rows in cache.
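For example, here is a minimal sketch of the load step, assuming Python with the DataStax cassandra-driver (which also works against Scylla), a local node at 127.0.0.1, and a hypothetical “bench.records” table with a 1 KB blob standing in for your “average” record; adjust all of those to your workload:

```python
import os
from cassandra.cluster import Cluster
from cassandra.concurrent import execute_concurrent_with_args

cluster = Cluster(["127.0.0.1"])
session = cluster.connect()

# SimpleStrategy keeps the sketch short; use NetworkTopologyStrategy
# for a real multi-DC cluster.
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS bench
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")
session.execute("""
    CREATE TABLE IF NOT EXISTS bench.records (id int PRIMARY KEY, payload blob)
""")

insert = session.prepare("INSERT INTO bench.records (id, payload) VALUES (?, ?)")
payload = os.urandom(1024)  # stand-in for an "average" record body

# Write with bounded concurrency; a plain loop of execute() also works,
# just slower.
execute_concurrent_with_args(
    session,
    insert,
    ((i, payload) for i in range(1_000_000)),
    concurrency=200,
)
cluster.shutdown()
```

Once the load finishes, “nodetool flush bench” on each node pushes the memtables to disk, after which the cache metrics can be read as described above.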
Thanks, a couple of follow-up questions. When you say “cache metrics,” do you mean the ones provided by scyllatop? I presume that is where I get the cache memory usage from? Also, would that work in a clustered environment? I have eight nodes spread over four DCs. I know the cache is spread across the cluster, but what happens if not all records are in the cache? How would I find out how many have made it into cache?
The metrics are available via scyllatop, though a nicer way to see them is with Prometheus and Grafana (see scylla-monitoring.git). The metrics include the number of partitions in cache, the number of rows in cache, and the memory used by the cache, so from those you can compute any statistic you want.
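As an illustration, a small scraping sketch in Python, assuming the default Scylla Prometheus endpoint on port 9180; the metric names used below are assumptions and vary between Scylla versions, so verify them against your node’s /metrics output first:

```python
import urllib.request

NODE = "127.0.0.1"  # repeat per node; each node caches only its own replicas
body = urllib.request.urlopen(f"http://{NODE}:9180/metrics").read().decode()

def metric_sum(name: str) -> float:
    """Sum a metric across shards (each shard reports its own sample line)."""
    total = 0.0
    for line in body.splitlines():
        if line.startswith(name + "{") or line.startswith(name + " "):
            total += float(line.split()[-1])
    return total

rows = metric_sum("scylla_cache_rows")              # assumed metric name
partitions = metric_sum("scylla_cache_partitions")  # assumed metric name
bytes_used = metric_sum("scylla_cache_bytes_used")  # assumed metric name

print(f"rows in cache:       {rows:,.0f}")
print(f"partitions in cache: {partitions:,.0f}")
if rows:
    print(f"bytes per row:       {bytes_used / rows:.1f}")
```

Since the cache is per node, in a cluster you would either run this against every node and sum the results, or let the Prometheus server from scylla-monitoring do the aggregation. The rows-in-cache figure also answers your question of how many records actually made it into cache.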
As for the rows not in cache: they have no cache footprint, so they don’t skew the result; just divide by the rows-in-cache metric rather than by the number of rows you wrote.