Row cache for different fields in the same record

dobermangood · May 19, 2025, 9:18pm

SELECT email, username, age FROM users WHERE id = ? LIMIT 1
SELECT id, email, created_at, username FROM users WHERE id = ? LIMIT 1
SELECT city FROM users WHERE id = ? LIMIT 1

I have a few more similar queries in ScyllaDB. Different cases require different fields, and I think it is more performant to select only the necessary ones instead of simply selecting the entire record with all fields in every case.
But is it really more efficient to do this?

And what about the ScyllaDB level? It caches at the row level, so doesn’t this break the cache by essentially caching the same record several times, each time with different fields?

As I understand it, if the row is not in the row cache, it is read from disk.
So if I have three different queries that read by id but select different fields, they would each create separate row cache entries, which inflates the cache and can reduce the hit ratio.

Gui_Nogueira · July 16, 2025, 2:24pm

Yes! Specifying only the required columns in a SELECT is more efficient because it reduces the processing overhead for columns you won’t use, as well as the server-to-client response payload (be wary of network costs!).

However…

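This is where it gets interesting. Yes, ScyllaDB will always cache the whole row, even when the SELECT query requests only a limited subset of columns, so queries for different columns of the same row share one cached row rather than creating separate entries.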

As we discussed above, providing the columns on SELECT is great to reduce serialization and deserialization as well as network overheads. But it doesn’t affect cache population and efficiency.
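To make the point above concrete, here is a toy model in Python. It is only a sketch and not ScyllaDB's actual implementation: the `ToyRowCache` class, the `disk` dictionary, and the sample values are all made up for illustration. It assumes the cache is keyed by primary key and stores the full row, with the column list applied only when building the response.

```python
# Toy model only: this is NOT how ScyllaDB is implemented internally.
# It illustrates "cache the whole row, project columns afterwards".

class ToyRowCache:
    def __init__(self, disk):
        self.disk = disk      # primary key -> full row (stands in for SSTables)
        self.cache = {}       # primary key -> full row
        self.hits = 0
        self.misses = 0

    def select(self, columns, row_id):
        row = self.cache.get(row_id)
        if row is None:
            # Cache miss: read the whole row from "disk" and cache all of it.
            self.misses += 1
            row = self.disk[row_id]
            self.cache[row_id] = row
        else:
            self.hits += 1
        # The column list only shapes the response, not the cache entry.
        return {c: row[c] for c in columns}


# Hypothetical sample row using the column names from the question.
disk = {
    1: {
        "id": 1,
        "email": "a@example.com",
        "username": "alice",
        "age": 30,
        "city": "Lisbon",
        "created_at": "2025-05-19",
    }
}

cache = ToyRowCache(disk)
r1 = cache.select(["email", "username", "age"], 1)
r2 = cache.select(["id", "email", "created_at", "username"], 1)
r3 = cache.select(["city"], 1)

print(len(cache.cache), cache.misses, cache.hits)
```

In this model the three queries from the question produce one cache entry, one miss, and two hits. The column list changes the size of each response but not what sits in the cache.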

You are on the right track to make more efficient use of your cache. The next question you should look into is how often different groups of columns are used, especially in relation to each other.

If you can break most-used columns into their own table, and keep a separate table for columns that are less frequently used, you may improve your cache utilization, especially if you have larger cells that are rarely used.
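A rough back-of-the-envelope estimate shows why the split can help. Everything in this sketch is an assumption: the hot/cold grouping, the average cell sizes, the `bio` column (which is not in the original schema and stands in for a large, rarely read cell), and the cache budget. It also ignores per-row overhead, so treat the numbers as relative rather than absolute.

```python
# Rough estimate only; all sizes and the hot/cold grouping are hypothetical.

def rows_that_fit(cache_bytes, row_bytes):
    """How many whole rows of a given size fit in a cache budget."""
    return cache_bytes // row_bytes

# Assumed average cell sizes in bytes.
HOT = {"email": 40, "username": 20, "age": 4}          # read on most requests
COLD = {"city": 30, "created_at": 8, "bio": 4000}      # "bio" is a made-up large cell

CACHE_BYTES = 1_000_000_000  # assumed cache budget

# One table: every cached row carries hot and cold columns together.
single_table_row = sum(HOT.values()) + sum(COLD.values())

# Split tables: the frequently read table caches only the hot columns.
hot_table_row = sum(HOT.values())

single = rows_that_fit(CACHE_BYTES, single_table_row)
split = rows_that_fit(CACHE_BYTES, hot_table_row)

print(f"single table: {single} rows cached")
print(f"hot table after split: {split} rows cached")
```

The trade-off is that any request needing both hot and cold columns now takes two reads, which is why it matters how often the column groups are used together.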
