Row cache for different fields in the same record

SELECT email, username, age FROM users WHERE id = ? LIMIT 1
SELECT id, email, created_at, username FROM users WHERE id = ? LIMIT 1
SELECT city FROM users WHERE id = ? LIMIT 1

I have a few more similar queries in ScyllaDB. Different cases require different fields, and I think I get better performance by selecting only the necessary ones instead of simply selecting the entire record with all fields in every case.
But is it really more efficient to do this?

And what about the ScyllaDB level? It caches at the row level, so doesn’t this hurt the cache by essentially caching the same record several times, just with different fields?

That is, if the row is not in the row cache, it is read from disk.
If I have three different queries that read by id but select different fields, they will all create separate row cache entries, which inflates the cache and can reduce the hit ratio.

Yes! Listing only the columns you need in a SELECT is more efficient because it reduces the processing overhead for columns you won’t use, as well as the server-to-client response payload (be wary of network costs!).

However…

This is where it gets interesting. ScyllaDB always caches the whole row, even when the SELECT query names only a subset of its columns. Your three queries therefore share one cached row rather than creating three separate entries.

As discussed above, listing the columns in the SELECT is great for reducing serialization and deserialization work as well as network overhead. But it doesn’t affect cache population or efficiency.

You are on the right track to make more efficient use of your cache. The next question you should look into is how often different groups of columns are used, especially in relation to each other.

If you can break the most-used columns into their own table and keep a separate table for the less frequently used columns, you may improve your cache utilization, especially if you have large cells that are rarely used.
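As a rough sketch of that split, assuming the columns from the queries in the question are the frequently read ones: the table names users_core and users_extra, the uuid type for id, and the bio and avatar columns are all hypothetical stand-ins for whatever large, rarely read data you actually store.

```cql
-- Hypothetical "hot" table: small columns that most queries read.
CREATE TABLE users_core (
    id uuid PRIMARY KEY,       -- assumed key type
    email text,
    username text,
    age int,
    city text,
    created_at timestamp
);

-- Hypothetical "cold" table: large cells that are rarely read.
CREATE TABLE users_extra (
    id uuid PRIMARY KEY,       -- same key as users_core
    bio text,                  -- illustrative large column
    avatar blob                -- illustrative large column
);

-- Frequent lookups touch only the small rows, so more of them fit in cache.
SELECT email, username, age FROM users_core WHERE id = ?;

-- Rare lookups pay for the large cells only when they are needed.
SELECT bio FROM users_extra WHERE id = ?;
```

The trade-off is that any read needing columns from both groups now takes two queries, so this only pays off if such reads are uncommon.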