What's the best/fastest way to figure if a table has any data in it?

Assuming I need to pick all the tables with data in them (at least one row)

What’s the best/fastest way to figure that out ?

naive approach would be getting all keyspaces and tables names
SELECT keyspace_name, table_name, compaction FROM system_schema.tables

and run this on each:
SELECT * FROM {keyspace}.{table} LIMIT 1

it might become a bit problematic when there’s lots of tombstones involves, and the harmless looking query might become quite slow or event timeout.

is there a better way getting this information ?

If by “tables with data” you mean tables with at least one live row, then there’s no other way than running a query. You cannot avoid processing tombstones, as they may affect liveness of a row.

You also have to reconcile writes from all replicas, so you should run the query with CL=ALL. A row may be live on one replica, but a tombstone from another replica may cover it.

2 Likes