Hi, is there a (cheap + fast) way to get the approximate number of rows of a table?
I know there’s e.g. ‘Number of partitions (estimate)’ in the nodetool tablestats output.
I’d like to be able to query this via CQL, as a lightweight alternative to select count(*) from my_table.
In general there is no good way to do this cheaply, in the way compaction statistics are kept. While we can count the rows in a single sstable, those rows could overlap rows in another sstable (so we’d count them twice), or could be shadowed by a tombstone in another sstable (and so should not be counted at all).
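To make the overlap problem concrete, here is a minimal Python sketch. It uses a toy model, not ScyllaDB's actual sstable format: each "sstable" is a dict mapping a row key to a (write timestamp, is_live) pair, where is_live=False stands for a tombstone. The names sstable_a, sstable_b, naive_count and merged_count are made up for illustration.

```python
# Toy model (an assumption for illustration, not ScyllaDB's on-disk format):
# each "sstable" maps a row key to (write_timestamp, is_live).
# is_live=False represents a tombstone for that key.
sstable_a = {"k1": (1, True), "k2": (1, True), "k3": (1, True)}
sstable_b = {"k2": (2, True), "k3": (2, False), "k4": (2, True)}


def naive_count(sstables):
    """Sum the live rows of each sstable independently (cheap, but wrong)."""
    return sum(
        1
        for table in sstables
        for (_ts, live) in table.values()
        if live
    )


def merged_count(sstables):
    """Merge all sstables, keep the newest write per key, then count live rows."""
    latest = {}
    for table in sstables:
        for key, (ts, live) in table.items():
            if key not in latest or ts > latest[key][0]:
                latest[key] = (ts, live)
    return sum(1 for (_ts, live) in latest.values() if live)


print(naive_count([sstable_a, sstable_b]))   # 5: k2 counted twice, deleted k3 still counted
print(merged_count([sstable_a, sstable_b]))  # 3: k1, k2, k4
```

The per-sstable sum is the kind of number that could be maintained cheaply, but it overcounts here. Getting the right answer needs the merge step, which is essentially the work a real read has to do.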
Starting with ScyllaDB 5.1, SELECT COUNT(*) FROM tab is automatically parallelized across all nodes and shards. In conjunction with consistency level LOCAL_ONE, this is much faster than before, but it still requires significant CPU and I/O resources.