Originally from the User Slack
@Dylan_Piette: Hello to the scylladb team, I’m currently facing the issue of needing to know all the current partitions in the db in a performant way
Are there any recommendations on how to do it? I need to know the values of the pk, not just the count or something like that
https://github.com/scylladb/scylladb/issues/2066
GitHub: Really slow select distinct · Issue #2066 · scylladb/scylladb
@Felipe_Cardeneti_Mendes: https://www.scylladb.com/2017/03/28/parallel-efficient-full-table-scan-scylla/ there’s some boilerplate Go code in the relevant github repo, it originally just counts IIRC - but should be straightforward to iterate and save values.
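A minimal sketch of the idea behind the linked article, assuming the default Murmur3 partitioner (signed 64-bit tokens): split the token ring into many small ranges and run one SELECT DISTINCT per range, ideally from several workers in parallel. The helper names (split_token_ring, build_range_query) and the keyspace, table and column names are hypothetical, and this is not the code from the linked repo.

```python
from typing import List, Tuple

# Assumption: Murmur3 partitioner, whose tokens are signed 64-bit integers.
MIN_TOKEN = -(2**63)
MAX_TOKEN = 2**63 - 1


def split_token_ring(num_ranges: int) -> List[Tuple[int, int]]:
    """Split the token ring into contiguous, non-overlapping inclusive ranges."""
    if num_ranges < 1:
        raise ValueError("num_ranges must be >= 1")
    total = MAX_TOKEN - MIN_TOKEN + 1  # number of possible tokens (2**64)
    ranges = []
    for i in range(num_ranges):
        start = MIN_TOKEN + (total * i) // num_ranges
        end = MIN_TOKEN + (total * (i + 1)) // num_ranges - 1
        ranges.append((start, end))
    return ranges


def build_range_query(keyspace: str, table: str, pk_column: str) -> str:
    """Build the per-range CQL statement; the two '?' are the range bounds."""
    return (
        f"SELECT DISTINCT {pk_column} FROM {keyspace}.{table} "
        f"WHERE token({pk_column}) >= ? AND token({pk_column}) <= ?"
    )


if __name__ == "__main__":
    # Hypothetical names; each (start, end) pair would be bound to the
    # prepared statement and executed by a worker using your driver.
    print(build_range_query("our_keyspace", "our_table", "our_pk"))
    for start, end in split_token_ring(4):
        print(start, end)
```

Each (start, end) pair is independent, so the ranges can be handed to a pool of workers and the returned partition keys collected as they arrive. For a composite partition key, all partition key columns would need to be listed both after SELECT DISTINCT and inside token().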
@Sven: @Dylan_Piette For years we have just used SELECT DISTINCT our_partition_key_column_name FROM our_table_name
and it has worked without problems. However, note that we do not have a big database. In our case, the result set of this query within any keyspace contains at most 100000 partition key values, which in our case are UUIDs.
@Piette_Dylan: Well, in our use case we have millions of partitions and the data is volatile: a partition can be gone within a few seconds (once we have deleted all the rows for that partition)
So we need a fast way to get the most accurate representation of what is actually inside the table
The solution @Felipe_Cardeneti_Mendes linked is a great one for now. After adapting the code I’m now able to query all the partitions 6x faster, which is already great!
But I can’t help wishing for a near instant way to get this data haha