Originally from the User Slack
@Nilesh_Kumar: Hi everybody,
I am looking for some way to copy the Scylla table partition key data to S3. Spark is one way, but within Spark, is there any optimisation I can do to scan faster with less resource consumption on Scylla, so it doesn't impact the running system?
Any help or direction will be of great use.
Data info: the table contains billions of partitions, with just one row per partition. I am trying to take a dump of all the partition keys available in the table.
Thanks
@Felipe_Cardeneti_Mendes: Use the token() function to scan in parallel, limit concurrency to whatever your source system can tolerate, and add BYPASS CACHE to prevent polluting the cache. See https://www.scylladb.com/2017/03/28/parallel-efficient-full-table-scan-scylla/
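A minimal sketch of that approach using the Python driver: split the Murmur3 token ring into sub-ranges, scan each range with a token()-restricted query plus BYPASS CACHE, and cap the number of in-flight scans. The keyspace `ks`, table `tbl`, partition key column `pk`, contact point, and the CONCURRENCY/SPLITS values are all assumptions to tune for your deployment.

```python
# Sketch: parallel token-range scan of partition keys with BYPASS CACHE.
# Assumes a hypothetical keyspace "ks", table "tbl", and key column "pk".
from concurrent.futures import ThreadPoolExecutor

from cassandra.cluster import Cluster

CONCURRENCY = 8     # concurrent range scans; keep low to protect live traffic
SPLITS = 1024       # sub-ranges to split the ring into; smaller = cheaper queries

MIN_TOKEN = -(2**63)      # Murmur3 partitioner token bounds
MAX_TOKEN = 2**63 - 1

cluster = Cluster(["127.0.0.1"])  # assumed contact point
session = cluster.connect("ks")

# token(pk) restricts each query to one slice of the ring;
# BYPASS CACHE (a Scylla CQL extension) keeps the full scan
# from evicting the hot working set.
stmt = session.prepare(
    "SELECT pk FROM tbl WHERE token(pk) >= ? AND token(pk) <= ? BYPASS CACHE"
)

def scan_range(bounds):
    # Scan one token sub-range; in practice, stream these rows to S3
    # instead of collecting them in memory.
    lo, hi = bounds
    return [row.pk for row in session.execute(stmt, (lo, hi))]

def subranges(n):
    # Yield n contiguous (lo, hi) slices covering the whole token ring.
    step = (MAX_TOKEN - MIN_TOKEN) // n
    lo = MIN_TOKEN
    for _ in range(n - 1):
        yield (lo, lo + step - 1)
        lo += step
    yield (lo, MAX_TOKEN)  # last slice absorbs the remainder

with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    for keys in pool.map(scan_range, subranges(SPLITS)):
        for pk in keys:
            print(pk)

cluster.shutdown()
```

Lowering CONCURRENCY (or adding a sleep between ranges) throttles the scan further if the running system shows pressure; the linked blog post covers how to size the number of ranges against cluster topology.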