Handling large partitions and full-scan queries when operating ScyllaDB

Installation details
#ScyllaDB version: 5.4
#Cluster size: 3 nodes in each of DC1 and DC2
#OS (RHEL/CentOS/Ubuntu/AWS AMI): CentOS

Hi all, I'm using ScyllaDB to store transactional data and have a few questions about query patterns and their impact on the cluster.

  • We need to fetch a 1-day range of data, which can be hundreds of thousands of rows.
  • There's a chance of large partitions, though very few: roughly 0.001% of users have about 630k transactions.
  • It's acceptable for a single query to time out, but we must avoid one heavy scan impacting other services.

So far I’m thinking:

  • Bucketing to reduce hot partitions, though the read burden may still be high.
  • LIMIT doesn't help much if a hot partition shows up anyway.
  • Pagination could help, but it's unclear whether it prevents cross-service impact.
  • Maybe Materialized Views?
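To make the bucketing idea concrete, here's a minimal sketch of what I have in mind: splitting each user's day of transactions across a fixed number of hash sub-buckets so the partition key becomes (user_id, day, sub_bucket). All names here (`user_id`, `bucket_of`, `NUM_SUB_BUCKETS`) are placeholders, not our actual schema.

```python
import hashlib
from datetime import datetime, timezone

# Hypothetical scheme: partition key = (user_id, day, sub_bucket), so even a
# 630k-row user is spread over NUM_SUB_BUCKETS partitions per day.
NUM_SUB_BUCKETS = 8

def bucket_of(user_id: str, ts: datetime, txn_id: str) -> tuple[str, str, int]:
    """Compute the (user_id, day, sub_bucket) partition key for one row."""
    day = ts.astimezone(timezone.utc).strftime("%Y-%m-%d")
    # Hash the transaction id so rows spread evenly across sub-buckets.
    sub = int(hashlib.md5(txn_id.encode()).hexdigest(), 16) % NUM_SUB_BUCKETS
    return (user_id, day, sub)

def buckets_for_day(user_id: str, day: str) -> list[tuple[str, str, int]]:
    """All partition keys a reader must query to cover one day for one user."""
    return [(user_id, day, s) for s in range(NUM_SUB_BUCKETS)]
```

The matching table would look something like PRIMARY KEY ((user_id, day, sub_bucket), txn_ts, txn_id), so a 1-day read becomes NUM_SUB_BUCKETS small queries instead of one large one. The read burden I mentioned is exactly those extra queries.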

I’d love to hear your experience on:

  • How do you monitor or alert for large partition reads in production?
  • How do you prevent full-scan queries, such as COUNT queries, and reads of large partitions?
  • What strategies do you use to contain the impact of large queries — whether at the app level or through cluster tuning (e.g., read concurrency, timeouts, coordinator limits)?
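On the app-level side, this is the kind of guard I'm considering: a hedged sketch, not tied to a specific driver. `execute_page` is a hypothetical callback standing in for a driver's paged execute (e.g. the fetch_size/paging_state mechanism in the Python driver), so the capping logic itself stays testable.

```python
from typing import Any, Callable, Iterator, List, Optional, Tuple

# One page of results: (rows, next_paging_state); a None state means done.
Page = Tuple[List[Any], Optional[bytes]]

def fetch_capped(
    execute_page: Callable[[Optional[bytes]], Page],
    max_rows: int = 50_000,
) -> Iterator[Any]:
    """Stream rows page by page, stopping hard at max_rows so one heavy
    query can never turn into an unbounded partition scan."""
    fetched = 0
    state: Optional[bytes] = None
    while True:
        rows, state = execute_page(state)
        for row in rows:
            yield row
            fetched += 1
            if fetched >= max_rows:
                return  # cap hit: bail out instead of reading the tail
        if state is None:
            return  # driver reports no more pages
```

With the Python driver, `execute_page` would wrap `session.execute` on a statement with a small `fetch_size`, passing the previous result's `paging_state` between calls; a per-request timeout on execute would bound latency as well. What I can't tell is whether this is enough to protect other services, or whether cluster-side limits are still needed.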

Has anyone tackled similar patterns, or do you have suggestions from an app-level or cluster-tuning perspective?
Thanks a lot!