Queries with millions of records and pagination in a distributed database

In our application, we have queries that may return millions of records. Is it possible to use pagination with ScyllaDB? Also, how does paging work with distributed databases, taking into account that the data is distributed on multiple replicas?

Yes, ScyllaDB supports paging.

In use cases like yours, when queries can return huge amounts of data and when the amount of data is not known in advance, it makes sense to use paging. Otherwise, different resource management issues might occur both on the client and on the database side. Generally speaking, it makes sense to use paging unless you have a good reason not to use it.

Paging in the context of database queries involves transmitting query results in manageable chunks or “pages” to prevent these resource management issues. The page size can be limited by the client. Here’s how paging works:

  1. The coordinator (the node receiving a query request from the client) identifies the list of replicas (nodes with the relevant data) to fulfill the query. If using token-aware drivers, the coordinator node will also be a replica node.

  2. All read requests are sent concurrently to the selected replicas, which execute the requests and return results to the coordinator.

  3. *The coordinator merges the results from replicas and requests a page worth of data across the entire read range. The read progresses linearly across partitions in token order. Once the page is full, the page is cut, be it between or in the middle of a partition.

  4. To resume the query on the next page, the database records the query’s interruption position in a binary “paging state” cookie, which is transmitted with every page to the client and back to the database with each page request. The paging state is opaque to the client and can store other query-related states.

  5. On the next page, the coordinator checks the partition range and drops already-read partitions based on the stored paging state. To improve performance, the replica’s state is kept between pages.

*The above steps 2 and 3 are a simplification. What actually happens is that there is only one data request (served by the local node if that is also a replica) and CL-1 digest requests. Only one replica will return data. All replicas participating in the requests receive the page size and are expected to process an identical amount of data. After all replicas have responded, the data and the digests are checked. If they match, the page is returned as-is. Otherwise, a read-repair is started.

To address these issues, queries are made stateful by preserving the query’s state between pages using an object called the “querier.” The querier is saved in a special cache, and on the next page, it’s looked up to continue the query from where it left off. Each query is assigned a unique identifier to ensure that the correct querier is used for each query.

You can learn more about paging large queries and see examples in the documentation:

There is more information about ScyllaDB drivers in the Using ScyllaDB Drivers course on ScyllaDB University, including some hands-on examples for different languages.

Also, see the relevant discussion about using Count Limit or Paging.