According to the documentation on local secondary indexes, the read path for token-unaware queries involving a local secondary index (LSI) is not completely satisfied locally. Instead, the keys retrieved from the LSI on a given node are sent back to the coordinator node, which sends a request back to the node to retrieve the data associated with those keys.
If the data which an LSI on a given node references also exists on the same node, why does the coordinator node need to be involved at all in the retrieval of such data on that node?
A token-unaware query using a local secondary index (LSI) in ScyllaDB requires a round trip, even when the data referenced by the LSI is colocated on a single node, because of the way query coordination and result gathering are structured in ScyllaDB.
Even though LSI guarantees that both the index and the corresponding base table data reside on the same node (because the index shares the base table’s partition key), the step-wise query process still involves a coordinator node. For a token-unaware driver (i.e., when the client does not direct the query to the correct node), here is what happens:
- The coordinator node receives the query, but it may not be the node that holds the relevant data (the client is token-unaware, so it did not route the query by token).
- The coordinator sends the query to all relevant nodes (possibly more than one due to replication).
- Each node with a matching index entry returns the primary keys (base keys) for matching data to the coordinator.
- The coordinator processes these keys and issues a subsequent request to retrieve the actual base table data associated with the returned keys; this may result in another request back to the same node that initially found the index entry, creating a “round trip.”
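To make the steps above concrete, here is a toy Python model of the message flow. It is only a sketch of the sequence described in this answer, not ScyllaDB's actual code: the `Node` class, `token_unaware_read`, and the message labels are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a ScyllaDB node (illustration only)."""
    name: str
    index: dict = field(default_factory=dict)  # indexed value -> list of base keys
    rows: dict = field(default_factory=dict)   # base key -> row

    def index_lookup(self, value):
        # Phase 1: the local index only yields base keys, not row data.
        return list(self.index.get(value, []))

    def fetch_rows(self, keys):
        # Phase 2: read the base table rows for the given keys.
        return [self.rows[k] for k in keys if k in self.rows]


def token_unaware_read(coordinator, replica, value):
    """Model the two-phase read when the coordinator is not the data owner."""
    messages = []
    messages.append((coordinator.name, replica.name, "index lookup"))
    keys = replica.index_lookup(value)
    messages.append((replica.name, coordinator.name, "base keys"))
    # The coordinator goes back to the same replica for the actual rows.
    messages.append((coordinator.name, replica.name, "fetch rows"))
    rows = replica.fetch_rows(keys)
    messages.append((replica.name, coordinator.name, "rows"))
    return rows, messages


node_a = Node("A")  # coordinator that happens to hold no relevant data
node_b = Node(
    "B",
    index={"red": [("p1", 1), ("p1", 3)]},
    rows={
        ("p1", 1): {"pk": "p1", "ck": 1, "color": "red"},
        ("p1", 2): {"pk": "p1", "ck": 2, "color": "blue"},
        ("p1", 3): {"pk": "p1", "ck": 3, "color": "red"},
    },
)

rows, messages = token_unaware_read(node_a, node_b, "red")
for sender, receiver, what in messages:
    print(f"{sender} -> {receiver}: {what}")
```

In this model the index lookup and the row fetch both land on node B, yet each one is a separate request from A, which is the round trip being asked about.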
Why can’t the initial node simply serve the data immediately?
- The index query only identifies base keys; the coordinator is responsible for assembling the final result by fetching row data for these keys.
- This indirection (coordinator round trip) keeps the logic consistent for all index queries, whether local or global, whether the index and data are colocated or not.
- It allows the coordinator node to apply consistency checks, filtering, aggregation, or paging uniformly.
- The query flow has not been optimized to allow a node to “short-circuit” the process and retrieve the data locally in the initial pass for token-unaware queries.
Optimization is possible with token-aware queries:
- If a token-aware driver is used, the client sends the query directly to the node that owns the necessary token (partition), making that node both the coordinator and data holder.
- In this scenario, the node can process the index and base table read internally—no network round trip is needed.
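As a rough illustration of why token awareness removes the extra hop, the sketch below picks the owning node from the partition key and counts inter-node messages. Everything here is an assumption made for the example: `toy_token` uses MD5 purely as a stand-in hash (it is not the partitioner ScyllaDB uses), the ring has one token per node, and the function names are hypothetical.

```python
import bisect
import hashlib


def toy_token(partition_key: str) -> int:
    # Stand-in hash for illustration only; NOT ScyllaDB's real partitioner.
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def build_ring(node_names):
    # One token per node, sorted by token (a simplified ring).
    return sorted((toy_token(name), name) for name in node_names)


def owner_for(partition_key: str, ring) -> str:
    # The first node whose token is >= the key's token owns it (wrapping around).
    tokens = [token for token, _ in ring]
    i = bisect.bisect_left(tokens, toy_token(partition_key))
    return ring[i % len(ring)][1]


def inter_node_messages(coordinator: str, owner: str):
    # If the coordinator is the owner, both reads happen inside one node.
    if coordinator == owner:
        return []
    return [
        (coordinator, owner, "index lookup"),
        (owner, coordinator, "base keys"),
        (coordinator, owner, "fetch rows"),
        (owner, coordinator, "rows"),
    ]


nodes = ["A", "B", "C"]
ring = build_ring(nodes)
owner = owner_for("p1", ring)
other = next(n for n in nodes if n != owner)

print("token-aware hops:  ", len(inter_node_messages(owner, owner)))
print("token-unaware hops:", len(inter_node_messages(other, owner)))
```

A token-aware driver does the equivalent of `owner_for` on the client side, so the request lands on a node that can serve both the index lookup and the base table read itself.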
In short:
A token-unaware query needs a round trip because ScyllaDB’s query flow separates index lookups and base data retrieval, always routing through the coordinator node, even if the queried node (which executes the index lookup) could have served the data itself. This path is necessary for consistency and feature uniformity, though it can be avoided with token-aware clients.
Check also this great blog post for more details,
Gabriel