In JanusGraph we are currently working on adding a new feature to allow users execute SELECT
query for multiple partition keys using IN
operator.
Usually IN
operator usage for different partitions is an anti-pattern (I think that’s why there is a default protection exists in ScyllaDB max-partition-key-restrictions-per-query
: Scylla FAQ | ScyllaDB Docs ).
However, on JanusGraph side we want to group those partition keys based on either Token Ranges only (for Cassandra
/ AstraDB
/ Amazon Keyspaces
) or based on Token Ranges plus Sharding information.
We often need to fetch specific columns for many partition keys together. As for now we are executing a separate asynchronous CQL query per each partition key. Often times it leads to channel conjunctions when we need to select those columns for let’s say thousands of keys which leads to thousands of CQL queries.
We found out that grouping keys by limited groups of keys (let’s say grouping 100
partition keys via IN
operator) which belong to the same token range (i.e. keys which leave on the same Nodes) often results in slightly better performance on Cassandra side, as well as making some serverless deployments like AstraDB, where you pay 1 RRU for each CQL query which returns 4 KB of data way cheaper for running JanusGraph, because usually JanusGraph queries small amount of bytes which is much less than 4 KB.
We were able to make the implementation (see here: Group CQL keys into a single query [cql-tests] [tp-tests] by porunov · Pull Request #3879 · JanusGraph/janusgraph · GitHub) which groups partition keys based on Token Ranges. I.e. we convert a partition key into a Token and then search for it’s TokenRange
using TokenMap
. Any partition keys which belong to the same TokenRange
are grouped together (into small groups, so that we have a chance to touch multiple replicas, but let’s omit replicas existence for simplicity and assume each TokenRange has only 1 Node).
The implementation works good for Cassandra, but we would like to have an improved implementation for ScyllaDB, because ScyllaDB routes queries not only by a RoutingToken / RoutingKey, but also by the Shard Id
of that RoutingKey
. Meaning that the request is routed directly to the correct CPU for processing.
As we group multiple partition keys together via IN
operator, we have a small challenge of: “How do we determine if two partition keys, which belong to the same TokenRange are going to be routed to the same Shard?”.
One way I’m think of is simply getting all nodes for a partition key and if for two different partition keys ShardId matches for ALL the nodes - than it means that they can be grouped together. The problem is that I don’t know how exactly get ShardId if all I have is CQLSession
, Node
, and partition key
.
If anyone knows how to properly get Shard Id (or shard infomration) using Node
+ partition key
via ScyllaDB Java Driver, it would be really great if you can share that.
Any other concerns and / or suggestions are highly appreciated.