Tradeoff between allow filtering and creating an index

Guy · December 26, 2022, 5:19am

My table has the following primary key: PRIMARY KEY((tenant_id, group_id))
If I have to query data based on tenant_id, I have two options:

1. select * from tablename where tenant_id='something' ALLOW FILTERING
Pros:
No need to create an index (no extra materialized view is created)
Useful if this query is run infrequently
Cons:
Filters the whole table

2. Create an index on tenant_id:
Pros:
No need for ALLOW FILTERING
Cons:
Every write to the table must also update the index

Questions:

1. In #1, if I use part of the partition key with ALLOW FILTERING, will it scan the whole table? (I want to know how filtering works when only part of the partition key is specified.) If you can share any related post, that would be useful.
2. Is there any other approach that can be used?

*Originally asked on ScyllaDB’s community Slack channel

Guy · December 26, 2022, 5:19am

1. Yes. Even though part of the partition key is specified, this will result in a full scan with filtering. The reason is that Scylla hashes the partition key as a whole, and there is no way to infer all the relevant hashes (tokens) from a partial key.
2. Not that I can think of. The choice is a tradeoff, and which option is best depends on how often you run this query. If it is very rare, filtering is better. If it runs regularly, I recommend creating an index.

The tradeoff is this: pay all costs at query time or spread the cost over all writes, each paying a small portion of it.
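
To illustrate the point about hashing the partition key as a whole, here is a minimal Python sketch. It is a toy model, not ScyllaDB's partitioner: ScyllaDB's default partitioner is Murmur3-based, while this sketch swaps in SHA-256 from the standard library as a stand-in. The function name `toy_token` and the sample key values are hypothetical.

```python
import hashlib


def toy_token(*partition_key_parts: str) -> int:
    """Toy stand-in for a partitioner: hash ALL key components together.

    Assumption: SHA-256 replaces the real Murmur3-based hash purely for
    illustration; only the "hash the whole key" behaviour matters here.
    """
    # Length-prefix each component so ("ab", "c") and ("a", "bc")
    # serialize to different byte strings.
    payload = b"".join(
        len(part.encode()).to_bytes(2, "big") + part.encode()
        for part in partition_key_parts
    )
    digest = hashlib.sha256(payload).digest()
    # Keep 8 bytes and read them as a signed 64-bit integer, like a token.
    return int.from_bytes(digest[:8], "big", signed=True)


# Two partitions of the same tenant, in different groups.
token_group_1 = toy_token("tenant-a", "group-1")
token_group_2 = toy_token("tenant-a", "group-2")

# Hashing only the partial key gives a value unrelated to either token.
token_partial = toy_token("tenant-a")

print(token_group_1, token_group_2, token_partial)
```

The tokens for the two groups land at unrelated positions, and hashing `tenant_id` alone says nothing about where they are. Without an index, the only way to find every partition of a tenant is therefore to scan all of them.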

Guy · December 26, 2022, 5:19am

Understood. Suppose I create an index on tenant_id and the table has, say, 1 lakh (100,000) records per tenant. Will this work properly with a paginated response?
As explained in the post below, it uses an IN clause internally.

Guy · December 26, 2022, 5:20am

Yes, paging works with indexed queries.
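
As a rough sketch of what paging through an indexed query can look like from a client, assuming the Python driver (`cassandra-driver` or its ScyllaDB fork) and an existing index on `tenant_id`: the helper name `iter_tenant_rows` and the page size are hypothetical, and `session` is assumed to be an already connected driver `Session`.

```python
def iter_tenant_rows(session, tenant_id, page_size=1000):
    """Yield every row for one tenant, one page of results at a time.

    Assumption: `session` is a connected Session from the Python driver,
    and an index on tenant_id exists so no ALLOW FILTERING is needed.
    """
    # default_fetch_size controls how many rows the driver requests per page.
    session.default_fetch_size = page_size
    result = session.execute(
        "SELECT * FROM tablename WHERE tenant_id = %s", (tenant_id,)
    )
    # Iterating the result set fetches further pages on demand, so the
    # full 1 lakh rows are never requested in a single round trip.
    for row in result:
        yield row
```

A caller would simply loop over `iter_tenant_rows(session, 'something')`; the driver requests the next page only when the current one is exhausted.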
