Materialized Views and Indexing, filtering columns by range, ALLOW FILTERING

Guy · May 8, 2024, 3:42am

Originally from the User Slack

@Artem_Golovko: Hello everyone. I’m a new guy using scylla and just play around watching the scylla university courses. Would love if someone could help me with my confusion.

I have a simple table:
 CREATE TABLE IF NOT EXISTS sensors_data(
   assetId text,
   sensorId text,
   receiveTime timestamp,
   sourceTime timestamp,
   value blob,
   PRIMARY KEY ((assetId, sensorId), receiveTime)
 ) WITH CLUSTERING ORDER BY (receiveTime ASC)
I want to answer on the queries by receiveTime or by sourceTime, so for that I have created a local index.
CREATE INDEX ON sensors_data((assetId, sensorId), sourceTime);
If I describe the keyspace the index under the hood uses the following MV
CREATE MATERIALIZED VIEW sensors_data_sourcetime_idx_index AS
    SELECT assetid, sensorid, sourcetime, receivetime
    FROM sensors_data
    WHERE sourcetime IS NOT NULL
    PRIMARY KEY ((assetid, sensorid), sourcetime, receivetime)
    WITH CLUSTERING ORDER BY (sourcetime ASC, receivetime ASC)
Then two queries:

By receiveTime obvious works, no questions.
SELECT * FROM sensors_data WHERE assetid = 'asset1' and sensorId = 'sensor1' and receiveTime > 1299038700000;
By sourceTime not working and requires filtering, but I can’t understand why it is so
SELECT * FROM sensors_data WHERE assetid = 'asset1' and sensorId = 'sensor1' and sourceTime > 1299038700000;
Despite that I can query index directly and then use IN clause (do things manually that scylla should do for me)
SELECT * FROM sensors_data_sourcetime_idx_index WHERE assetid = 'asset1' and sensorId = 'sensor1' and sourcetime > 1299038700000;
Would love if you can clarify that behavior. Thanks!

@Nik: Hey Artem as per my understanding of MV they are created internally by Scylla to optimise reads and get over the problem of out of sync during updation so as stated in this case a table can have multiple MV for faster reads ,schema alteration doesn’t happen on the base table rather a new MV is created for specific use case .

@Artem_Golovko: @Nik According to the documentation https://opensource.docs.scylladb.com/stable/using-scylla/local-secondary-indexes.html scylla should do 2 requests for me. First, request the index view to know primary keys of the base table and second request the base table with found primary keys. But it’s not working when I’m filtering indexed column by range, but it should work, because in that case indexed column is a clustering key

@Felipe_Cardeneti_Mendes: you may be interested in https://github.com/scylladb/scylladb/issues/5547

GitHub: Local indexes could allow range queries on indexed elements · Issue #5547 · scylladb/scylladb

@Artem_Golovko: @Felipe_Cardeneti_Mendes thanks!

Topic		Replies	Views
Local Secondary Index filtering ScyllaDB	9	765	November 29, 2023
Indexes, Materialized Views and Disk Usage in Monitoring ScyllaDB materialized-views , scylladb-monitoring , secondary-index	0	19	October 29, 2024
Tradeoff between allow filtering and creating an index ScyllaDB data-model	3	705	December 26, 2022
Issue in creating Mat view with aggregations in view itself ScyllaDB materialized-views	4	58	July 11, 2024
How alternator creates MV with 2 columns not present in PK of original table? ScyllaDB	2	197	April 30, 2023

Materialized Views and Indexing, filtering columns by range, ALLOW FILTERING

Related topics