I am learning Scylla, and trying out secondary indexes. I have found that local indexes are not working as they could.
What I do:
CREATE TABLE ks.t (pk int, ck int, v1 int, v2 int, PRIMARY KEY (pk, ck));
CREATE INDEX ON ks.t ((pk), v1);
// inserted data
SELECT * FROM ks.t WHERE pk = 1;
pk | ck | v1 | v2
----+----+----+------
1 | 1 | 1 | 6
1 | 2 | 0 | null
1 | 3 | 2 | 7
(3 rows)
SELECT * FROM ks.t WHERE pk = 1 AND v1 = 0;
pk | ck | v1 | v2
----+----+----+------
1 | 2 | 0 | null
(1 rows)
SELECT * FROM ks.t WHERE pk = 1 AND v1 > 0;
InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING"
If I use ALLOW FILTERING, it does not use index. But it should.
But if I query the index table (MATERIALIZED VIEW) then it works:
SELECT * FROM ks.t_v1_idx_index WHERE pk = 1 AND v1 > 0;
pk | v1 | ck
----+----+----
1 | 1 | 1
1 | 2 | 3
(2 rows)
As far as I understand, it is not available for global indexes, because then you would have a lot of requests to other nodes. But why does it not work for local indexes? Would it be also slow? Should I create a full MATERIALIZED VIEW for such a case?
Would be very much usable with LIMIT.
Thatâs an interesting question. In fact, we even discussed it under Local indexes could allow range queries on indexed elements ¡ Issue #5547 ¡ scylladb/scylladb ¡ GitHub
As you can see, it is not yet implemented. However, even though it is possible to implement such a feature, notice that the discussed problem statement still remains: Since the index does not hold contents for all columns, a range scan will require looking up the base table. Depending on the scan (such as large partitions involving several clustering keys, as mentioned in the linked issue), the results may not be what youâd expect.
For example, in your provided output, notice how the index lookup with an equality clause returned all rows from the base table, whereas your view lookup misses the contents for column v2
.
You may still create a Materialized View including all (or only the relevant) columns, and then scan from it if thatâs what you need. Or, depending on your query patterns and data distribution, maybe ALLOW FILTERING
may turn out to be even better!
As you are learning, follows a great talk to guide you down the path: https://www.youtube.com/watch?v=Ds9jbTeW0ks
Hope it helps!
2 Likes
Thanks for your answer.
But I am still concerned with ALLOW FILTERING not using an index. My guess is that it would work faster that way, because itâs already sorted. Currently itâs a scan of all the rows for a selected part of a PRIMARY KEY.
I have checked that using a big table (1M rows), and having one indexed column filled only for 5 rows.
And selecting by that column ... AND v > 0 ALLOW FILTERING
in cqlsh
, gives me a lot of empty responses:
---MORE---
---MORE---
---MORE---
---MORE---
TRACING ON;
gives a better understanding.
Same quantity as if I would not filter by that extra column. Which means it scans all rows (a thousand rows for a selected part of a PRIMARY KEY).
I did write it also on Github you have mentioned.
But I am still concerned with ALLOW FILTERING not using an index.
No one said you shouldnât be, nor that ALLOW FILTERING
is a magic solution to all problems. The linked resource provided earlier should provide you a guidance when ALLOW FILTERING
makes sense, and when it doesnât.
I have checked that using a big table (1M rows), and having one indexed column filled only for 5 rows.
I think you meant a partition with 1M rows with only few columns matching your restrictions. That means youâve hit exactly one of the topics (and anti-patterns) covered in the reference material.
In summary: Currently, if your data distribution involves partitions with many rows AND your restrictions turns to waste most of the scanned results, then you may still create a view to accomplish what you are after. And more: You donât need to incur the full overhead of doubling your storage utilization, provided your view include only the columns you need. The only âdrawbackâ is that such scans will need to go through the view, rather than the base table.
Table with 1M rows , 1k per partition.
You donât need to incur the full overhead of doubling your storage utilization, provided your view include only the columns you need.
But what if I need all of the columnsâŚ
Quick example:
Table news
with fields id
, title
, published_at
, views_count
, text
.
And views_count
is always increasing (not unique).
How do I select 10 most viewed news
?
I would create an index, and select ordered with limit 10, as in Mysql, but Scylla does not allow to do it efficiently (without extra query and without disk space overhead (if I create a full materialized view)). Maybe I am missing something?
(maybe not the best example for usage of Scylla, but anyway)
In short: the problem is that we have an index table ks.t_v1_idx_index
which gives me what I need, and I have to receive needed ids, and request the main table ks.t
. But Scylla could do it efficiently without me sending it back (to the same node).
You would say: âbuy an extra storage and use MVs, it will be faster for Selectsâ.
But selects can be cached, and also imagine that I can have a lot of updates to this table, which will be very slow if I would have a lot of different MVs containing the same fields. Thatâs why I would prefer a small MV with keys only (which is secondary index).
I donât think you are correctly stating the problem at hand.
If you have 1K rows per partition, then the rest is irrelevant as you should restrict your partition when doing ALLOW FILTERING
. 1K rows is just fine (thatâs the default page size in most drivers, so you shouldnât receive many empty pages as you stated earlier. cqlsh
instead has a default page size of 100 - for obvious reasons) - although you donât want to find yourself in a position where you hit few rows out of the grand total very frequently.
Your provided example doesnât state the problem either. It actually gives me more concerns out of the main subject:
- How you increase
views_count
? Are you doing a read before write? Why not use a counter instead?
- What makes you think you would be able to derive a top-k resultset out of this? Spoiler: ORDER BY on a secondary index is not supported.
- Is this example anywhere close to what you actually want to achieve?
You are right that allowing an inequality clause on top of a LSI is doable, but I think you are missing the fact it is not a silver bullet solution, and it also carries its own trade-offs, and can fire back just as an ALLOW FILTERING clause would. A few:
- An inequality clause such as
> X
, where X is a lower bound entry spanning several clustering columns may be worse than an ALLOW FILTERING
, given that it will involve several lookups back to the base table.
- Such back and forth may have require reading tombstones along the way, further degrading the read path
Regardless, I think that by now you should understand and have plenty of resources to further understand the implications of each choice at hand. Feature requests should be directed to GitHub, and we provided reference material for you to assess, understand how each solution at hand impacts performance, as well as advice on some of the options you may use.
For example once an hour. Counters require separate table. But I anyway need to read it before write. Not a simple increment. And I have only one writer, so it must be fine.
Is it not already sorted? Iâm confused⌠I thought thatâs the point of an index. But index table (MV) is sorted, and allows for X > 0
, which means itâs not a hashmap. You mean that, the result will not be ordered as itâs in that MV. Hm⌠thatâs interesting.
Not actually. But it has that ordering column I want to use. Actually I want multiple ordering columns which are not unique and can be updated.
If I have an MV with the same partition key as the main table, will it be stored on the same node?
And also, can I have an MV to store only top 10 rows per partition? Like by ck
in a descending order.
And also tricky thing would be to combine multiple tables into one using something like MV. Is something like that possible? Because I am not sure if I must store a big set of data in a map
field, keys of which I also want to use as index, so I would prefer a separate table for that, and have it combined with the main table for one of my queries.
And also I would expect the secondary index to work with SELECT * FROM ks.t WHERE pk = ? AND v1 IN(?)
.
For example v1 IN('', 'something')
(where empty string, or âsomethingâ).