View in #general on Slack
@Hartmut: Hi, which would be the recommended/preferred query pattern or anti-pattern?
(A vs. B)
CREATE TABLE test (
    pk text,
    val1 text,
    val2 text,
    PRIMARY KEY (pk)
);
INSERT INTO test (pk, val1, val2) VALUES ('a', 'foo', 'bar');
INSERT INTO test (pk, val1, val2) VALUES ('b', 'some', 'data');
INSERT INTO test (pk, val1, val2) VALUES ('c', 'scylla', 'db');
-- A) individual queries (in parallel, pooled)
SELECT * FROM test WHERE pk='a';
SELECT * FROM test WHERE pk='c';
-- B) `IN ()`
SELECT * FROM test WHERE pk IN ('a', 'c');
B) obviously isn’t shard-aware but needs to be orchestrated
I guess it may depend on the actual use case, how many rows are to be fetched and so on…
But still, I wonder if anyone has any experience or insights to share…?
@avi: Individual queries are generally better. You're moving some of the coordination from the server to the client, which is more easily scaled. The single IN query cannot be made shard/token aware, so you pay with an extra hop.
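To make pattern A concrete, here is a minimal Python sketch of the client-side fan-out: one single-partition query per key, run in parallel. It is an illustration under assumptions, not a driver-specific recipe. `fetch_many`, `run_query`, and `max_in_flight` are hypothetical names; `run_query` stands in for whatever call your driver offers for executing a prepared statement with bound parameters.

```python
from concurrent.futures import ThreadPoolExecutor

# One statement for every key; with a real driver this would be prepared once.
QUERY = "SELECT * FROM test WHERE pk = ?"


def fetch_many(run_query, keys, max_in_flight=32):
    """Run one single-partition query per key, in parallel.

    run_query is a hypothetical callable (cql, params) -> rows. With a real
    driver it would wrap the session's execute call for a prepared statement.
    Returns a dict mapping each distinct key to the rows returned for it.
    """
    keys = list(dict.fromkeys(keys))  # drop duplicate keys, keep first-seen order
    if not keys:
        return {}
    workers = min(max_in_flight, len(keys))  # cap the number of requests in flight
    with ThreadPoolExecutor(max_workers=workers) as pool:
        rows = pool.map(lambda pk: run_query(QUERY, (pk,)), keys)
        # pool.map yields results in the same order as keys
        return dict(zip(keys, rows))
```

Because each request carries exactly one partition key, a shard/token-aware driver can route it straight to the owning node and shard, which is the hop the single IN query cannot save.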
@Hartmut:
CREATE TABLE test2 (
    pk text,
    ck text,
    val1 text,
    PRIMARY KEY (pk, ck)
);
SELECT * FROM test2 WHERE pk='a' AND ck IN ('x', 'y');
Conversely, when the IN is on the clustering key within a single, specific partition, it should be perfectly valid, correct?
@avi: Yes, in this case IN is preferable
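Putting the two answers together: when a client needs many (pk, ck) pairs, one way to apply this is to group them by partition key and issue a single `ck IN (...)` query per partition, so each request still targets exactly one partition. The sketch below only builds the statements and their bind parameters. `build_partition_queries` is a hypothetical helper, and the `?` placeholder style is an assumption about the driver.

```python
from collections import defaultdict


def build_partition_queries(keys):
    """Group (pk, ck) pairs by pk and build one IN query per partition.

    Returns a list of (cql, params) tuples, one per distinct pk, in
    first-seen order. Each query touches a single partition of test2.
    """
    by_pk = defaultdict(list)
    for pk, ck in keys:
        if ck not in by_pk[pk]:  # skip duplicate clustering keys
            by_pk[pk].append(ck)

    queries = []
    for pk, cks in by_pk.items():
        placeholders = ", ".join("?" for _ in cks)
        cql = f"SELECT * FROM test2 WHERE pk = ? AND ck IN ({placeholders})"
        queries.append((cql, (pk, *cks)))
    return queries
```

For example, `build_partition_queries([("a", "x"), ("a", "y"), ("b", "x")])` yields two statements, one for partition `a` with two clustering keys and one for partition `b` with one. The per-partition statements can then be sent in parallel like the individual queries in pattern A.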