Question:
According to the official documentation, among the restrictions on counter columns in ScyllaDB
are:
- The only other columns in a table with a counter column can be columns of the primary key (which cannot be updated). No other kinds of column can be included. This limitation safeguards correct handling of counter and non-counter updates by not allowing them in the same operation.
- All non-counter columns in the table must be part of the primary key.
- Counter columns, on the other hand, may not be part of the primary key.
In my table design I have hundreds of rows where each row needs to be linked to a counter column.
The table has the following structure:
K1 K2 C1 C2 C3 V1 V2 V3 COUNTER
where the K columns form the partition key, the C columns form the clustering key, the V columns hold regular values, and COUNTER is a counter column.
Generally, the above table is queried as follows:
SELECT * FROM table WHERE K1 = A AND K2 = B AND C1 = X AND C2 = Y AND C3 = Z;
This results in about 500 rows being returned per query. As shown, each row is linked to a COUNTER column; specifically, each combination of K1, K2, C1, C2, C3 is linked to a different COUNTER value.
How am I supposed to model this table if the COUNTER has to be moved to an entirely different table?
If I understand correctly, the only way to do this is to define another table → table_counter without any of the value (V) columns:
K1 K2 C1 C2 C3 COUNTER
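Sketched in CQL, the split would look something like this (table names and column types are placeholders, since my real schema uses different ones):

```cql
-- Original table, minus the counter (types are placeholders)
CREATE TABLE my_table (
    k1 text, k2 text,
    c1 text, c2 text, c3 text,
    v1 text, v2 text, v3 text,
    PRIMARY KEY ((k1, k2), c1, c2, c3)
);

-- Companion counter table: only the primary-key columns plus the counter
CREATE TABLE my_table_counter (
    k1 text, k2 text,
    c1 text, c2 text, c3 text,
    cnt counter,
    PRIMARY KEY ((k1, k2), c1, c2, c3)
);
```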
However, I have several issues with this approach:
1) It seems extremely inelegant to break up a logically cohesive table like this
2) It means that whenever I want to execute the previous query, I would essentially need to execute two queries instead of one, just to get the counter information linked to each K1 K2 C1 C2 C3 combination
3) I would also be forced to combine the results of the two queries above into a single data structure on the client side (for it to be useful)
Is this correct? If yes, is there an alternative to the COUNTER column that would let me keep the counter in the first table?
One approach I was thinking of is to use a regular INTEGER column in place of the counter. Whenever the counter needs to be updated, I read the current integer value, increment it on the client side, and write the new value back to the database.

I understand that this does not protect me from concurrent reads/writes: if two clients read the counter value at the same time and both increment and write it back, one update will be lost (only the last write is preserved). However, I can live with the occasional lost write, as we are not tracking anything critical where a single (or a handful of) lost writes would make a major difference.

I also understand that each update would then require a read followed by a write (two operations). On the other hand, it allows me to keep the counter column as part of the original table, reduces the querying from two tables to one, and removes the need to combine the results of two queries on the client side. It seems more efficient than using a counter column, and much more elegant.
Is this approach a viable alternative to the COUNTER column? Are there any pitfalls I missed? Is there another approach that might work better in my example?
Thanks