Consistency Level Issues

Hi,
My cluster is a multi-DC setup with 4 nodes and a replication factor (RF) of 2. I’m using the default consistency level (ONE) for both read and write operations. However, for a very small portion of reads (less than 0.1%), I’m seeing inconsistencies: the same request sometimes returns a null value and other times a non-null value.

Although I am using consistency level ONE, I expect Scylla to be eventually consistent. But I still see this inconsistent read behaviour 5-6 hours after the data was written. Is there a maximum time after which read consistency can be guaranteed? And what is the recommended consistency level for production systems? My use case requires the data to be consistent within 1-2 hours of the write.

Are you using RF of 2 per DC?

With CL=ONE, if a write request (mutation) is lost somewhere between the coordinator and a replica, the coordinator will still return a success ack based on the one replica that did succeed.
This means one replica still needs to be updated, and the client has no way to know that. In this case, a read with CL=ONE might return a stale value.
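
A toy model of that scenario, as a rough sketch only. This is plain Python, not the Scylla driver; the `Replica` class and the `write_cl_one` / `read_cl_one` helpers are made up for illustration, and real coordinators do much more than this.

```python
class Replica:
    """Hypothetical in-memory stand-in for one replica node."""
    def __init__(self):
        self.data = {}

def write_cl_one(replicas, key, value, lost=()):
    """Send the write to every replica; `lost` holds the indexes of replicas
    the mutation never reaches. Returns True once at least one replica
    has applied it, which is all CL=ONE requires."""
    acks = 0
    for i, replica in enumerate(replicas):
        if i in lost:
            continue  # mutation dropped on the way to this replica
        replica.data[key] = value
        acks += 1
    return acks >= 1

def read_cl_one(replicas, key, replica_index):
    """A CL=ONE read consults a single replica, whichever one is picked."""
    return replicas[replica_index].data.get(key)

replicas = [Replica(), Replica()]            # RF=2
ok = write_cl_one(replicas, "k", "v", lost={1})
print(ok)                                    # True: the write is acknowledged
print(read_cl_one(replicas, "k", 0))         # v
print(read_cl_one(replicas, "k", 1))         # None: this replica missed the write
```

Depending on which replica serves the read, the same request returns either the value or null, which matches the symptom described in the question.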

Only a repair operation will bring all the replicas back into sync.
You can mitigate this issue by using RF=3 and CL=QUORUM for both reads and writes.
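
The reasoning behind that recommendation is the usual overlap rule: a read is guaranteed to touch at least one replica that acknowledged the write when write acks plus read replicas exceed RF. A small sketch of the arithmetic, assuming a single DC for simplicity (the helper names are made up for illustration):

```python
def quorum(rf):
    """Replicas needed for QUORUM: a strict majority of RF."""
    return rf // 2 + 1

def read_sees_write(rf, write_acks, read_replicas):
    """True when every possible read set overlaps every acknowledged write set."""
    return write_acks + read_replicas > rf

print(read_sees_write(2, 1, 1))                   # False: RF=2 with CL=ONE for both
print(quorum(3))                                  # 2
print(read_sees_write(3, quorum(3), quorum(3)))   # True: RF=3 with QUORUM for both
```

Note that with RF=2, QUORUM requires both replicas, so a single node being down would block requests; RF=3 lets QUORUM tolerate one unavailable replica.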

More here
