Installation details #ScyllaDB version: all #Cluster size: os (RHEL/CentOS/Ubuntu/AWS AMI):
I was recently reading the repair code for row_level_repair and saw that it has a function called read_rows_from_disk. That made me wonder: does it also read data from the memtable, or only from sstables?
Here is an example: a five-node cluster with 3 replicas and consistency level QUORUM is running repair. Suppose that at the same time a write has already been flushed to disk (an sstable) on node-1 but is still in the memtable on the other two replicas. In this situation, how does repair read the data?
If repair only reads data from sstables, will it conclude that the write is missing on the other two nodes?
Thanks a lot for your reply. I also see that the data is read into a cache with a limit and then hashed. So, if an item is stored in several sstables, how does repair compare the difference?
Data from the different SSTables and memtables is merged into a single unified stream, and this is what is stored in the repair buffers. This buffer is then compared against the unified stream on the other replicas.
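
To make the idea concrete, here is a minimal Python sketch of merging several sorted sources into one stream and hashing the result. It is a toy model, not ScyllaDB's actual code: the names `merged_stream` and `stream_hash`, the `(key, timestamp, value)` row layout, the "highest timestamp wins" rule, and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import heapq
from itertools import groupby
from operator import itemgetter


def merged_stream(*sources):
    """Merge key-sorted sources (sstables or memtables) into one stream.

    Each source is a list of (key, timestamp, value) rows sorted by key.
    When several sources hold the same key, the row with the highest
    timestamp is kept (an assumed last-write-wins rule).
    """
    merged = heapq.merge(*sources, key=itemgetter(0))
    for _, versions in groupby(merged, key=itemgetter(0)):
        yield max(versions, key=itemgetter(1))


def stream_hash(rows):
    """Hash the reconciled rows; SHA-256 is a stand-in for the real hash."""
    h = hashlib.sha256()
    for row in rows:
        h.update(repr(row).encode())
    return h.hexdigest()


# node-1: the new write to "a" has already been flushed to a second sstable.
node1_sources = [
    [("a", 1, "old"), ("b", 2, "x")],  # older sstable
    [("a", 5, "new")],                 # newly flushed sstable
]
# node-2: the same write is still only in the memtable.
node2_sources = [
    [("a", 1, "old"), ("b", 2, "x")],  # sstable
    [("a", 5, "new")],                 # memtable
]

node1_rows = list(merged_stream(*node1_sources))
node2_rows = list(merged_stream(*node2_sources))
hashes_match = stream_hash(node1_rows) == stream_hash(node2_rows)
```

In this toy model both nodes produce the same reconciled rows, so their hashes match even though the write lives in an sstable on one node and in a memtable on the other. An item spread over several sstables is collapsed into one row before hashing.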