Installation details
ScyllaDB version: master
Cluster size: 1
OS (RHEL/CentOS/Ubuntu/AWS AMI): CentOS
I noticed that when a view update is generated, the code chooses between create_entry and update_entry based on the existing and update data of the base table. When both existing and update are present and live, update_entry is called. update_entry computes the difference between existing and update, whereas create_entry directly writes the entire update data.
void view_updates::generate_update(
        data_dictionary::database db,
        const partition_key& base_key,
        const clustering_or_static_row& update,
        const std::optional<clustering_or_static_row>& existing,
        gc_clock::time_point now) {
    // ... (elided)
    if (existing && existing->is_live(*_base)) {
        if (update.is_live(*_base)) {
            update_entry(db, base_key, update, *existing, now);
        } else {
            delete_old_entry(db, base_key, *existing, update, now);
        }
    } else if (update.is_live(*_base)) {
        create_entry(db, base_key, update, now);
    }
    return;
}
// ... (elided)
Suppose there is a 3-node cluster with a replication factor of 2, where nodes A and B are the two replicas for a given piece of data. Replica A holds data 1 and 2, and replica B holds data 1 and 3. Running repair on node A then produces the merged data 1, 2 and 3. update_entry is used when generating the view update; it computes the diff and gets 3. Node A's view table already held data 1 and 2, so once the newly generated 3 is added, a read returns 1, 2 and 3.
But suppose that base table data 1 and 2 on node A failed to propagate to the view for some reason. Then a read of the view table returns only 3. Assume also that the index table has not been repaired for this token in the meantime. Alternator reads the index table at consistency level ONE, and read repair is not triggered at consistency level ONE, so the missing data is not restored by the read.
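To make the two cases concrete, here is a minimal sketch of the scenario. It is not ScyllaDB code: data_set, diff_write and view_after_repair are hypothetical names, the data is modeled as a plain set of integers, and timestamps, tombstones and TTLs are ignored. It only assumes that update_entry writes the difference between existing and update, while create_entry writes the whole update.

```cpp
#include <algorithm>
#include <iterator>
#include <set>

// Hypothetical stand-in for the data held by a replica (not a ScyllaDB type).
using data_set = std::set<int>;

// Assumed behaviour of update_entry: write only what is in `update`
// but not in `existing`.
data_set diff_write(const data_set& existing, const data_set& update) {
    data_set out;
    std::set_difference(update.begin(), update.end(),
                        existing.begin(), existing.end(),
                        std::inserter(out, out.end()));
    return out;
}

// Contents of the view replica after repair merged `base_update` into the base.
// `use_diff == true` models update_entry, `false` models create_entry.
data_set view_after_repair(data_set view_before,
                           const data_set& base_existing,
                           const data_set& base_update,
                           bool use_diff) {
    const data_set written =
        use_diff ? diff_write(base_existing, base_update) : base_update;
    view_before.insert(written.begin(), written.end());
    return view_before;
}
```

With a view that already holds 1 and 2, the diff path ends up with 1, 2 and 3. With a view that never received 1 and 2, the diff path leaves only 3, while writing the whole update would leave 1, 2 and 3.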
Since index tables in ScyllaDB only support eventual consistency in Alternator, this is not a problem in itself. However, could update_entry be replaced with create_entry? That would reduce how often this kind of intermediate data is produced. What is the purpose of using update_entry? Is it to save some space overhead?