Originally from the User Slack
@Daria_Fedorova: Hi! I need some help (maybe scylla university lesson link)
I noticed that the documentation on nodetool repair recommends the option -pr (--partitioner-range), but I do not understand what it means.
The web doc describes it like this:
--partitioner-range
executes a repair only on the primary replica returned by the partitioner.
But at my level of understanding, the partitioner is basically a hash function from partition key to token. I do not know what a primary replica is in this context.
@Botond_Dénes: With the typical RF=3 replication factor, for each partition range you have a primary replica and 2 additional replicas. The primary replica is simply the first in the list returned by get_replicas_for_token(t). The ordering in this list is stable.
The primary replica is not special in any way. That said, there are some cases, like repair -pr, where it has a role to play.
If you have a 3-node cluster and RF=3, calling nodetool repair on each node will repair all data 3 times, because nodetool repair repairs all ranges that the node is a replica of. With nodetool repair -pr, only ranges for which this node is the primary replica are repaired. Since there is just one primary replica for each range, using -pr guarantees that each range is repaired exactly once, provided you call nodetool repair -pr on each node in the cluster.
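To make the counting argument concrete, here is a minimal Python sketch of a toy token ring. It is not ScyllaDB code: the ring layout, the node names, and the helper names (replicas_for_token, repaired_ranges, repair_counts) are all hypothetical, and replica selection is assumed to be a simple clockwise walk that collects distinct nodes. It only illustrates why repairing every node without -pr covers each range RF times, while -pr covers each range once.

```python
from bisect import bisect_left
from collections import Counter

# Hypothetical toy ring: (token, owning node), sorted by token.
RING = sorted([(0, "A"), (25, "B"), (50, "C"), (75, "A"), (100, "B"), (125, "C")])
RF = 3  # replication factor assumed in the thread


def replicas_for_token(token, ring=RING, rf=RF):
    """Walk the ring clockwise from the token and collect rf distinct nodes.

    The first entry plays the role of the primary replica.
    """
    tokens = [t for t, _ in ring]
    start = bisect_left(tokens, token) % len(ring)  # wrap past the last token
    replicas = []
    for i in range(len(ring)):
        node = ring[(start + i) % len(ring)][1]
        if node not in replicas:
            replicas.append(node)
        if len(replicas) == rf:
            break
    return replicas


def repaired_ranges(node, primary_only):
    """Ranges (identified by their end token) that a repair on `node` covers."""
    covered = []
    for end_token, _ in RING:
        replicas = replicas_for_token(end_token)
        if primary_only:
            if replicas[0] == node:  # -pr: only ranges where node is primary
                covered.append(end_token)
        elif node in replicas:  # no -pr: every range the node replicates
            covered.append(end_token)
    return covered


def repair_counts(primary_only):
    """Run the repair on every node and count how often each range is covered."""
    counts = Counter()
    for node in sorted({n for _, n in RING}):
        counts.update(repaired_ranges(node, primary_only))
    return counts


if __name__ == "__main__":
    print("without -pr:", dict(repair_counts(primary_only=False)))
    print("with -pr:   ", dict(repair_counts(primary_only=True)))
```

In this toy model every range is counted 3 times without -pr and exactly once with -pr, which matches the explanation above.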
@Daria_Fedorova: thank you