Is there a way to know in advance if my data will be available in case of node failure, with different values of Replication Factor (RF), Cluster Size, Consistency Level (CL), and so on?
The short answer is that you can use the Consistency Calculator.
To elaborate, high-availability (HA) databases such as ScyllaDB are designed for continuous operation, so that applications can remain “always-on” even when individual components fail.
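ScyllaDB achieves high availability by eliminating single points of failure, implementing failover mechanisms, and being topology-aware. Being topology-aware means that replicas can be placed in different racks and datacenters, so the cluster can keep serving requests even if an entire rack or datacenter fails, as long as the remaining replicas can satisfy the requested Consistency Level.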
Data is automatically replicated across multiple nodes. For example, a Replication Factor (RF) of three (RF=3) means that each piece of data is replicated to three different nodes.
Choosing RF and CL involves tradeoffs between performance, consistency, availability, and partition tolerance. See also the related PACELC theorem.
The Consistency Calculator will help you determine the appropriate values for your specific use case.
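To make the relationship between RF and CL concrete, here is a minimal Python sketch of the replica arithmetic for a single datacenter. It is not the Consistency Calculator's implementation, and the function names (`replicas_required`, `tolerated_replica_failures`, `overlapping_reads_and_writes`) are hypothetical helpers, not part of any ScyllaDB API. It assumes the standard rule that QUORUM needs floor(RF/2) + 1 replicas, and it ignores multi-datacenter levels such as LOCAL_QUORUM and EACH_QUORUM.

```python
def replicas_required(cl: str, rf: int) -> int:
    """Replicas that must respond for a request at consistency level `cl`.

    Assumption: a single datacenter with replication factor `rf`.
    """
    if rf < 1:
        raise ValueError("rf must be at least 1")
    levels = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": rf // 2 + 1,  # majority of the replicas
        "ALL": rf,
    }
    key = cl.upper()
    if key not in levels:
        raise ValueError(f"unsupported consistency level: {cl}")
    needed = levels[key]
    if needed > rf:
        raise ValueError(f"CL={key} needs {needed} replicas but RF is only {rf}")
    return needed


def tolerated_replica_failures(cl: str, rf: int) -> int:
    """How many replicas of one partition can be down while `cl` still succeeds."""
    return rf - replicas_required(cl, rf)


def overlapping_reads_and_writes(read_cl: str, write_cl: str, rf: int) -> bool:
    """True when every read set intersects every write set (R + W > RF)."""
    return replicas_required(read_cl, rf) + replicas_required(write_cl, rf) > rf


# Example: compare consistency levels for RF=3.
for cl in ("ONE", "QUORUM", "ALL"):
    print(cl, "tolerates", tolerated_replica_failures(cl, 3), "replica failure(s)")
```

With RF=3, QUORUM keeps working with one replica of a partition down, ONE with two, and ALL with none. These counts refer to the replicas of a given partition, not to arbitrary nodes in a larger cluster, so the effect of a node failure also depends on cluster size and replica placement.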
Resources:
- High Availability lesson on ScyllaDB University and blog post