This is a question that I often get:
I’m currently evaluating different databases for my application. What is the sweet spot for ScyllaDB?
I’d say it’s a need for HIGH throughput with HIGH cardinality and LOW tail latency (p95/p99 of 15 ms or less, even single-digit milliseconds).
Many times our I/O scheduler will do most of the prioritization for you.
Application not optimized enough? Reach out to our solution architects for advice on better data modeling.
Want more optimization tips? Make sure to check Scylla Monitoring.
Have cardinality problems? Check this out.
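To make the tail-latency target mentioned above concrete, here is a minimal sketch of how p95/p99 can be computed from a batch of request latencies. It uses the nearest-rank definition of a percentile, which is only one of several common definitions, and the sample values are made up for illustration; they are not ScyllaDB measurements.

```python
import math

def percentile(samples_ms, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples_ms)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]

# Hypothetical latencies in milliseconds: mostly fast, a few slow outliers.
latencies_ms = [2.0] * 90 + [8.0] * 8 + [40.0] * 2

p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
mean = sum(latencies_ms) / len(latencies_ms)

print(f"mean={mean:.2f} ms, p95={p95} ms, p99={p99} ms")
```

In this made-up sample the mean looks healthy while the p99 is far above a 15 ms target, which is why the tail percentiles, not the average, are the numbers to watch.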
I agree with what Tomer wrote about performance (High throughput, low latency).
I’d also add to the sweet spot High Availability and Big Data:
High availability, fault tolerance, and disaster recovery: Scylla is designed to be highly available. Data is replicated regardless of geographic location, and there is no single point of failure. This means that your system remains up and running even if something goes wrong. The system is topology-aware, meaning that you can create redundancy using multiple data centers and multiple racks within each data center. An example is Kiwi.com, a popular online travel website running ScyllaDB, during the OVHcloud fire. A fire broke out in a room at the SBG2 data center of OVHcloud, a popular French cloud provider. Within hours the fire had been contained, but not before causing a lot of damage. It knocked out about 3.6 million websites spread across 464,000 domains. However, Kiwi.com kept on running, because it had two other data centers that were able to take over.
A high volume of data: Some large organizations use ScyllaDB to manage petabytes of information while still getting great performance.
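As a rough illustration of the multi-data-center redundancy described above, the sketch below builds a CQL `CREATE KEYSPACE` statement that uses `NetworkTopologyStrategy` with a replication factor per data center. The helper function, the keyspace name, and the data center names are hypothetical; in a real cluster the data center names must match the ones the cluster itself reports, and this is not a description of Kiwi.com’s actual setup.

```python
def create_keyspace_cql(keyspace, replicas_per_dc):
    """Build a CREATE KEYSPACE statement that keeps the given number of
    replicas in each named data center (hypothetical helper)."""
    if not replicas_per_dc:
        raise ValueError("at least one data center is required")
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in replicas_per_dc.items())
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )

# Hypothetical layout: three data centers, three replicas in each.
stmt = create_keyspace_cql(
    "travel",
    {"dc_eu_west": 3, "dc_eu_central": 3, "dc_us_east": 3},
)
print(stmt)
```

With a layout like this, losing one data center still leaves full replica sets in the other two, which is the kind of redundancy the example above relies on.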
There are other reasons that teams choose ScyllaDB:
Avoid vendor lock-in: No one wants to be stuck with one provider. If you’re currently using DynamoDB, ScyllaDB is a great alternative to migrate to (through its DynamoDB-compatible API, project Alternator). It supports the same client SDKs, data modeling, queries, and so on. Unlike DynamoDB, you can deploy it on-premises, on any public cloud, or using ScyllaDB Cloud. Going back to Tomer’s point about performance, you’ll also get way better performance.
Transparent and open-source: ScyllaDB is open-source. This increases the speed of development, innovation, and reliability, while avoiding vendor lock-in (see the point above). It’s also cost-effective. Given enough eyeballs, all bugs are shallow (Linus’s Law).
Reduced costs: ScyllaDB is cost-effective. As an example, DynamoDB is often up to 7x more expensive, while ScyllaDB delivers similar or better performance. Compare costs vs. Astra, Keyspaces, and ScyllaDB Cloud with this cost calculator.
Ease of use: ScyllaDB has auto-tuning capabilities. If you’re coming from Apache Cassandra, you can stop worrying about things like garbage collection and constantly trying to tune the JVM.
High Scalability: Scale horizontally by adding more nodes. No downtime is required. This is valuable if you have many concurrent users and you’re expecting to grow.
Familiar Interfaces: Use either CQL or the DynamoDB API. CQL is similar to SQL, giving users who are comfortable with relational databases a lower barrier to entry.
I’d recommend a relational database and not ScyllaDB if:
- You’re running an application that uses a small amount of data (and you don’t expect it to grow). If one node is enough, don’t use a distributed data system.
- You require strong ACID (atomicity, consistency, isolation, durability) compliance. While you can get high consistency with ScyllaDB, it’s nuanced, and Dynamo-style databases make trade-offs with ACID.
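To give a feel for the nuance in the last point, here is a small sketch of the quorum arithmetic commonly used to reason about consistency in Dynamo-style databases. It assumes the usual rule of thumb that a read is guaranteed to touch at least one replica holding the latest acknowledged write when the number of replicas written plus the number of replicas read exceeds the replication factor. The function names are hypothetical, and this overlap rule is not the same thing as ACID transactions.

```python
def quorum(replication_factor):
    """Size of a majority of replicas."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1

def read_overlaps_write(replication_factor, write_replicas, read_replicas):
    """True when any set of read replicas must share at least one
    replica with any set of write replicas."""
    return write_replicas + read_replicas > replication_factor

rf = 3
q = quorum(rf)

# Quorum writes plus quorum reads always overlap on at least one replica.
quorum_both = read_overlaps_write(rf, q, q)

# Writing to one replica and reading from one replica may miss each other.
one_and_one = read_overlaps_write(rf, 1, 1)

print(f"RF={rf}, quorum={q}, quorum/quorum overlaps={quorum_both}, "
      f"one/one overlaps={one_and_one}")
```

Even when reads and writes overlap, this only covers visibility of a single write; it does not provide isolation or atomicity across multiple rows, which is where a relational database is the safer choice.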