Originally from the User Slack
@Pranit_Chugh: Hi Community, I am trying to set up a 3-node ScyllaDB cluster in production on i4i instances. I am a little apprehensive about using ephemeral storage as the primary disk.
I have read @Felipe_Cardeneti_Mendes's answer to this here: https://www.scylladb.com/2023/07/17/top-mistakes-with-scylladb-storage/ (section "Locally-attached disk mythbusting").
But I'm still not sure: if a node goes down, or is stopped and started, what are my options for data recovery other than relying on replication?
Also, would you suggest attaching an EBS disk to the i4i for backup? If yes, how would that work: reads and writes go through the NVMe SSD and backups go to EBS? Or is that not possible/recommended?
If attaching EBS alongside the NVMe disk wouldn't make use of the i4i's capabilities, what would be the next best instance type for ScyllaDB with persistent storage?
@Botond_Dénes: I don’t know about AWS best practices, but if you have a ScyllaDB cluster with 3 nodes, create your keyspace with RF=3, and read/write with CL=QUORUM, your data should be safe even if a node goes down. Don’t forget to repair regularly.
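A rough illustration of the arithmetic behind the RF=3 / CL=QUORUM advice: a quorum is a majority of the replicas, so with RF=3 two replicas must answer and one may be down. This is a sketch, not ScyllaDB code; the function names, the keyspace name `app`, and the datacenter name `dc1` are made up for the example.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond for CL=QUORUM (a majority)."""
    return replication_factor // 2 + 1


def tolerated_replica_failures(replication_factor: int) -> int:
    """Replicas that can be unavailable while QUORUM requests still succeed."""
    return replication_factor - quorum(replication_factor)


def create_keyspace_cql(keyspace: str, datacenter: str, replication_factor: int) -> str:
    """Build a CQL statement for a keyspace with the given RF.

    The keyspace and datacenter names passed in are hypothetical placeholders.
    """
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
        f"WITH replication = {{'class': 'NetworkTopologyStrategy', "
        f"'{datacenter}': {replication_factor}}};"
    )


if __name__ == "__main__":
    rf = 3
    print(f"RF={rf}: quorum={quorum(rf)}, tolerated failures={tolerated_replica_failures(rf)}")
    print(create_keyspace_cql("app", "dc1", rf))
```

With RF=3 this gives a quorum of 2 and one tolerated replica failure. Losing two of three nodes at once would make QUORUM reads and writes fail, which is why backups still matter on top of replication.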
@Pranit_Chugh: @Botond_Dénes thanks for taking the time to answer the question. But this doesn’t give me full clarity about data backup.
With EBS disks, even if all nodes go down, I can replace the nodes, reattach the EBS volumes, and be good to go.
But with ephemeral storage I'm not sure what the recommended path for backup is.
@Botond_Dénes: You can use Scylla Manager to do backups. It is free for a small number of nodes, AFAIK.
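For reference, Scylla Manager backups are scheduled with its `sctool backup` command, which uploads snapshots to object storage such as S3. A minimal sketch of assembling such a command is below. The cluster name, bucket name, and retention value are hypothetical, and the flags should be checked against the documentation for your Scylla Manager version. The script only prints the command and does not run it.

```python
import shlex
from typing import List


def build_backup_command(cluster: str, location: str, retention: int) -> List[str]:
    """Assemble an `sctool backup` invocation as an argument list.

    cluster:   cluster name or ID as registered in Scylla Manager (hypothetical here)
    location:  backup target in '<provider>:<bucket>' form, e.g. 's3:my-bucket'
    retention: how many backups to keep
    """
    return [
        "sctool", "backup",
        "--cluster", cluster,
        "--location", location,
        "--retention", str(retention),
    ]


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    cmd = build_backup_command("prod-cluster", "s3:my-scylla-backups", 7)
    print(shlex.join(cmd))
```

Because the backups live in object storage rather than on the nodes, they survive the loss of the ephemeral NVMe disks, including the case where every node is replaced.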
@Stewart: @Pranit_Chugh
The recommended architecture for ScyllaDB is to run on instances with local SSDs, with reliability ensured by backups and the replication factor.
If you are interested in using EBS and local SSDs together for greater reliability, check out this related article:
https://discord.com/blog/how-discord-supercharges-network-disks-for-extreme-low-latency
It details the work done at Discord to RAID local SSDs with network-attached disks (their equivalent of EBS) to increase reliability.
@Pranit_Chugh: Thanks this is a beautifully written article and tells exactly what I wanted to know.
@Stewart: Additionally, we are currently using the i4i type. It is very fast and gives much better ScyllaDB performance than the i3en type. Unfortunately, it has smaller disks than i3en, but it is the one to pick when you need very high performance.
We are also working on how to RAID EBS and local SSDs on Kubernetes, based on that article.