Replication Strategy Change from SimpleStrategy to NetworkTopologyStrategy

Hi, I have several Scylla clusters that have been running in production for multiple years. I realized last year that we still use the default SimpleStrategy for replication on all of our data. This is obviously a bad choice for production, and I have switched over what keyspaces I could using the recommended procedure.

The problem, however, is that our largest keyspaces are hundreds of TBs, and some exploratory repairs give me an estimate of ~40 hours for the whole procedure. My estimates for the smaller keyspaces ran over by almost 2x, so this could take up to 80 hours. I can't take that long a downtime, because it would require a full company outage. So, I'm exploring other options.

The main alternatives I've been able to come up with are not great, so I'm hoping there is some Scylla feature I'm missing that would make this easier, or some other method to accomplish it:

  • Create whole new cluster and move data over
    • Doubles cluster cost until completed
    • Requires mirroring the data, likely with a custom app, since in any normal situation multi-datacenter replication would handle this
    • Still requires a downtime at the end for a final sync, but likely smaller window
    • All apps can be switched over at once, fairly easily
  • Create new keyspaces on the current cluster, copy data over
    • Requires 2x the disk space
    • Process would be something like this:
      • Create new keyspace with correct settings (sketched after this list)
      • Snapshot old keyspaces
      • Copy snapshots into new keyspace
      • Run repairs
    • My assumption is that I would still need the repairs from the recommended procedure, but I could run them while the original keyspace is still in use
    • If that is true, it still requires a large final repair to finalize, so it may run over on time as well

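To make the second option a bit more concrete, here is a rough sketch of the "create new keyspace with correct settings" step using the Python driver; the keyspace name, datacenter name, and replication factor below are placeholders, not values from our cluster:

```python
from cassandra.cluster import Cluster

cluster = Cluster(["10.0.0.1"])   # any contact point in the cluster
session = cluster.connect()

# Keyspace name, datacenter name ('dc1') and RF=3 are placeholders; use the
# datacenter names reported by `nodetool status` and your intended RF.
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS my_keyspace_nts
    WITH replication = {
        'class': 'NetworkTopologyStrategy',
        'dc1': 3
    }
""")
cluster.shutdown()
```
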
In addition to my ideas here, I got the following from Felipe on Slack:

  • Update writes to CL = ALL, run the initial repair, accept the reduced performance until the strategy switch, then change reads to CL = ALL until all data has been repaired again (see the sketch after this list)
    • My thought here is that this could alternatively cut the downtime roughly in half: the initial repair runs live and keeps the data consistent, leaving downtime only for the strategy switch and the final repair
  • Migrate to a new cluster with scylla-migrator (or load-and-stream in 5.1), write to both clusters, and switch over whenever possible
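
For the CL = ALL idea, this is roughly how I picture it with the Python driver; the keyspace, table, and column names are made up for illustration:

```python
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(["10.0.0.1"])
session = cluster.connect("my_keyspace")   # hypothetical keyspace/table

# Phase 1: writes at CL=ALL so every replica gets each write while the
# initial repair runs in the background.
write = SimpleStatement(
    "INSERT INTO my_table (id, value) VALUES (%s, %s)",
    consistency_level=ConsistencyLevel.ALL,
)
session.execute(write, (42, "example"))

# Phase 2 (after the strategy switch): reads at CL=ALL until all data has
# been repaired again.
read = SimpleStatement(
    "SELECT value FROM my_table WHERE id = %s",
    consistency_level=ConsistencyLevel.ALL,
)
row = session.execute(read, (42,)).one()
cluster.shutdown()
```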

What is the nodetool status of that problematic cluster? (You can also ping it to me (Lubos) on Slack to keep it private.)

Lots of things can be done (or can be skipped) if the layout of the cluster is prepared for the switch.

So Garrett has 3 racks, lots of nodes, most nodes with only 15% disk free, and is using Scylla 4.5 with STCS as the compaction strategy.
Garrett also uses nodetool repair -pr for repairs.

In such a case, having Scylla Enterprise with ICS and Scylla Manager (SM) would mitigate the situation a bit.

Anyhow, Felipe's idea is safe.

If, however, you feel adventurous, you can go the route of repairing the cluster first; then, if you have SM, you can resume the repair after writes are paused (or run the repair anew).
Then, after the ALTER, if you resume writes they should be consistent.
But reads will still be inconsistent: roughly 11% of the data will be at RF=1 until you repair it.
So until that repair is done, reads will be eventually consistent despite using QUORUM.
If you can live with that for the duration of the second repair, the write downtime can ideally be limited to a single repair (or to a resume from SM).

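For the ALTER step, something roughly like this; keyspace, datacenter name, and RF are placeholders, use the ones from nodetool status and your current RF:

```python
from cassandra.cluster import Cluster

cluster = Cluster(["10.0.0.1"])
session = cluster.connect()

# Switch the replication strategy in place; do this only after the first
# repair, with writes paused, as described above.
session.execute("""
    ALTER KEYSPACE my_keyspace
    WITH replication = {
        'class': 'NetworkTopologyStrategy',
        'dc1': 3
    }
""")
cluster.shutdown()
```
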
The repair can be sped up by using NullCompactionStrategy, assuming there is enough disk space to accommodate the streamed repair data and the writes for the duration of the repair, which with only 15% free won't fly.

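Roughly like this, assuming your Scylla version lets you set NullCompactionStrategy per table (keyspace and table names are placeholders):

```python
from cassandra.cluster import Cluster

cluster = Cluster(["10.0.0.1"])
session = cluster.connect("my_keyspace")   # placeholder keyspace/table

# Disable compaction on the table for the duration of the repair.
session.execute(
    "ALTER TABLE my_table WITH compaction = {'class': 'NullCompactionStrategy'}"
)

# ... run the repair here ...

# Switch back to the original strategy afterwards (STCS in this case).
session.execute(
    "ALTER TABLE my_table "
    "WITH compaction = {'class': 'SizeTieredCompactionStrategy'}"
)
cluster.shutdown()
```
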
Anyhow, I am curious how Garrett will decide 🙂