What does the typical migration process from (Cassandra/DynamoDB/Mongo) look like? How long does it take?
This is a very broad question (you can learn more in a dedicated Scylla University lesson).
You need to consider whether you’ll be doing a Hot/Live or a Cold/Offline migration.
There’s the live traffic aspect and the historical data aspect.
If you’ll be doing a hot (live) migration, here is the sequence you should follow:
- Deploy the destination cluster (ScyllaDB, with the proper schema/data modeling)
- Live traffic: Enable dual writes
- Historical Data forklift/migration
- Verification period: dual reads (source is still the source of truth)
- Cutover to the destination cluster (as the source of truth)
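The sequence above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Store`, `DualWriter` and `forklift` are hypothetical names, and the in-memory dictionaries stand in for real driver sessions against the source and destination clusters. It shows dual writes, a forklift of historical rows that does not overwrite newer dual-written data, and dual reads that record mismatches while the source remains the source of truth.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """In-memory stand-in for a database session (hypothetical, for illustration)."""
    name: str
    rows: dict = field(default_factory=dict)

    def write(self, key, value):
        self.rows[key] = value

    def read(self, key):
        return self.rows.get(key)


class DualWriter:
    """Sends live writes to both clusters and compares reads during verification."""

    def __init__(self, source, destination):
        self.source = source
        self.destination = destination
        self.failed_writes = []  # keys whose destination write failed
        self.mismatches = []     # keys where the two clusters disagreed on read

    def write(self, key, value):
        # The source is still the source of truth, so it is written first.
        self.source.write(key, value)
        try:
            self.destination.write(key, value)
        except Exception:
            # A destination failure is recorded for later repair instead of
            # failing the application request.
            self.failed_writes.append(key)

    def read(self, key):
        # Dual read: compare both answers, but always return the source's.
        expected = self.source.read(key)
        if self.destination.read(key) != expected:
            self.mismatches.append(key)
        return expected


def forklift(source, destination):
    """Copy historical rows, skipping keys already written by dual writes.

    Simplification: real tools typically rely on write timestamps rather than
    a "skip if present" rule to keep old data from overwriting newer writes.
    """
    copied = 0
    for key, value in source.rows.items():
        if key not in destination.rows:
            destination.write(key, value)
            copied += 1
    return copied


if __name__ == "__main__":
    src, dst = Store("source"), Store("destination")
    src.write("a", 1)            # historical data, written before dual writes
    writer = DualWriter(src, dst)
    writer.write("b", 2)         # live traffic goes to both clusters
    print("rows copied by forklift:", forklift(src, dst))
    print("read a:", writer.read("a"), "mismatches:", writer.mismatches)
```

Once the mismatch list stays empty for the whole verification period, cutover amounts to pointing reads (and then writes) at the destination only.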
Now let’s talk about historical data migration.
The following tools can be used to migrate historical data from Cassandra:
- sstableloader (be sure to use the sstableloader that comes with Scylla, not the Cassandra one)
- CQLSH COPY TO / FROM (export to CSV, mostly for up to 2M rows)
- dsbulk, a tool similar to COPY TO / FROM, which has some optimizations that COPY TO / FROM doesn’t have
- Copy over the sstable files and use nodetool refresh
- Scylla Migrator (Spark based) - full scan/cql reads on the source and cql writes to the destination.
The following tools can be used to migrate historical data from DynamoDB to ScyllaDB (CQL API / DynamoDB compatible API, a.k.a. Alternator):
- Scylla Migrator
- DynamoDB Streams
In order to migrate from MongoDB, you’ll first need to design your schema / data model. A nice tool that can help you with that is Hackolade.
As for data migration tools:
- Export data to CSV and import to Scylla
- Scylla Migrator
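For the CSV route, the nested documents have to be flattened into the columns of the table you designed. Below is a minimal sketch of that step, using only the Python standard library. The function names, the underscore-joined column naming, and the choice to serialize lists as JSON strings are all assumptions made for illustration; the real mapping depends on your data model and on the column types of the target table.

```python
import csv
import io
import json


def flatten(doc, parent="", sep="_"):
    """Turn a nested document into a flat dict, e.g. address.city -> address_city."""
    flat = {}
    for key, value in doc.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        elif isinstance(value, list):
            # Assumption: lists are kept as JSON text; adapt this to the
            # collection type of the destination column.
            flat[name] = json.dumps(value)
        else:
            flat[name] = value
    return flat


def documents_to_csv(docs, columns):
    """Write documents as CSV with a fixed column order matching the target table."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=columns,
        restval="",             # fields missing from a document become empty
        extrasaction="ignore",  # fields not in the schema are dropped
    )
    writer.writeheader()
    for doc in docs:
        writer.writerow(flatten(doc))
    return buf.getvalue()


if __name__ == "__main__":
    docs = [
        {"_id": "u1", "name": "Ann", "address": {"city": "Oslo"}},
        {"_id": "u2", "name": "Bob"},
    ]
    print(documents_to_csv(docs, ["_id", "name", "address_city"]))
```

Fixing the column list up front keeps every row aligned with the schema even when individual documents have missing or extra fields.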
Happy Migration!
Thanks @Tomersan! Very helpful.
I’ll add that if you’re doing a migration, get in touch with us, we’re happy to help.
Also, in the upcoming LIVE event, we’ll have a dedicated session on migrating from DynamoDB.