Migrating to ScyllaDB from Cassandra, MongoDB, or DynamoDB

What does the typical migration process from (Cassandra/DynamoDB/Mongo) look like? How long does it take?

This is a very broad question (you can learn more in the dedicated Scylla University lesson).

You need to consider whether you’ll be doing a Hot/Live or a Cold/Offline migration.
There’s the live traffic aspect and the historical data aspect.

If you’ll be doing a hot (live) migration, here is the sequence you should follow:

  1. Deploy the destination cluster (ScyllaDB, with the proper schema/data modeling)
  2. Live traffic: Enable dual writes
  3. Historical data: forklift/migration
  4. Verification period: dual reads (source is still the source of truth)
  5. Cutover to the destination cluster (as the source of truth)
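
Steps 2 and 4 can be sketched roughly as below. This is only an illustration of the idea, not part of any ScyllaDB tool: the DualStore class is hypothetical, and plain Python dicts stand in for the real source and destination clients.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DualStore:
    """Hypothetical wrapper; dicts stand in for the two clusters."""
    source: Dict[str, str] = field(default_factory=dict)       # e.g. Cassandra
    destination: Dict[str, str] = field(default_factory=dict)  # e.g. ScyllaDB
    mismatches: List[str] = field(default_factory=list)

    def write(self, key: str, value: str) -> None:
        # Step 2: every live write goes to both clusters.
        self.source[key] = value
        self.destination[key] = value

    def read(self, key: str) -> Optional[str]:
        # Step 4: read from both sides, record any difference, and keep
        # answering from the source (it is still the source of truth).
        src = self.source.get(key)
        dst = self.destination.get(key)
        if src != dst:
            self.mismatches.append(key)
        return src


store = DualStore(source={"old": "historical"})  # row not yet migrated
store.write("new", "live")
store.read("new")  # identical on both sides
store.read("old")  # only in the source until the forklift copies it
print(store.mismatches)  # ['old']
```

Keys that show up in mismatches during the verification period point at rows the historical migration has not copied yet, or copied incorrectly. In a real application you would also have to decide how to handle a write that succeeds on one cluster and fails on the other.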

Now let’s talk about historical data migration.

The following tools can be used to migrate historical data from Cassandra:

  • sstableloader (be sure to use the sstableloader that ships with Scylla, not the Cassandra one)
  • CQLSH COPY TO / FROM (export to CSV and import; mostly suited to up to about 2M rows)
  • dsbulk, a tool similar to COPY TO / FROM that has some optimizations COPY TO / FROM doesn’t have
  • Copying the sstable files over and running nodetool refresh
  • Scylla Migrator (Spark-based) - full-scan CQL reads on the source and CQL writes to the destination.

The following tools can be used to migrate historical data from DynamoDB to ScyllaDB (CQL API / DynamoDB compatible API, a.k.a Alternator):

To migrate from MongoDB, you’ll first need to design your schema / data model. A nice tool that can help you with that is Hackolade.

As for data migration tools:

  • Export data to CSV and import to Scylla
  • Scylla Migrator
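
For the CSV route, the import side could look something like the sketch below. It assumes a session object that offers prepare(cql) and execute(statement, parameters), as the Python driver's Session does. The import_csv helper, the RecordingSession stand-in and the ks.users table are all made up for illustration, so the example runs without a cluster.

```python
import csv
import io
from typing import Callable, Iterable, Sequence


def import_csv(session, csv_file: Iterable[str], insert_cql: str,
               convert: Callable[[dict], Sequence]) -> int:
    """Insert each CSV row via a prepared statement; return the row count."""
    prepared = session.prepare(insert_cql)
    count = 0
    for row in csv.DictReader(csv_file):
        # CSV fields are always strings; convert() maps them to column types.
        session.execute(prepared, convert(row))
        count += 1
    return count


class RecordingSession:
    """Stand-in for a real driver session (assumption, for the demo only)."""

    def __init__(self):
        self.executed = []

    def prepare(self, cql):
        return cql

    def execute(self, statement, parameters):
        self.executed.append((statement, tuple(parameters)))


data = io.StringIO("id,name\n1,alice\n2,bob\n")
session = RecordingSession()
n = import_csv(
    session,
    data,
    "INSERT INTO ks.users (id, name) VALUES (?, ?)",  # hypothetical table
    lambda r: (int(r["id"]), r["name"]),
)
print(n)  # 2
```

A row-by-row loop like this is only reasonable for small data sets; for anything larger, a bulk tool such as Scylla Migrator is likely the better fit.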

Happy Migration!

Thanks @Tomersan! Very helpful.
I’ll add that if you’re doing a migration, get in touch with us, we’re happy to help.
Also, in the upcoming LIVE event, we’ll have a dedicated session on migrating from DynamoDB.