How to bulk fetch data from ScyllaDB?

In our use case, we’d like to fetch data from ScyllaDB and put it into Elasticsearch. If we take records one by one, it takes too much time.
I couldn’t find a ScyllaDB binlog.
What’s the right way to do this?

*The question was asked on Stack Overflow by tianzhenjiu*

You might want to look at using Change Data Capture in Scylla, then using the CDC tables to feed a Kafka topic that will populate Elasticsearch.

ScyllaDB’s CDC connector for Kafka is built on Debezium. You can read more about it here.
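As a sketch of what such a pipeline might look like, here are two Kafka Connect configurations: the Scylla CDC source connector streams changes into a Kafka topic, and the Elasticsearch sink connector indexes that topic. All hostnames, keyspace/table names, and topic names below are placeholders, and the property names should be verified against the current connector documentation:

```properties
# Source: Scylla CDC -> Kafka (Debezium-based scylla-cdc-source-connector)
name=scylla-cdc-source
connector.class=com.scylladb.cdc.debezium.connector.ScyllaConnector
scylla.cluster.ip.addresses=scylla-host:9042
scylla.name=scylla-cluster
scylla.table.names=my_keyspace.my_table

# Sink: Kafka -> Elasticsearch
name=elasticsearch-sink
connector.class=io.confluent.connect.elasticsearch.ElasticsearchSinkConnector
topics=scylla-cluster.my_keyspace.my_table
connection.url=http://elastic-host:9200
key.ignore=true
```

With Debezium-style connectors, the topic name is typically derived from the logical cluster name plus the keyspace and table, which is why the sink’s `topics` setting mirrors the source’s `scylla.name` and `scylla.table.names`.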

*The answer was provided on Stack Overflow by Peter Corless*

And if you want to backfill everything in addition to the live changes captured via CDC, you can write a small Scala Spark application that loads everything needing full-text search from Scylla into Elasticsearch (sample apps are available online, or have a look at the series of blog posts around the Scylla Migrator, which explain how to properly leverage DataFrames).
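Whether you use Spark or a plain driver, a bulk full-table read boils down to splitting the token ring into subranges and scanning them in parallel, which is essentially what the Spark connector does under the hood. A minimal sketch of that splitting logic, assuming the default Murmur3 partitioner (the keyspace, table, and partition-key names are hypothetical):

```python
# Split the Murmur3 token ring into contiguous subranges for a parallel
# full-table scan. Table/column names (my_keyspace.my_table, pk) are
# placeholders -- substitute your own schema.

MIN_TOKEN = -2**63      # Murmur3 partitioner token range
MAX_TOKEN = 2**63 - 1

def token_subranges(splits):
    """Divide the full token ring into `splits` contiguous (start, end) ranges."""
    total = MAX_TOKEN - MIN_TOKEN + 1
    step = total // splits
    ranges = []
    start = MIN_TOKEN
    for i in range(splits):
        end = MAX_TOKEN if i == splits - 1 else start + step - 1
        ranges.append((start, end))
        start = end + 1
    return ranges

def scan_queries(splits):
    """One CQL statement per subrange; execute these concurrently against Scylla."""
    return [
        f"SELECT * FROM my_keyspace.my_table "
        f"WHERE token(pk) >= {lo} AND token(pk) <= {hi}"
        for lo, hi in token_subranges(splits)
    ]
```

Each range query hits a bounded slice of the cluster, so you can run many of them concurrently (ideally aligned with the cluster's actual token ownership) and feed the results to Elasticsearch's bulk API.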

FWIW, Scylla supports the LIKE operator, in case a simple substring search will cut it for you (assuming your partitions are not huge), as opposed to the Lucene query language and the inverted indexes Elastic uses.
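For example, against a hypothetical table (note that LIKE in Scylla is case-sensitive, applies to text columns, and a query on a non-indexed column needs ALLOW FILTERING, which scans data and is best reserved for small tables or partition-restricted queries):

```sql
SELECT id, title
FROM my_keyspace.articles
WHERE title LIKE '%scylla%' ALLOW FILTERING;
```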

*The answer was provided on Stack Overflow by Lubos*