In our use case, we’d like to fetch data from ScyllaDB and load it into Elasticsearch. Fetching records one by one takes too much time.
I couldn’t find an equivalent of a binlog for ScyllaDB.
What’s the right way to do this?
And if you want to read everything on top of the live additions you get through CDC, you can write a simple Scala Spark application that loads everything needing full-text search from Scylla into Elasticsearch. Sample apps are available online, or have a look at the series of blog posts around the Scylla Migrator, which explain how to properly leverage DataFrames.
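A minimal sketch of such a bulk load, assuming the spark-cassandra-connector and elasticsearch-spark libraries are on the classpath; the host names, keyspace, table, columns, and index name are all hypothetical placeholders:

```scala
import org.apache.spark.sql.SparkSession
import org.elasticsearch.spark.sql._

object ScyllaToElastic {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("scylla-to-elastic")
      .config("spark.cassandra.connection.host", "scylla-node1") // hypothetical Scylla contact point
      .config("es.nodes", "elastic-node1")                       // hypothetical Elasticsearch node
      .getOrCreate()

    // Read the whole table as a DataFrame; the connector splits the scan
    // by token range, so it is parallelized across executors.
    val df = spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "my_keyspace", "table" -> "my_table")) // hypothetical names
      .load()

    // Ship only the columns that actually need full-text search.
    df.select("id", "title", "body") // hypothetical columns
      .saveToEs("my_index")          // hypothetical Elasticsearch index

    spark.stop()
  }
}
```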
FWIW, Scylla supports the LIKE operator, in case a simple search will cut it for you (assuming your partitions are not huge), as an alternative to the Lucene query language and inverted indexes that Elasticsearch uses.
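For illustration, a small sketch of such a LIKE query issued from Scala via the DataStax Java driver; the keyspace, table, and column names are hypothetical, and note that LIKE with ALLOW FILTERING scans the rows it filters, so it is only cheap when the data being filtered is small:

```scala
import com.datastax.oss.driver.api.core.CqlSession
import scala.jdk.CollectionConverters._

object LikeQuery {
  def main(args: Array[String]): Unit = {
    // Connects to localhost:9042 by default; configure contact points as needed.
    val session = CqlSession.builder().build()

    // Server-side substring match; ALLOW FILTERING is required here.
    val rows = session.execute(
      "SELECT id, title FROM my_keyspace.my_table WHERE title LIKE '%search term%' ALLOW FILTERING"
    )
    rows.asScala.foreach(row => println(row.getString("title")))

    session.close()
  }
}
```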
Some useful links:
Not sure how useful this will be:
*The answer was provided on Stack Overflow by Lubos.*