Are BATCH insert/update into different tables logged + "safe"?

Hi, if I e.g. write to 2 or 3 different tables as part of a batch, are there any benefits or guarantees compared to doing the writes separately (not batched)?

The documentation mentions isolation/atomicity is only guaranteed within a single partition → of one table…?

By default, Scylla uses a batch log to ensure all operations in a batch eventually complete or none will (note, however, that operations are only isolated within a single partition).

Thanks,
Hartmut

The isolation/atomicity guarantees that a batch statement provides are only valid within a single partition of a single table. If you batch updates to multiple partitions or multiple tables, there are no such guarantees, and you are better off doing separate writes.

Thanks for your reply @Botond_Denes.
Could you please elaborate? Does BATCH in Scylla differ from Cassandra?
According to the docs, single partition batches are isolated.
If multiple partitions/tables are involved atomicity should be guaranteed via the LOGGED mechanism?

For multiple partition batches, logging ensures that all DML statements are applied. Either all or none of the batch operations will succeed, ensuring atomicity. Batch isolation occurs only if the batch operation is writing to a single partition.

I don’t see how any guarantees could be provided when multiple partitions are involved, beyond the built-in retry mechanism provided by the logged batch variant. ScyllaDB does not have an undo mechanism, so some writes that are part of a batch could succeed while others do not.
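To make the "retry without undo" point concrete, here is a toy in-memory sketch in Python. It is not ScyllaDB's implementation; `ToyStore`, `apply_batch`, the table names, and the injected timeout are all made up for illustration. It only shows that a multi-partition batch can be partially visible for a while, and that a retry replays the failed writes rather than rolling back the ones that succeeded.

```python
# Toy in-memory model, NOT ScyllaDB's implementation: it only illustrates
# "retry without undo" for a batch that spans several partitions/tables.

class ToyStore:
    def __init__(self, fail_on=()):
        self.rows = {}               # (table, partition_key) -> value
        self.fail_on = set(fail_on)  # targets whose next write fails once

    def write(self, table, pk, value):
        target = (table, pk)
        if target in self.fail_on:
            self.fail_on.discard(target)  # fail once, succeed on retry
            raise TimeoutError(f"write to {target} timed out")
        self.rows[target] = value


def apply_batch(store, writes):
    """Apply each write independently; return the writes that failed."""
    failed = []
    for table, pk, value in writes:
        try:
            store.write(table, pk, value)
        except TimeoutError:
            failed.append((table, pk, value))
    return failed


# Hypothetical batch touching three different tables.
writes = [
    ("users", "u1", "alice"),
    ("orders", "o7", "pending"),
    ("audit", "a3", "created"),
]
store = ToyStore(fail_on=[("orders", "o7")])

pending = apply_batch(store, writes)
partial_state = dict(store.rows)  # what a reader could observe right now

# The retry step: failed writes are replayed; nothing is rolled back.
pending = apply_batch(store, pending)
final_state = dict(store.rows)
```

In this model, `partial_state` contains the `users` and `audit` writes but not the `orders` one, which is exactly the window in which a multi-partition batch is not isolated.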

Do not mistake batches for transactions. They are not. There are benefits to batching when writes target a single partition, as ScyllaDB will merge these writes into a single mutation and they will be applied internally as a single write. This does have benefits and additional guarantees. But as soon as multiple partitions or multiple tables are involved, a batch is nothing more than a mechanism to send multiple separate writes in a single message. Logged variants come with built-in retry, but that is pretty much it as far as I know. Performance-wise, batches have no benefits; on the contrary, separate writes are preferred if you want high performance.
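One way to act on this on the client side is to batch only the writes that land in the same partition of the same table and send everything else as plain statements. The Python sketch below only builds CQL text; the `events` and `users` tables, their columns, and the helper names are assumptions for illustration, and a real application would use prepared statements through its driver instead of string literals.

```python
from collections import defaultdict


def group_by_partition(writes):
    """Group (table, partition_key, cql) tuples by their target partition."""
    groups = defaultdict(list)
    for table, partition_key, cql in writes:
        groups[(table, partition_key)].append(cql)
    return dict(groups)


def to_cql(statements):
    """Render one partition's statements: a lone write stays a plain
    statement, several writes to the same partition become one batch."""
    if len(statements) == 1:
        return statements[0] + ";"
    body = "\n".join(f"  {s};" for s in statements)
    return f"BEGIN BATCH\n{body}\nAPPLY BATCH;"


# Hypothetical writes: two rows in the same `events` partition, plus one
# update to a different table.
writes = [
    ("events", "u1",
     "INSERT INTO events (user_id, ts, kind) VALUES ('u1', 1, 'login')"),
    ("events", "u1",
     "INSERT INTO events (user_id, ts, kind) VALUES ('u1', 2, 'click')"),
    ("users", "u1",
     "UPDATE users SET last_seen = 2 WHERE user_id = 'u1'"),
]

requests = [to_cql(stmts) for stmts in group_by_partition(writes).values()]
for request in requests:
    print(request)
```

Here the two `events` inserts end up in one single-partition batch, while the `users` update is sent as its own separate write.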

Thanks for confirming. I always found the docs (both Cassandra and Scylla) to be quite ambiguous on the matter…

Indeed, we now have a tracking issue for this: "Docs: ScyllaDB Multi-Table BATCH behavior" (Issue #15688 in scylladb/scylladb on GitHub) :wink:
