Hi!
My cluster uses Alternator and does not make any calls through cqlsh. I found the following error in the log of one of the nodes in the cluster:
Feb 04 09:37:39 Scylla-node-11 scylla[88024]: [shard 0] storage_proxy - Failed to apply mutation from 10.6.0.11#0: utils::internal::nested_exception<std::runtime_error> (Could not write mutation alternator_test_table:test_table (pk{0015717a72732d7173ff6c736574313034375f4834}) to commitlog): std::invalid_argument (Mutation of 36005749 bytes is too large for the maximum size of 16777216)
I found in the code that max_mutation_size is half of the commitlog segment size, which works out to 16M.
But what I wonder is how this 36M commitlog entry was generated. Writing an item that is too large directly through Alternator returns a 413 Request Body Too Large error.
So, besides commitlog writes during I/O, are there any other ways a commitlog entry can be generated?
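To make the numbers concrete, here is a small arithmetic check. It assumes a commitlog segment size of 32 MB (the value of commitlog_segment_size_in_mb is not stated in this post, so check it in your scylla.yaml); the two byte counts are copied from the log line above.

```python
# Assumption: commitlog_segment_size_in_mb = 32 (not stated in the post).
COMMITLOG_SEGMENT_SIZE_MB = 32

# Half of the segment size, in bytes.
max_mutation_size = COMMITLOG_SEGMENT_SIZE_MB * 1024 * 1024 // 2

# Size of the rejected mutation, taken from the error log.
rejected_mutation_size = 36005749

print(max_mutation_size)                           # 16777216, matches the log
print(rejected_mutation_size > max_mutation_size)  # True
print(round(rejected_mutation_size / max_mutation_size, 2))  # 2.15
```

So the rejected mutation is a bit more than twice the allowed maximum.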
It is possible to generate a too large mutation, even if the client protocol (CQL or Alternator) itself tries to guard against it: via read-modify-write.
I don’t know if Alternator has any such write paths. @nyh, do you know of any Alternator command that does such a read-modify-write?
Another explanation is a discrepancy between JSON (the serialization format used by Alternator) and the mutation format used internally: an item could still be small enough as JSON, yet be too large once re-serialized to frozen_mutation. This is hard to imagine, because JSON is a very bloated text-based serialization format, while our frozen_mutation format is a binary representation (although a bloated one, to be sure).
@bo_li you’re right that Alternator prevents you from creating a huge item via PutItem, because as you noted the size of the request itself is limited to content_length_limit = 16 * MB. However, as @Botond_Denes noted you can still create larger items incrementally, e.g., by setting additional attributes on an existing item, or by enlarging an existing attribute. For example consider a read-modify-write UpdateItem which asks to append an element to a list attribute. It’s a tiny request, but each time you run it the attribute grows a little bit longer, and the mutation that writes the new full value of the attribute (yes, that’s how it is done) grows larger and larger.
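A rough sketch of that growth pattern, in Python. The table, key and attribute names (test_table, pk, event_log) and the 1 KiB element size are made up for illustration. The request is only built as a dictionary in the shape a DynamoDB-style client would send, and nothing is sent to a server. The size estimate counts raw payload bytes only, so it ignores serialization overhead and should be read as an upper bound on how many appends it takes.

```python
import json

MAX_MUTATION_SIZE = 16 * 1024 * 1024  # 16777216 bytes, the limit shown in the error log


def build_append_request(table, key, attr, element):
    """Return UpdateItem parameters that append one element to a list attribute."""
    return {
        "TableName": table,
        "Key": key,
        "UpdateExpression": f"SET {attr} = list_append({attr}, :new)",
        "ExpressionAttributeValues": {":new": {"L": [element]}},
    }


def appends_until_too_large(element_size, limit=MAX_MUTATION_SIZE):
    """Count appends until the rewritten list value alone exceeds `limit`.

    Only payload bytes are counted; real serialization overhead would make
    the limit be hit somewhat sooner.
    """
    stored = 0
    appends = 0
    while stored <= limit:
        stored += element_size  # the full list is rewritten, one element longer
        appends += 1
    return appends


# Hypothetical names and sizes, for illustration only.
request = build_append_request(
    "test_table", {"pk": {"S": "example-key"}}, "event_log", {"S": "x" * 1024}
)
request_size = len(json.dumps(request))

print(request_size)                   # a bit over 1 KiB, far below the 16 MB request limit
print(appends_until_too_large(1024))  # 16385
```

Each request stays around 1 KiB, so it never gets near the request size limit, but after roughly sixteen thousand such appends the value being rewritten no longer fits into a 16 MB mutation.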
Do you know if your workload has such things - like appending things to existing lists, sets or maps or strings?