Originally from the User Slack
@Terence_Liu: I am experimenting with batched prepared statements with the Rust driver. Seeing
Error: Invalid message: Frame error: Could not serialize frame: Length of provided values must be equal to number of batch statements (got 100 values, 200 statements)
when the number of statements is greater than the number of values. However, when there are fewer statements than values, it silently goes through. Is that the expected behavior?
Figure I should post in the main channel, because this should be a server-side question.
@Felipe_Cardeneti_Mendes: did you see https://rust-driver.docs.scylladb.com/stable/queries/batch.html#batch-values ?
Length of batch values must be equal to the number of statements in a batch.
Each statement must have its values specified, even if they are empty.
See the boilerplate example on how to specify empty values.
That said, I would expect the latter case (fewer statements than values) to fail as well. Hopefully nothing weird happens; please open an issue at https://github.com/scylladb/scylla-rust-driver/ accordingly.
@Terence_Liu: Yes. Although my use case is different. I was trying to bulk ingest with the same statement. So I went like this:
let N = 100;
let mut batch = Batch::default();
for _ in 0..N {
    batch.append_statement("INSERT INTO metadata.metadata (id, data) VALUES(?, ?)");
}
let mut prepared_batch = session.prepare_batch(&batch).await?;
for chunk in &db_iter.into_iter().chunks(N) {
    ...
    let mut values = vec![];
    let mut c = 0;
    for item in chunk {
        values.push((id_str.to_owned(), data));
        c += 1;
    }
    if c != N {
        // init new prepared_batch to size
        batch = Batch::default();
        for _ in 0..c {
            batch.append_statement(
                "INSERT INTO metadata.metadata (id, data) VALUES(?, ?)",
            );
        }
        prepared_batch = session.prepare_batch(&batch).await?;
    }
    session.batch(&prepared_batch, values).await?;
}
It’s somewhat awkward, but my understanding is that you need to repeat the statement 100 times in a batch, even though the copies are identical, and that the number of values must match the number of statements in the batch. So I had to handle the last batch, which may be smaller than the step size.
I found out about the issue when I tried to play around and commented out the initial
// for _ in 0..N {
// batch.append_statement("INSERT INTO metadata.metadata (id, data) VALUES(?, ?)");
// }
basically not filling the statements batch. But the code still ran. Not sure if it actually inserted anything. Checking.
Yeah, I looked at the count through cqlsh, and the ingested count was 0
If I only have 1 statement in the batch, and 100 values, it would just ingest 1 single entry.
So maybe the client is doing an iterator zip and trimming the extra values.
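The zip hypothesis can be illustrated without the driver. What follows is a minimal sketch using only the standard library; `pair_up`, `check_lengths` and `sample_values` are hypothetical helpers, not part of the scylla crate, and the driver's actual internals may differ. It shows that `Iterator::zip` stops at the shorter input, which would be consistent with the observed counts (0 rows with no statements, 1 row with one statement), and how a length check before sending would surface a mismatch in either direction.

```rust
/// Hypothetical model of pairing statements with value rows via `zip`.
/// `zip` stops as soon as either side is exhausted, so surplus items on
/// the longer side are silently ignored.
fn pair_up<'a>(
    statements: &[&'a str],
    values: &'a [(String, String)],
) -> Vec<(&'a str, &'a (String, String))> {
    statements.iter().copied().zip(values.iter()).collect()
}

/// Hypothetical guard: reject any length mismatch before sending,
/// whichever side is longer.
fn check_lengths(n_statements: usize, n_values: usize) -> Result<(), String> {
    if n_statements == n_values {
        Ok(())
    } else {
        Err(format!(
            "got {} values, {} statements",
            n_values, n_statements
        ))
    }
}

/// Builds `n` placeholder (id, data) rows for the demonstration.
fn sample_values(n: usize) -> Vec<(String, String)> {
    (0..n)
        .map(|i| (format!("id-{}", i), format!("data-{}", i)))
        .collect()
}
```

With 100 rows from `sample_values(100)`, `pair_up` yields no pairs for an empty statement list and exactly one pair for a single statement, while `check_lengths(1, 100)` returns an error instead of letting the extra 99 rows disappear.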
@Felipe_Cardeneti_Mendes: right, that’s what I meant when I asked you to open an issue. I just tried it and see the same. Which is “kinda” fine, given that other drivers (e.g. Python) also require you to append the statement and its values in one shot. But I see where the confusion comes from. I think you raised a good point: silently ingesting only as many rows as there are bound statements and dropping the remaining values is prone to misconceptions.
so either way, just append and push values as you go.
@Terence_Liu: I’ll open an issue.
> so either way, just append and push values as you go.
Do you mean create a prepared batch, prepare the batch, and add values every batch cycle? That would repeatedly prepare the same batch of identical statements (other than the last irregular batch), right?
@Felipe_Cardeneti_Mendes: start the batch, iterate. push statement+values.
when you hit your condition, apply the batch, start a new one.
you also have new_with_statements
if you know beforehand how much you will need
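The "push statement+values, apply, start a new one" flow described above could look roughly like the sketch below. It uses only the standard library: `PendingBatch` and `ingest` are hypothetical stand-ins rather than driver types, and the `apply` callback marks the point where a real program would presumably call `session.batch(...)`. Because each statement is pushed together with its values, the two counts cannot drift apart, and the final short batch needs no special handling.

```rust
/// Hypothetical stand-in for a batch: every appended statement carries
/// its own values, so statement and value counts always match.
#[derive(Default)]
struct PendingBatch {
    entries: Vec<(&'static str, (String, String))>,
}

impl PendingBatch {
    fn push(&mut self, statement: &'static str, values: (String, String)) {
        self.entries.push((statement, values));
    }

    fn len(&self) -> usize {
        self.entries.len()
    }

    fn is_empty(&self) -> bool {
        self.entries.is_empty()
    }
}

/// Feeds `rows` into batches of at most `max_size` entries and hands each
/// full batch, plus the final partial one, to `apply`.
/// Returns the number of batches applied.
fn ingest<I, F>(rows: I, max_size: usize, mut apply: F) -> usize
where
    I: IntoIterator<Item = (String, String)>,
    F: FnMut(PendingBatch),
{
    assert!(max_size > 0, "max_size must be positive");
    const INSERT: &str = "INSERT INTO metadata.metadata (id, data) VALUES(?, ?)";

    let mut applied = 0;
    let mut batch = PendingBatch::default();
    for row in rows {
        batch.push(INSERT, row);
        if batch.len() == max_size {
            // Condition hit: apply this batch and start a new, empty one.
            apply(std::mem::take(&mut batch));
            applied += 1;
        }
    }
    // Apply whatever is left over from the last, irregular chunk.
    if !batch.is_empty() {
        apply(batch);
        applied += 1;
    }
    applied
}
```

For 250 rows and a `max_size` of 100, `apply` is called three times with batches of 100, 100 and 50 entries.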
@Terence_Liu: Opened something https://github.com/scylladb/scylla-rust-driver/issues/1114. Hope I wrote it with better clarity.
GitHub: When N of batch of statements is smaller than N of values, session.batch() silently drops extra values. · Issue #1114 · scylladb/scylla-rust-driver
@Karol_Baryła: We’ll investigate the issue soon. One note regarding the code: it would be better to just prepare a statement once, and use it to create batches. Then you will have no need to call prepare_batch.
@Terence_Liu: Ohh, thank you. That improves the code a bit!
Cloning is OK, right?
let mut prepared_batch = Batch::default();
for _ in 0..N_BATCH {
    prepared_batch.append_statement(prepared_stmt.clone());
}
@Karol_Baryła: Yes, we even have a note about this in the documentation of PreparedStatement: https://docs.rs/scylla/latest/scylla/statement/prepared_statement/struct.PreparedStatement.html#clone-implementation
@Terence_Liu: cool, thanks