tzach
February 4, 2024, 6:49pm
1
The ScyllaDB team announces ScyllaDB Open Source 5.4.2, a bugfix release of the ScyllaDB 5.4 stable branch. ScyllaDB Open Source 5.4.2, like all past and future 5.x.y releases, is backward compatible and supports rolling upgrades.
Users are encouraged to upgrade to 5.4.2.
Issues fixed in this release:
Performance: Bloom filter efficiency can be reduced after node operation. When writing an sstable, ScyllaDB estimates how many partitions it will have in order to size the bloom filter correctly. In some cases, the estimation was suboptimal for TWCS. #15704
Stability: commitlog replay can cause an abort due to an over-extended skip. During commitlog replay, ScyllaDB skips over corrupted sections. However, if the corrupted section also has a corrupt size, the skip can lead to a crash. #15269
Stability: compaction_manager::perform_cleanup does not handle condition_variable_timed_out, which may cause nodetool cleanup to fail with exit status 2. #15669
Stability: nodes crashing during repair operations (due to no reader-closing on unexpected exception) #16606
Stability: tasks: dangling reference to task’s child pointer #16380
Stability: tombstone might not be garbage-collected due to conflicts with data in commitlog. #15777
Correctness: a very rare bug in row cache might return a wrong value #15483
Java Tools: nodetool fails with stderr: error: ‘java.lang.Object com.google.common.base.Objects.firstNonNull(java.lang.Object, java.lang.Object)’. The root cause is an API update in a third-party package, io.airlift.airline. scylla-tools-java#374
Java Tools: update “guava” package to 32.1.3-jre scylla-tools-java#365
Java Tools: Upgrade scylla-driver-core from 3.11.2.5 to 3.11.5.1 scylla-tools-java#343
Java Tools: Use newer hk2-locator in order to get rid of javassist scylla-jmx#231
Tools: scylla nodetool crashes if invoked without further args #16451
Tools: scylla-sstable tool crash due to unclosed reader in tools/schema_loader.cc #16519
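To illustrate why the partition estimate in the Bloom filter item (#15704) matters, here is a minimal sketch using the textbook Bloom filter sizing formulas. It is not ScyllaDB's implementation; `bloom_params`, `size_bloom_filter`, and `expected_fp_rate` are hypothetical names made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper types, not ScyllaDB code: textbook Bloom filter sizing.
struct bloom_params {
    std::uint64_t bits;  // total filter size in bits (m)
    unsigned hashes;     // number of hash functions (k)
};

// Size a filter for an estimated number of keys (n) and a target
// false-positive rate (p): m = -n * ln(p) / (ln 2)^2, k = (m / n) * ln 2.
bloom_params size_bloom_filter(std::uint64_t estimated_partitions, double target_fp_rate) {
    assert(estimated_partitions > 0);
    assert(target_fp_rate > 0.0 && target_fp_rate < 1.0);
    const double ln2 = std::log(2.0);
    const double n = static_cast<double>(estimated_partitions);
    const double m = std::ceil(-n * std::log(target_fp_rate) / (ln2 * ln2));
    const double k = std::max(1.0, std::round(m / n * ln2));
    return {static_cast<std::uint64_t>(m), static_cast<unsigned>(k)};
}

// Approximate false-positive rate once the filter actually holds
// `actual_partitions` keys: (1 - e^(-k * n / m))^k.
double expected_fp_rate(const bloom_params& p, std::uint64_t actual_partitions) {
    const double k = static_cast<double>(p.hashes);
    const double exponent = -k * static_cast<double>(actual_partitions)
                            / static_cast<double>(p.bits);
    return std::pow(1.0 - std::exp(exponent), k);
}
```

If the estimate is far below the real partition count, `expected_fp_rate` climbs toward 1 and the filter stops saving disk reads; if it is far above, the filter wastes memory. That is the general trade-off behind sizing the filter from an estimate.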
Does this include the fix for this issue:
opened 11:12AM - 12 Jan 24 UTC
closed 12:28PM - 04 Feb 24 UTC
symptom/data consistency
P1
status/release blocker
backport/5.2
backport/5.4
*Installation details*
Scylla version: 5.4.1
Cluster size: 1 Node
Platform: Docker on Kubernetes
After upgrading Scylla from 5.2.11 via 5.4.0 to 5.4.1, we started observing missing data in Scylla. Every once in a while, an `INSERT` is committed successfully but the inserted values are not visible until Scylla is restarted. As far as we can tell, version 5.2.11 was not affected while 5.4.1 is. 5.4.0 was not in operation long enough to make a reliable statement.
The suspected bug occurs extremely rarely, making it hard for us to reproduce. Out of ~800M inserts per day, only a dozen are affected. In our running environment, I was able to perform the following steps:
1. Execute the normal `INSERT` workload using multiple concurrent clients.
2. Wait until a client notices that a `SELECT` on a previously `INSERT`ed key returns no columns.
3. Shut down all clients to make sure there is no unnecessary load on our ScyllaDB.
4. (optional) wait as long as you want for ScyllaDB to (process backlog/perform compactions/achieve consistency/whatever...)
5. Execute query: `SELECT * FROM tenant_6e10XXXX_XXXX_XXXX_XXXX_XXXXXXXXXXXX.edgestore WHERE key = <affected_key>`. The result is empty.
6. Restart ScyllaDB and wait until it is operational.
7. Execute the same query again and notice the `INSERT`ed data suddenly became available:
|key|column1|value|
|--|--|--|
|<affected_key>|0xXX|0xXXXXXXXXXXXXXXXX|
|<affected_key>|0xXX|0xXXXXXXXXXXXXXXXXXXXX|
|<affected_key>|0xXXXX|0xXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX|
|<affected_key>|0xXXXX|0xXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX|
|<affected_key>|0xXXXX|0xXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX|
|<affected_key>|0xXXXX|0xXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX|
|<affected_key>|0xXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX|0xXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX|
`X` obviously masks potentially private hexadecimal data which is not relevant to this report.
What I already tried instead of restarting Scylla:
- `nodetool refresh`
- `nodetool flush`
- `nodetool rebuild` and `nodetool repair` even though I'm aware both shouldn't make any difference on a single node cluster.
It looks to me like Scylla has somehow recognized the transaction as completed even though it has not been persisted as expected. There seems to be some procedure, triggered either by the shutdown or by the startup, that picks this transaction up and replays it so that its modifications actually become available.
**Appendix**: The Scylla [logs](https://github.com/scylladb/scylladb/files/13918121/logs-without-compactions.txt) captured during this operation. I removed all compaction logs to reduce the file to a reasonable size.
thx
tzach
February 5, 2024, 8:01am
3
No. The commit is backported to 5.4 and will be part of the next 5.4 patch release (5.4.3).
committed 12:46PM - 04 Feb 24 UTC
Commit e81fc1f095f0d39f4bbc6503960450b7fda3ba1b accidentally broke the control
flow of row_cache::do_update().
Before that commit, the body of the loop was wrapped in a lambda.
Thus, to break out of the loop, `return` was used.
The bad commit removed the lambda, but didn't update the `return` accordingly.
Thus, since the commit, the statement doesn't just break out of the loop as
intended, but also skips the code after the loop, which updates `_prev_snapshot_pos`
to reflect the work done by the loop.
As a result, whenever `apply_to_incomplete()` (the `updater`) is preempted,
`do_update()` fails to update `_prev_snapshot_pos`. It remains in a
stale state, until `do_update()` runs again and either finishes or
is preempted outside of `updater`.
If we read a partition processed by `do_update()` but not covered by
`_prev_snapshot_pos`, we will read stale data (from the previous snapshot),
which will be remembered in the cache as the current data.
This results in outdated data being returned by the replica.
(And perhaps in something worse if range tombstones are involved.
I didn't investigate this possibility in depth).
Note: for queries with CL>1, occurrences of this bug are likely to be hidden
by reconciliation, because the reconciled query will only see stale data if
the queried partition is affected by the bug on *all* queried replicas
at the time of the query.
Fixes #16759
Closes scylladb/scylladb#17138
(cherry picked from commit ed98102c45a522393cd3bf478a0b6a712d192167)
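The control-flow mistake described in the commit message can be reproduced in a few lines. The following is a simplified sketch, not the actual `row_cache::do_update()` code: the `progress` struct and the three functions are hypothetical, `recorded_pos` merely stands in for `_prev_snapshot_pos`, and `preempt_at` simulates the point where the updater is preempted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the state the real function maintains.
struct progress {
    std::size_t processed = 0;     // items handled by the loop
    std::size_t recorded_pos = 0;  // plays the role of _prev_snapshot_pos
};

// Before the bad commit: the loop body lives in a lambda, so `return`
// only leaves the lambda and the code after the loop still runs.
progress update_with_lambda(const std::vector<int>& items, std::size_t preempt_at) {
    progress p;
    bool stop = false;
    while (p.processed < items.size() && !stop) {
        [&] {
            if (p.processed == preempt_at) {
                stop = true;
                return;  // exits the lambda only
            }
            ++p.processed;
        }();
    }
    p.recorded_pos = p.processed;  // always reached
    return p;
}

// After the bad commit: the lambda is gone but the `return` stayed.
progress update_buggy(const std::vector<int>& items, std::size_t preempt_at) {
    progress p;
    while (p.processed < items.size()) {
        if (p.processed == preempt_at) {
            return p;  // BUG: leaves the function, skipping the update below
        }
        ++p.processed;
    }
    p.recorded_pos = p.processed;
    return p;
}

// The kind of fix the commit message implies: leave only the loop.
progress update_fixed(const std::vector<int>& items, std::size_t preempt_at) {
    progress p;
    while (p.processed < items.size()) {
        if (p.processed == preempt_at) {
            break;  // exits the loop; bookkeeping below still runs
        }
        ++p.processed;
    }
    p.recorded_pos = p.processed;
    assert(p.recorded_pos == p.processed);  // position reflects the work done
    return p;
}
```

In `update_buggy`, a simulated preemption leaves `recorded_pos` stale even though items were processed, which mirrors how a stale `_prev_snapshot_pos` lets reads fall back to the previous snapshot.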