Hi Community,
I’m using ScyllaDB 6.1 Open Source and have a table configured to store 30 days of data with the following compaction strategy:
compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_size': '3',
    'compaction_window_unit': 'DAYS'
}
Currently, no reads are performed on this table—only data inserts.
Here’s what I observed:
- Before today (January 7), the table had SSTables grouped in 3-day windows, such as (Dec 23, Dec 25, Dec 28, Dec 31, Jan 3).
- Today (Jan 7), after an autocompaction ran, I noticed that it created a new SSTable containing only data for Jan 7. The previous 3-day window grouping seems to have changed unexpectedly.
I haven’t manually triggered any compaction apart from observing this autocompaction behavior.
Does anyone have insights into why this might have happened? Could it be related to some internal behavior or a specific setting I might have missed?
Any help or suggestions would be greatly appreciated!
I posted this question a few days ago, but I haven’t received a response yet. I wanted to provide some additional details about the issue I’m facing.
To recap:
I’m using ScyllaDB 6.1 Open Source and have a table configured to store 30 days of data with the following compaction strategy:
compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_size': '3',
    'compaction_window_unit': 'DAYS',
    'max_threshold': '32',
    'min_threshold': '4'
}
Observations
- Pre-7 Jan Behavior: The table had SSTables grouped in 3-day windows, such as (Dec 23, Dec 25, Dec 28, Dec 31, Jan 3).
- On 7 Jan: After an autocompaction, a new SSTable was created for Jan 7 only, deviating from the expected 3-day grouping behavior.
Additional Issue
Upon further investigation, I noticed that instead of forming a single large SSTable for each 3-day window, there are multiple small SSTables within each window. These smaller SSTables are not being compacted into a single SSTable as I expected, even though the compaction strategy specifies min_threshold = 4 and max_threshold = 32.
Questions
- Why are these smaller SSTables within the same 3-day window not being compacted into one large SSTable?
- Are there specific conditions under which TimeWindowCompactionStrategy avoids compacting SSTables, even when the min_threshold and max_threshold values should allow it?
- Could there be an issue with autocompaction triggering or the way TimeWindowCompactionStrategy handles compaction windows for insert-only workloads?
I’d appreciate any insights or suggestions to troubleshoot and resolve this issue.
Thank you in advance!
TWCS only compacts expired windows. The current (active) 3-day window is not compacted by default. Your original post is from Jan 8 and you reference files from Jan 7, so this looks like expected behaviour.
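To check which window a given date falls into, here is a minimal Python sketch. It assumes windows are aligned to the Unix epoch in multiples of the window size (as in the Cassandra TWCS implementation) and that the dates in the question are in January 2025; both are assumptions, and `window_bounds` is a hypothetical helper, not a ScyllaDB API.

```python
from datetime import datetime, timedelta, timezone


def window_bounds(ts, window_days=3):
    """Return (start, end) of the epoch-aligned window containing ts.

    Assumption: windows start at multiples of window_days counted
    from the Unix epoch (1970-01-01 00:00 UTC). end is exclusive.
    """
    window_seconds = window_days * 86400
    epoch_seconds = int(ts.timestamp())
    start = epoch_seconds - (epoch_seconds % window_seconds)
    start_dt = datetime.fromtimestamp(start, tz=timezone.utc)
    return start_dt, start_dt + timedelta(days=window_days)


# Assumed year: 2025 (the thread does not state it).
start, end = window_bounds(datetime(2025, 1, 7, 12, 0, tzinfo=timezone.utc))
print(start.date(), end.date())  # prints: 2025-01-06 2025-01-09
```

Under these assumptions, Jan 7 belongs to the window that starts on Jan 6 and ends before Jan 9, so it was still the active window on Jan 8. The same arithmetic gives window starts of Dec 25, Dec 28, Dec 31 and Jan 3, which matches the grouping listed in the question.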
Furthermore:
- If you’re seeing multiple SSTables in the current window and you’re expecting them to compact: they won’t be compacted yet.
- The compaction is delayed until the window is considered “cold” (i.e., new writes are no longer hitting it).
- If your write volume is too low to generate 4 SSTables per window (or it’s spread across the window unevenly), compaction won’t be triggered.
- For instance, if you get only 2-3 small SSTables in a 3-day window, TWCS won’t compact them unless they hit the min_threshold.
- In that case, small files pile up.
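To see which windows fall short of min_threshold, a rough Python sketch follows. It applies the rule as described in this reply, groups SSTables by epoch-aligned window (an assumption), and takes a hypothetical list of per-SSTable maximum write timestamps that you would collect yourself from SSTable metadata. The function names and the sample timestamps are illustrative only.

```python
from collections import Counter
from datetime import datetime, timezone


def window_start(ts, window_days=3):
    """Epoch seconds of the start of the window containing ts.

    Assumption: windows are aligned to the Unix epoch.
    """
    window_seconds = window_days * 86400
    epoch_seconds = int(ts.timestamp())
    return epoch_seconds - (epoch_seconds % window_seconds)


def windows_below_threshold(max_timestamps, window_days=3, min_threshold=4):
    """Map window start date -> SSTable count, for windows under min_threshold."""
    counts = Counter(window_start(ts, window_days) for ts in max_timestamps)
    return {
        datetime.fromtimestamp(start, tz=timezone.utc).date(): n
        for start, n in sorted(counts.items())
        if n < min_threshold
    }


# Hypothetical input: one max write timestamp per SSTable (year assumed).
utc = timezone.utc
sstable_max_timestamps = [
    datetime(2025, 1, 3, 8, 0, tzinfo=utc),
    datetime(2025, 1, 5, 20, 0, tzinfo=utc),
    datetime(2025, 1, 6, 1, 0, tzinfo=utc),
    datetime(2025, 1, 6, 13, 0, tzinfo=utc),
    datetime(2025, 1, 7, 2, 0, tzinfo=utc),
    datetime(2025, 1, 7, 15, 0, tzinfo=utc),
]
print(windows_below_threshold(sstable_max_timestamps))
```

With this sample input, only the window starting Jan 3 is reported, since it holds 2 SSTables while the window starting Jan 6 holds 4.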