Major Compaction by Partition

We’re planning to run a major compaction in the next few days. We have some massive partitions that we are going to shrink, and we want to clean up soon afterwards. We believe these partitions are causing performance issues on the cluster.

On the nodetool compact doc page, I see there is a --partition <partition_key> option. The help text isn’t very helpful here, but does it mean we can combine the table’s partition key columns to specify which partition(s) should be compacted?

Our primary key is similar to this:

PRIMARY KEY ((foo, bar, baz), id1, id2)

So, based on the doc page, could we somehow combine foo, bar, and baz to compact one specific partition?

We’re also confused as to how that even works at the SSTable level, since there should be multiple partitions per SSTable file.
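To make that concern concrete, here is a toy Python model. It is an illustration only, not ScyllaDB's storage format or API: the SSTable names, the key values, and the sstables_containing helper are all hypothetical. It shows that a single partition key can appear in several SSTables, and that each of those SSTables also holds unrelated partitions.

```python
from typing import Dict, List, Set, Tuple

# A partition key modelled as (foo, bar, baz), mirroring the PRIMARY KEY above.
PartitionKey = Tuple[str, str, str]


def sstables_containing(
    sstables: Dict[str, List[PartitionKey]], key: PartitionKey
) -> List[str]:
    """Return the names of the (toy) SSTables that hold data for `key`."""
    return sorted(name for name, keys in sstables.items() if key in keys)


# Hypothetical layout: each SSTable holds several partitions.
sstables: Dict[str, List[PartitionKey]] = {
    "sstable-1": [("a", "b", "c"), ("x", "y", "z")],
    "sstable-2": [("a", "b", "c"), ("p", "q", "r")],
    "sstable-3": [("p", "q", "r")],
}

target: PartitionKey = ("a", "b", "c")
touched = sstables_containing(sstables, target)

# Other partitions that live in the same files as the target partition.
bystanders: Set[PartitionKey] = {
    k for name in touched for k in sstables[name]
} - {target}

print(touched)
print(sorted(bystanders))
```

In this toy layout, the one partition we care about is spread over two files, and both files also contain other partitions, which is why a per-partition compaction is hard to picture at the file level.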

Hah! Good find. We actually removed the --partition option from the docs last week, since it is unsupported. See the commit “nodetool compact: remove unsupported partition option” (scylladb/scylladb@70ba6b9).

I guess that answers your question :wink:

FYI @Anna @Botond_Denes

Yep, that does indeed answer all of my questions, thank you Felipe.