Will `Compaction` clean up all existing tombstones on the node?

I have a question. Compaction cleans up tombstones. When we proactively run a compaction, will it clean up all the tombstones that have already been generated, or only some of them?
I see this sentence here: "When compaction occurs, the data will be expunged completely and the corresponding disk space recovered."

A tombstone typically has data that it shadows. For example, consider:

INSERT INTO ks.t (key, val) VALUES (0, 0);

At a later time you then issue:

DELETE FROM ks.t WHERE key=0;

In that case, your data may live on a given SSTable, whereas the tombstone will live in another.

When compaction processes the SSTable containing the data together with the SSTable containing the tombstone, your data will be evicted. If the tombstone has already expired (i.e., it is past gc_grace_seconds), then the tombstone will be evicted as well. Otherwise, the tombstone will be kept, to ensure you have time to run a repair in case that tombstone didn’t reach other nodes as part of your original write.
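To make the rule above concrete, here is a minimal Python sketch of it. This is a simplified toy model, not how the database actually implements compaction: the `Cell` class, the `compact` function, and the integer timestamps are all hypothetical names made up for illustration, and the model ignores everything except "newest write wins" and the gc_grace_seconds check.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for one entry in an SSTable."""
    key: int
    write_time: int        # seconds on an arbitrary clock
    value: Optional[int]   # None marks a tombstone


def compact(sstables: List[List[Cell]], now: int, gc_grace_seconds: int) -> List[Cell]:
    """Toy compaction: keep the newest cell per key, which drops shadowed
    data, then drop tombstones that are past gc_grace_seconds."""
    newest = {}
    for sstable in sstables:
        for cell in sstable:
            current = newest.get(cell.key)
            if current is None or cell.write_time > current.write_time:
                newest[cell.key] = cell

    result = []
    for cell in newest.values():
        is_tombstone = cell.value is None
        expired = now >= cell.write_time + gc_grace_seconds
        if is_tombstone and expired:
            continue  # expired tombstone: evicted along with the data it shadowed
        result.append(cell)  # live data, or a tombstone still inside the grace period
    return result


GC_GRACE = 10 * 24 * 60 * 60  # 10 days, expressed in seconds

data_sstable = [Cell(key=0, write_time=100, value=0)]          # the INSERT
tombstone_sstable = [Cell(key=0, write_time=200, value=None)]  # the DELETE

# Compaction shortly after the delete: data is gone, tombstone is kept.
before_expiry = compact([data_sstable, tombstone_sstable],
                        now=300, gc_grace_seconds=GC_GRACE)

# Compaction after the grace period: data and tombstone are both gone.
after_expiry = compact([data_sstable, tombstone_sstable],
                       now=200 + GC_GRACE + 1, gc_grace_seconds=GC_GRACE)
```

In this sketch `before_expiry` still holds the tombstone for key 0 (but not the original value), while `after_expiry` is empty.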

If I execute nodetool compact on node A, will all tombstones generated on that node be cleared, or does nodetool compact only clean up expired tombstones? That is to say, can tombstones only be cleared after they expire? If gc_grace_seconds is 10 days, will a tombstone only be cleared after 10 days, whether or not I have executed nodetool compact within those 10 days? Thanks!

When you run a major compaction (nodetool compact), all expired tombstones will be evicted, as they will be compacted along with the data they shadow. Non-expired tombstones will still be kept, but this shouldn’t be a problem: tombstones are really small, and since you already performed a major compaction, the data they shadow has already been evicted.

As a result, once the tombstone expires after your major compaction, a later compaction will eventually evict it.
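For the 10-day case asked about above, the timing can be worked out with a few lines of Python. This is only an arithmetic sketch: the helper names `purgeable_at` and `evicted_by_major` and the example dates are hypothetical, and it assumes the tombstone's expiry is simply the deletion time plus gc_grace_seconds.

```python
from datetime import datetime, timedelta, timezone


def purgeable_at(deleted_at: datetime, gc_grace_seconds: int) -> datetime:
    """Earliest moment at which the tombstone counts as expired."""
    return deleted_at + timedelta(seconds=gc_grace_seconds)


def evicted_by_major(deleted_at: datetime, compaction_time: datetime,
                     gc_grace_seconds: int) -> bool:
    """True if a major compaction at compaction_time would drop the tombstone."""
    return compaction_time >= purgeable_at(deleted_at, gc_grace_seconds)


gc_grace = 10 * 24 * 60 * 60                             # 10 days in seconds
deleted_at = datetime(2024, 1, 1, tzinfo=timezone.utc)   # hypothetical DELETE time

day_3 = deleted_at + timedelta(days=3)    # nodetool compact run on day 3
day_11 = deleted_at + timedelta(days=11)  # nodetool compact run on day 11

kept_on_day_3 = not evicted_by_major(deleted_at, day_3, gc_grace)
dropped_on_day_11 = evicted_by_major(deleted_at, day_11, gc_grace)
```

Under these assumptions, a nodetool compact on day 3 removes the shadowed data but keeps the tombstone, and a compaction on day 11 is able to drop the tombstone as well.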
