Large I/O starvation time with low I/O queue delay


In our load testing we are currently seeing a large disk query starvation time (up to 30 ms) together with a low disk query I/O queue delay (< 1 ms).

This combination is not mentioned in the monitoring dashboards. What does this indicate?


Hard to tell without additional info.
Can you please follow the "How to Report a ScyllaDB Problem" guide in the ScyllaDB Docs and share more info?

Sure, we can get more stats. I just did not want to overload you with specifics.

I was more interested in understanding the different metrics. Here is our understanding of the metrics:

  • Disk starvation time - The time the request waited (where?) before it was dispatched to the Seastar I/O queue?
  • I/O Queue delay - How long the request waited in the Seastar I/O queue?
  • Disk I/O Queue Delay - The time the actual I/O took? That is, from the point when the I/O request was sent to the operating system until Seastar got a response?
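
To make our reading concrete, here is a minimal Python sketch of how we picture the three intervals. It only models our assumption about what the metrics mean; it is not how ScyllaDB or Seastar actually computes them, and the names `RequestTimestamps` and `breakdown` as well as the timestamp values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RequestTimestamps:
    """Hypothetical timestamps (in ms) for a single disk request."""
    created: float     # request issued by the query path
    queued: float      # request entered the Seastar I/O queue (assumed)
    dispatched: float  # request handed to the operating system (assumed)
    completed: float   # Seastar received the completion (assumed)


def breakdown(ts: RequestTimestamps) -> dict:
    """Split the request latency into the three intervals as we understand them."""
    return {
        # Our reading of "Disk starvation time": wait before reaching the queue.
        "starvation_ms": ts.queued - ts.created,
        # Our reading of "I/O Queue delay": wait inside the Seastar I/O queue.
        "queue_delay_ms": ts.dispatched - ts.queued,
        # Our reading of "Disk I/O Queue Delay": time spent in the OS and disk.
        "disk_io_ms": ts.completed - ts.dispatched,
    }


# Made-up numbers shaped like what we observe: long starvation, short queue delay.
example = RequestTimestamps(created=0.0, queued=30.0, dispatched=30.5, completed=38.5)
print(breakdown(example))
```

With these made-up numbers, almost all of the latency falls before the request reaches the queue, which is the pattern we are trying to understand.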

Is this explanation correct?

It's worth noting that we have the kernel block cache activated, as we run on HDDs here.

Are you interested in more metrics? If so, we’ll provide more detailed metrics.