Hi @Stewart_Han,
It is not about CPU shares. ScyllaDB uses the workload type to decide how to react when the system is under excessive load.
For batch workloads, where the concurrency is bounded, we can throttle the load by delaying answers to the client. Since each client thread waits for a reply before sending its next request, this in turn makes it send fewer requests per second.
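To put rough numbers on why delaying replies throttles a bounded-concurrency client, here is a small back-of-the-envelope sketch. It is not ScyllaDB code; the function name, the thread count and the latencies are made up for illustration, and it assumes every client thread keeps exactly one request in flight.

```python
def closed_loop_throughput(client_threads: int, latency_s: float) -> float:
    """Requests per second offered by a closed-loop client.

    Assumption: each thread sends its next request only after the
    previous one was answered, so one thread sends 1 / latency_s
    requests per second.
    """
    return client_threads / latency_s


# Hypothetical numbers: 100 client threads, 10 ms normal reply latency.
normal = closed_loop_throughput(100, 0.010)

# The server holds every reply back by an extra 40 ms.
throttled = closed_loop_throughput(100, 0.010 + 0.040)

print(f"normal:    {normal:.0f} req/s")
print(f"throttled: {throttled:.0f} req/s")
```

With the same number of threads, the offered load drops by the same factor the latency grows, which is why this only works when the concurrency is bounded.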
For interactive workloads, throttling doesn’t make much sense, since the requests are not queued on the client side. The only resort is to fail early and signal to the client that we are overloaded (OverloadedException). This behaviour is a little bit speculative, since we need to predict in advance whether we will be able to serve a specific request within the timeout limit.
Here is a pointer to the code where we drop requests for interactive workloads if we predict that we will not be able to meet the timeout:
If the workload is not interactive, ScyllaDB will just continue normally, assuming that the workload will converge to the maximum possible throughput.
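To make the decision above a bit more concrete, here is a simplified sketch of the idea. It is not the actual ScyllaDB implementation: the `admit` function, the `OverloadedError` class and the way the wait is estimated (queue length times average service time) are all assumptions for illustration only.

```python
class OverloadedError(Exception):
    """Stand-in for the OverloadedException that the client would receive."""


def admit(workload_type: str, queued_requests: int,
          avg_service_time_s: float, timeout_s: float) -> None:
    """Fail early for interactive workloads that would likely time out.

    Hypothetical estimate: the new request has to wait for everything
    already queued, plus its own service time.
    """
    predicted_wait_s = (queued_requests + 1) * avg_service_time_s

    if workload_type == "interactive" and predicted_wait_s > timeout_s:
        # Shed the request now instead of letting it time out later.
        raise OverloadedError(
            f"predicted wait {predicted_wait_s:.3f}s exceeds "
            f"timeout {timeout_s:.3f}s"
        )
    # Any other workload type: continue normally and let the load
    # converge to whatever throughput the system can sustain.


# Same backlog, different workload types (made-up numbers).
admit("batch", queued_requests=1000, avg_service_time_s=0.01, timeout_s=1.0)
try:
    admit("interactive", queued_requests=1000,
          avg_service_time_s=0.01, timeout_s=1.0)
except OverloadedError as exc:
    print("rejected:", exc)
```

The point is only that the check is a prediction made before the work is done, so an interactive request can be rejected even though it might have finished in time.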