Need help with an explanation of the rlatencyp95 metrics exposed by Scylla

ChaitanyaT · April 24, 2024, 9:47am

We are facing latency issues. While debugging in Grafana, we see that rlatencyp95 is exposed at the instance, shard, cluster, and DC levels.
When the issue occurs, rlatencyp95 at the instance level still appears to be in microseconds, but at the cluster level it spikes into seconds.
What is the difference between the instance-level and cluster-level metrics? Below is the query I used to debug the latency issue. I wanted a breakdown at the instance level to identify whether any particular instance is causing it.

avg(rlatencyp99{by="cluster", instance, cluster=~"scyl-test", scheduling_group_name!="streaming"} > 0) by (cluster, instance)

But after applying this query, I observed that the latency shown at the instance level is still in microseconds, while the cluster-level metric still shows 30 seconds.
Is there any official documentation that explains the scope of these metrics?
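
As a side note on why percentile series at different levels need not agree: a percentile is not an average, so averaging per-instance p95 values, taking their maximum, and computing a p95 over all requests combined can give answers that differ by orders of magnitude. The sketch below is only an illustration with made-up latencies and hypothetical node names; it does not reproduce how Scylla or the monitoring stack derives these series, and it is not a diagnosis of the spike described here.

```python
def percentile(samples, q):
    """Nearest-rank percentile: smallest value with at least q% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, (q * len(ordered) + 99) // 100)  # integer ceiling of q/100 * n
    return ordered[rank - 1]

# Hypothetical read latencies in microseconds (not taken from this thread).
latencies_us = {
    "node-1": [400] * 1000,                      # every read is fast
    "node-2": [400] * 900 + [30_000_000] * 100,  # 10% of reads take 30 s
}

# p95 computed separately for each instance.
per_instance_p95 = {name: percentile(v, 95) for name, v in latencies_us.items()}

# Three different ways to roll those up to one "cluster" number.
avg_of_p95 = sum(per_instance_p95.values()) / len(per_instance_p95)
max_of_p95 = max(per_instance_p95.values())
pooled_p95 = percentile([x for v in latencies_us.values() for x in v], 95)

print(per_instance_p95)
print(avg_of_p95, max_of_p95, pooled_p95)
```

With this data, node-2 alone reports a p95 of 30 s, the average of the two p95 values is about 15 s, and the p95 over the pooled requests is still 400 µs, because the slow reads are only 5% of the combined traffic. Which of these a dashboard panel shows depends on how its query aggregates, so the aggregation is worth checking before comparing levels.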

ChaitanyaT · April 24, 2024, 10:33am

The snapshot graph above covers 2 Scylla nodes.

Marcin_Maliszkiewicz · May 8, 2024, 12:30pm

Hi! Hot partition or other imbalance is the first thing to check, please experiment with:

In general, it is also better to debug using Scylla Monitoring, because we have a lot of specialized graphs for such things (e.g. it’s better to view the whole cluster at a per-shard (vCPU) level).
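
To illustrate the kind of imbalance check a per-shard view makes possible, here is a minimal sketch that flags shards whose p95 is far above the cluster median. The data, the node names, the `find_outlier_shards` helper, and the 10x threshold are all assumptions made for the example; in practice the values would come from the per-shard latency graphs.

```python
from statistics import median

# Hypothetical per-shard read p95 in microseconds, keyed by (instance, shard).
shard_p95_us = {
    ("node-1", 0): 420, ("node-1", 1): 390, ("node-1", 2): 410, ("node-1", 3): 405,
    ("node-2", 0): 415, ("node-2", 1): 30_000_000, ("node-2", 2): 400, ("node-2", 3): 395,
}

def find_outlier_shards(values, factor=10):
    """Return keys whose value exceeds factor x the median, worst first."""
    baseline = median(values.values())
    flagged = [key for key, v in values.items() if v > factor * baseline]
    return sorted(flagged, key=lambda key: values[key], reverse=True)

outliers = find_outlier_shards(shard_p95_us)
print(outliers)
```

A single shard standing out like this would point toward a hot partition or similar imbalance, whereas every shard on every node rising together would suggest looking elsewhere.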

ChaitanyaT · May 8, 2024, 12:48pm

Marcin, the problem here is that this is happening on all nodes in the same GCP project, irrespective of cluster.
So we feel it is not a workload issue. Could a network-related issue cause this?
