ScyllaDB node spikes at 00h00 UTC

nuno · August 5, 2025, 10:32am

Installation details
#ScyllaDB version: 5.0.5-0.20221009
#Cluster size: 6 nodes (3 in us-east-1, 3 in us-east-2), replication factor 3
#OS: debian10-base-amd64-202408061511

Hello!

We see ScyllaDB load/latency spikes on some nodes every day shortly after 00h00 UTC; which nodes are affected appears random. Each spike lasts about 30 seconds per node, and during that period all queries to that node time out, as if the node were unavailable.

In the metrics of the EC2 instance, we see this load spike:

Compactions run throughout the day, and nothing out of the ordinary is running at this time. We also don’t see a spike in throughput; in fact, it is decreasing at this time of day. We don’t have any jobs (backup, repair, etc.) scheduled for this time frame.

Do you have any idea why this might be occurring? We are running out of ideas…

Thank you!

avikivity · August 5, 2025, 1:35pm

5.0.5 reached end-of-life ages ago.

The problem you’re describing is likely due to fstrim running to discard unused disk space. It was replaced (in the 5.1 timeframe) by online discard (see dist: scylla_raid_setup: mount XFS with online discard · scylladb/scylladb@a19d00e · GitHub). Note that upgrading won’t transition the nodes to online discard; you have to either apply the changes manually or bootstrap new nodes and decommission the old ones.
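
One way to see whether a node already uses online discard is to look for the `discard` option on the data volume's mount entry. The following is only a minimal sketch: it assumes a Linux host, that the data volume is mounted at `/var/lib/scylla` (adjust if your layout differs), and the helper names `mount_options` and `uses_online_discard` are made up for illustration.

```python
from pathlib import Path
from typing import Optional, Set


def mount_options(mounts_text: str, mount_point: str) -> Optional[Set[str]]:
    """Return the mount options for mount_point, or None if it is not listed."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            # Keep the last matching entry, since later mounts shadow earlier ones.
            found = set(fields[3].split(","))
    return found


def uses_online_discard(mounts_text: str, mount_point: str = "/var/lib/scylla") -> bool:
    """True if mount_point is mounted with the 'discard' option."""
    # The default mount point is an assumption; pass your own if it differs.
    opts = mount_options(mounts_text, mount_point)
    return opts is not None and "discard" in opts


if __name__ == "__main__":
    proc_mounts = Path("/proc/mounts")
    if proc_mounts.exists():
        print("online discard:", uses_online_discard(proc_mounts.read_text()))
```

If this reports that online discard is not in use, that alone does not show that a periodic trim is running; it only tells you the mount was not set up with the newer option.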

nuno · August 5, 2025, 3:21pm

Thank you for your reply.
We’ve checked and fstrim is disabled.
Nevertheless, upgrading the ScyllaDB version seems like a good idea.
