Originally from the User Slack
@Shivaprasad_Bhat**:** Hello.. I added a node to an existing Scylla cluster.. Running nodetool status
on any of the cluster nodes shows all nodes as UN
..
Datacenter: ap-south
====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UJ 70.0.128.94 ? 256 ? ? 1a
UN 70.0.133.60 ? 256 ? 065c68bc-2a0a-4e1b-a7f7-bf01e3fed28e 1a
UN 70.0.159.148 ? 256 ? 8ecdcd3f-2ed3-4bd4-b781-ee51235e5af1 1b
UN 70.0.171.148 ? 256 ? 91e2c9db-00a4-489b-b9e3-b10b790d0ec4 1c
It’s been ~4 hours and all I see is this. Not even sure if it’s working
(we have ~90GB of data).
I see these logs:
Aug 28 17:20:02 ip-80-0-128-94 scylla[5843]: [shard 1:strm] large_data - Writing large partition candle_service_db/candles: (6232244 bytes) to md-3gt7_196b_10ng02h84jpyt6krtt-big-Data.db
Aug 28 17:20:14 ip-80-0-128-94 scylla[5843]: [shard 0:strm] large_data - Writing large partition candle_service_db/candles: (5590834 bytes) to md-3gt7_19d8_2fg5c2x1ioazjbqbj5-big-Data.db
Aug 28 17:20:20 ip-80-0-128-94 scylla[5843]: [shard 1:strm] large_data - Writing large partition candle_service_db/candles: (6316143 bytes) to md-3gt7_196b_10ng02h84jpyt6krtt-big-Data.db
Any thoughts here?
@avi**:** Check the Advanced dashboard and look for Streaming I/O bandwidth and CPU; it shows the bandwidth used for streaming.
But if it’s just 90GB, it’s probably stuck. What version are you running?
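For a quick command-line check of streaming progress (in addition to the Monitoring dashboard), `nodetool netstats` can be run on any node; the exact output format varies by Scylla version, so treat this as a rough sketch:

```
# Lists active streaming sessions and bytes transferred so far; if nothing is
# streaming for a long time while a node sits in UJ, the join may be stuck.
nodetool netstats
```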
@Shivaprasad_Bhat**:** I let it run and went to sleep; it was done by morning. It might have been because of the small node size we have.
We are running Scylla 6.0.0
@avi**:** It should have completed in a few minutes
@Shivaprasad_Bhat**:** We had a cluster of 4 nodes with 2 cores each (i4i.large). I have upgraded this to a 3-node cluster with 4 cores each (i4i.xlarge).
Could this time be related to having very large partitions? We have time-series-like data, but partitions are not time bucketed, so they are ever-growing (many partitions are around 70MB, checked from the large_partitions system table). Also, compaction is set to STCS, which is not ideal for time series... would that compaction strategy affect the time as well?
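As an aside, the large partitions mentioned here can be listed directly; a minimal sketch, assuming the `system.large_partitions` table is keyed by (keyspace_name, table_name) as in recent Scylla versions, and using the keyspace/table names from the log lines above:

```
# Show recorded large partitions for the candles table mentioned in the logs.
cqlsh -e "SELECT * FROM system.large_partitions
          WHERE keyspace_name = 'candle_service_db' AND table_name = 'candles';"
```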
@avi**:** 70MB is not very large, and should not affect streaming performance.
STCS is not ideal but should also not be a problem.
@Shivaprasad_Bhat**:** Ohh. Overall, I have seen that whenever I try to add a node, compaction runs for a few hours first, and then streaming starts, which also takes a few hours. Not sure if it was because we had 2-core nodes before, which were not enough (during this, CPU is maxed out on all nodes).
@avi**:** Compacting 70GB should take ~15 minutes, not hours
What hardware is this?
@Shivaprasad_Bhat**:** i4i.large on AWS. Nitro SSD, 2 vCPU, 16GB memory.
Any pointers on what other factors would impact compaction time?
@avi**:** It should run at 10-40 MB/s/shard, depending on the data model
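As a rough sanity check of that figure (the logs above show shards 0 and 1, so assume 2 shards on the i4i.large; actual throughput depends on the data model):

```
# Minutes to rewrite ~70GB with 2 shards at the fast and slow ends of the range.
echo $(( 70 * 1024 / (40 * 2) / 60 ))   # 14 minutes at 40 MB/s/shard
echo $(( 70 * 1024 / (10 * 2) / 60 ))   # 59 minutes at 10 MB/s/shard
```

Even at the slow end that is about an hour, in line with the "~15 minutes, not hours" estimate above.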
@Shivaprasad_Bhat**:** I’m suspecting an issue with the data model itself (we also see random latency spikes, which grow as the data size grows). It’s pure time-series data, but with STCS and no time bucketing.
https://scylladb-users.slack.com/archives/C2NLNBXLN/p1757748173227739
The use case is similar to this one. Let me know your thoughts / suggestions on this as well. I’m also going through the Scylla data modeling course.
@avi**:** If you have tiny cells and/or huge amounts of tombstones, it can slow down compaction.
You can try moving to i7i instances which have faster CPUs
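For reference, the time-bucketing plus TWCS approach touched on above could look roughly like this; a sketch only, with an illustrative table name and columns that are not from the thread:

```
cqlsh <<'EOF'
-- Hypothetical time-bucketed layout for candle data: adding a day bucket to
-- the partition key caps partition growth, and TWCS groups each window's
-- data into its own SSTables so old windows stop being recompacted.
CREATE TABLE candle_service_db.candles_by_day (
    symbol text,
    day    date,
    ts     timestamp,
    open   double,
    close  double,
    PRIMARY KEY ((symbol, day), ts)
) WITH CLUSTERING ORDER BY (ts DESC)
  AND compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': '1'
  };
EOF
```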