We have a multi-DC setup with 3 nodes in DC1 and 3 in DC2. Both DCs are in the same region and the same subnet; we split them because we require separate clusters for reading and writing data.
Our setup creates a new table with CDC enabled in DC1 every day at 12 midnight.
At the same time, we also create a Kafka source connector every day to consume the CDC logs from DC2.
Issue:
At around 12 midnight, when the new source connector is created, we observe that the ScyllaDB servers in DC2 consume 100% CPU.
We increased the CPU from 16 cores to 32, but the behavior is still the same.
Once the Kafka connector creates its topic and starts reading data from the CDC log, ScyllaDB CPU usage cools down.
Logs:
The syslog shows reader_concurrency_semaphore messages for that time period.
Any expert thoughts are appreciated.
Thanks in advance.
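To show how closely the semaphore messages line up with the CPU spike, the matching syslog lines can be counted per minute. This is only a minimal sketch: it assumes classic syslog timestamps at the start of each line ("Mon DD HH:MM:SS"), and the function name is made up for illustration.

```python
import re
from collections import Counter

# Assumption: each line starts with a classic syslog timestamp,
# e.g. "Jan  5 00:01:23". Other syslog formats need a different pattern.
TS_MINUTE = re.compile(r"^([A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}):\d{2}")


def count_semaphore_lines(lines, needle="reader_concurrency_semaphore"):
    """Count lines containing `needle`, grouped by minute ('Mon DD HH:MM')."""
    per_minute = Counter()
    for line in lines:
        if needle not in line:
            continue
        match = TS_MINUTE.match(line)
        # Lines without a recognisable timestamp are grouped under "unknown".
        key = match.group(1) if match else "unknown"
        per_minute[key] += 1
    return per_minute
```

It can be fed an open file handle, for example `count_semaphore_lines(open("/var/log/syslog", errors="replace"))`, where the path is an assumption about where the ScyllaDB messages end up on the host.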
Here is the link, along with the logs and other necessary details you need.
The reader_concurrency_semaphore log only appears for 15 minutes when a new Kafka Debezium source connector spawns. CPU also goes high during that time.
Exact details are available in the link; please go through it and help us understand the exact issue.
We managed to shorten the 15 minutes of high CPU by tweaking scylla.query.time.window.size in the Kafka source connector, but we still have no clue about the cause of the 100% CPU utilisation.
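For reference, this is roughly where that setting sits in a connector definition. It is a sketch, not our exact config: the property names are as I recall them from the Scylla CDC Source Connector README and should be checked against the connector version in use, and the helper name and all values (connector name, table, addresses, window size) are placeholders.

```python
import json


def build_connector_config(name, table, contact_points, local_dc, window_ms):
    """Build a Kafka Connect definition for a Scylla CDC source connector.

    Property names are assumed from the connector README; verify them
    against the version you run. All argument values are placeholders.
    """
    return {
        "name": name,
        "config": {
            "connector.class": "com.scylladb.cdc.debezium.connector.ScyllaConnector",
            "scylla.name": name,
            # Comma-separated host:port contact points.
            "scylla.cluster.ip.addresses": ",".join(contact_points),
            # keyspace.table of the CDC-enabled base table.
            "scylla.table.names": table,
            # Datacenter the connector should prefer for its queries.
            "scylla.local.dc": local_dc,
            # Size (in ms) of the time window used when querying the CDC log.
            "scylla.query.time.window.size": str(window_ms),
        },
    }


if __name__ == "__main__":
    cfg = build_connector_config(
        "example-connector", "ks.events", ["10.0.0.1:9042"], "dc2", 60000
    )
    print(json.dumps(cfg, indent=2))
```

The resulting JSON is what would be posted to the Kafka Connect REST API when the daily connector is created.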
We also observed that, with our 6-node DC2 cluster and 4-node Kafka connector setup, there are a total of 40000 active connections on each ScyllaDB node.
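To put that number in perspective, here is a rough back-of-the-envelope breakdown. It assumes all 40000 connections come from the connector, that they are spread evenly over the 4 connector nodes, and that each ScyllaDB node runs one shard per core (32); none of these is confirmed, so treat the result as an estimate only.

```python
def connections_breakdown(conns_per_node, connector_workers, shards_per_node):
    """Split a per-node connection count by connector worker and by shard.

    Assumes an even spread across workers and shards, which is a
    simplification made for this estimate.
    """
    per_worker = conns_per_node / connector_workers
    per_worker_per_shard = per_worker / shards_per_node
    return per_worker, per_worker_per_shard


per_worker, per_worker_per_shard = connections_breakdown(40000, 4, 32)
print(per_worker)            # 10000.0 connections per connector node
print(per_worker_per_shard)  # 312.5 connections per connector node per shard
```

Under these assumptions each connector node holds about 10000 connections to every ScyllaDB node, or roughly 312 per shard.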