Originally from the User Slack
@Chaitanya_Tondlekar: What does the error below mean? [shard 3] reader_concurrency_semaphore
Dec 06 06:46:15 NODENAME scylla[407550]: [shard 3] reader_concurrency_semaphore - (rate limiting dropped 12 similar messages) Semaphore _read_concurrency_sem with 76/100 count and 2247533/167604387 memory resources: timed out, dumping permit diagnostics:
permits count memory table/description/state
73 73 2119K Keyspacename.tablename/data-query/inactive
1 1 31K Keyspacename.tablename/data-query/active/unused
1 1 26K Keyspacename.tablename/data-query/active/used
1 1 18K Keyspacename.tablename/mutation-query/inactive
1 0 0B Keyspacename.tablename/mutation-query/waiting
121 0 0B Keyspacename.tablename/data-query/waiting
98 0 0B Keyspacename.tablename/data-query/waiting
11 0 0B Keyspacename.tablename/data-query/waiting
307 76 2195K total
Total: 307 permits with 76 count and 2195K memory resources
Dec 06 06:46:47 NODENAME scylla[407550]: [shard 3] reader_concurrency_semaphore - (rate limiting dropped 21 similar messages) Semaphore _read_concurrency_sem with 90/100 count and 3156476/167604387 memory resources: timed out, dumping permit diagnostics:
permits count memory table/description/state
86 86 2561K Keyspacename.tablename/data-query/inactive
4 4 522K Keyspacename.tablename/data-query/active/used
1 0 0B system.local/shard-reader/waiting
6 0 0B Keyspacename.tablename/mutation-query/waiting
143 0 0B Keyspacename.tablename/data-query/waiting
15 0 0B Keyspacename.tablename/data-query/waiting
112 0 0B Keyspacename.tablename/data-query/waiting
367 90 3082K total
Total: 367 permits with 90 count and 3082K memory resources
@Guy ^^
Scylla version : 5.2.19
Compaction strategy : SizeTieredCompactionStrategy
Latencies are shooting up into the seconds, while the read and write rates are unchanged.
Cache hits are also very high.
@avi, can you help here?
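A minimal sketch for reading a dump like the one above: it pulls the used/limit numbers out of the semaphore header line and adds up the permits column per state. It assumes the log layout is exactly as pasted here; `parse_header` and `permits_by_state` are illustrative names, not part of any Scylla tooling.

```python
import re
from collections import defaultdict

# Matches the "76/100 count and 2247533/167604387 memory resources" part.
HEADER_RE = re.compile(
    r"with (?P<count>\d+)/(?P<count_max>\d+) count and "
    r"(?P<mem>\d+)/(?P<mem_max>\d+) memory resources"
)
# Matches table rows such as "73 73 2119K ks.table/data-query/inactive".
ROW_RE = re.compile(
    r"^(?P<permits>\d+)\s+\d+\s+\S+\s+(?P<desc>[^/\s]+/[^/\s]+/\S+)$"
)

def parse_header(line):
    """Return used/limit values for count and memory from the header line."""
    m = HEADER_RE.search(line)
    if m is None:
        raise ValueError("not a semaphore header line")
    return {name: int(value) for name, value in m.groupdict().items()}

def permits_by_state(lines):
    """Sum the 'permits' column per state (inactive, waiting, active/used, ...)."""
    totals = defaultdict(int)
    for line in lines:
        m = ROW_RE.match(line.strip())
        if m is None:
            continue  # column header and "total" rows do not match
        # desc is table/operation/state; the state itself may contain "/"
        state = m.group("desc").split("/", 2)[2]
        totals[state] += int(m.group("permits"))
    return dict(totals)

# Lines copied from the first dump in this thread.
HEADER = ("Semaphore _read_concurrency_sem with 76/100 count and "
          "2247533/167604387 memory resources: timed out, dumping permit diagnostics:")
ROWS = [
    "permits count memory table/description/state",
    "73 73 2119K Keyspacename.tablename/data-query/inactive",
    "1 1 31K Keyspacename.tablename/data-query/active/unused",
    "1 1 26K Keyspacename.tablename/data-query/active/used",
    "1 1 18K Keyspacename.tablename/mutation-query/inactive",
    "1 0 0B Keyspacename.tablename/mutation-query/waiting",
    "121 0 0B Keyspacename.tablename/data-query/waiting",
    "98 0 0B Keyspacename.tablename/data-query/waiting",
    "11 0 0B Keyspacename.tablename/data-query/waiting",
    "307 76 2195K total",
]

header = parse_header(HEADER)
print("count used:", header["count"] / header["count_max"])
print("memory used:", header["mem"] / header["mem_max"])
print(permits_by_state(ROWS))
```

On the first dump this puts 231 of the 307 permits in the waiting state and 74 in inactive, while memory use is a small fraction of its limit.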
@Guy: Hey @Chaitanya_Tondlekar, did you figure this out?
@Chaitanya_Tondlekar: Nope, we still can't make sense of it.
@avi: You probably have a hot partition. Try the nodetool toppartitions command to find out which.
@Chaitanya_Tondlekar:
nodetool toppartitions
READS Sampler:
Cardinality: ~256 (256 capacity)
Top 10 partitions:
Partition Count +/-
(Keyspacename:table_name) 8a9165b9-1e1d-4fbb-9bf3-d3fe7129337f 82 24
(Keyspacename:table_name) 29fdf43d-a7b3-40f5-b9f9-a15d1aaa3298 69 62
(Keyspacename:table_name) 3750a8ff-9b44-4c2c-bfc7-7bc20efd8451 68 56
(Keyspacename:table_name) 129425287 67 62
(Keyspacename:table_name) 251448133 66 63
(Keyspacename:table_name) 14874611 66 63
(Keyspacename:table_name) 183017481 66 63
(Keyspacename:table_name) 234159255 66 63
(Keyspacename:table_name) 55127388 66 63
(Keyspacename:table_name) 434705854 66 63
WRITES Sampler:
Cardinality: ~256 (256 capacity)
Top 10 partitions:
Partition Count +/-
(Keyspacename:table_name) 225719967 178 177
(Keyspacename:table_name) 133822383 176 175
(Keyspacename:table_name) 286389998 175 173
(Keyspacename:table_name) 234924349 175 174
(Keyspacename:table_name) 329955233 175 174
(Keyspacename:table_name) 415441379 175 174
(Keyspacename:table_name) 369115272 175 174
(Keyspacename:table_name) 430349742 175 174
(Keyspacename:table_name) 329128616 175 174
(Keyspacename:table_name) 9661afd7-c372-4941-acc9-2a35c126599a 175 174
@avi: Add -s 2000 to get better discrimination.
@Chaitanya_Tondlekar:
nodetool toppartitions -s 2000
READS Sampler:
Cardinality: ~2000 (2000 capacity)
Top 10 partitions:
Partition Count +/-
(Keyspacename:tablename) 85e12246-763e-4c3c-bd2f-1d8e3b56189e 15 0
(Keyspacename:tablename) 3dd943ae-4cbd-46ac-ba7a-91738114090e 12 3
(Keyspacename:tablename) 284444143 12 6
(Keyspacename:tablename) 399914274 12 7
(Keyspacename:tablename) 194684792 11 6
(Keyspacename:tablename) 355442722 11 7
(Keyspacename:tablename) 118072289 11 7
(Keyspacename:tablename) 143560999 11 8
(Keyspacename:tablename) 122656064 11 8
(Keyspacename:tablename) 426522453 11 8
WRITES Sampler:
Cardinality: ~2000 (2000 capacity)
Top 10 partitions:
Partition Count +/-
(Keyspacename:tablename) 283800415 26 12
(Keyspacename:tablename) 427671091 24 21
(Keyspacename:tablename) 98204371 24 23
(Keyspacename:tablename) 115037662 24 23
(Keyspacename:tablename) 434706656 24 23
(Keyspacename:tablename) 314526053 24 23
(Keyspacename:tablename) 377740394 24 23
(Keyspacename:tablename) 194061696 24 23
(Keyspacename:click_anon) 5c1413bc-1d11-430c-9998-a843fa3455e5 24 23
(Keyspacename:tablename) 114357021 24 23
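A rough way to check a toppartitions sample for a standout key, sketched under assumptions: the +/- column is treated as the sampler's error bound, and the "at least 3x the median count" threshold is an arbitrary heuristic chosen for illustration, not a Scylla rule. `parse_sampler` and `standout` are hypothetical helper names.

```python
import re
from statistics import median

# Matches rows such as "(Keyspacename:tablename) 284444143 12 6".
ROW_RE = re.compile(
    r"^\((?P<table>[^)]+)\)\s+(?P<key>\S+)\s+(?P<count>\d+)\s+(?P<error>\d+)$"
)

def parse_sampler(lines):
    """Return (key, count, error) tuples for every partition row."""
    rows = []
    for line in lines:
        m = ROW_RE.match(line.strip())
        if m:
            rows.append((m.group("key"), int(m.group("count")), int(m.group("error"))))
    return rows

def standout(rows, factor=3.0):
    """Keys whose lower bound (count - error) is at least factor * median count.

    The factor is an assumed threshold, used here only for illustration.
    """
    if not rows:
        return []
    typical = median(count for _, count, _ in rows)
    return [key for key, count, error in rows if count - error >= factor * typical]

# READS sampler rows copied from the "-s 2000" run in this thread.
READS = [
    "(Keyspacename:tablename) 85e12246-763e-4c3c-bd2f-1d8e3b56189e 15 0",
    "(Keyspacename:tablename) 3dd943ae-4cbd-46ac-ba7a-91738114090e 12 3",
    "(Keyspacename:tablename) 284444143 12 6",
    "(Keyspacename:tablename) 399914274 12 7",
    "(Keyspacename:tablename) 194684792 11 6",
    "(Keyspacename:tablename) 355442722 11 7",
    "(Keyspacename:tablename) 118072289 11 7",
    "(Keyspacename:tablename) 143560999 11 8",
    "(Keyspacename:tablename) 122656064 11 8",
    "(Keyspacename:tablename) 426522453 11 8",
]

rows = parse_sampler(READS)
print(standout(rows))
```

With the reads sample above, no key clears the threshold: the counts sit between 11 and 15, so none stands out from the rest.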
@avi: Looks like you don’t have a hot partition.
Look at the per-shard CPU panel in the advanced dashboard.
@Chaitanya_Tondlekar: While the issue was happening, we saw cache hits stay high for a long time, then come back down on their own after a while.
@avi: This is typical of a hot partition. Maybe the call to nodetool was after the event passed.
@Chaitanya_Tondlekar: cool