Hi all. Can you give me some advice?
[cqlsh 5.0.1 | Cassandra 3.0.8 | CQL spec 3.3.1 | Native protocol v4]
docker-compose exec scylla_node scylla --version
5.2.7-0.20230821.e0ebc95025d1
I'm using multiprocessing + asyncio to run non-blocking requests in parallel. The database queries are executed repeatedly in a loop.
Each process has its own cluster + session. Everything works fine for a while, but then the SELECT queries start failing with:
Server error: code=1200 [Coordinator node timed out waiting for responses from replica nodes] message="Operation timed out for directory.candles - only 0 responses received from 1 CL=LOCAL_ONE." info={'consistency': 'LOCAL_ONE', 'required_responses': 1, 'received_responses': 0}
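The client layout described above (one event loop and one session per process) can be sketched roughly as follows. This is only an illustration under assumptions, not the actual code: `StubSession` and `select_candles` are hypothetical stand-ins for a real driver session and query, and the `asyncio.Semaphore` cap on in-flight requests is an illustrative addition that the setup above may or may not have.

```python
import asyncio
import multiprocessing as mp


class StubSession:
    """Hypothetical stand-in for a per-process driver session (not a real driver API)."""

    async def select_candles(self, symbol):
        await asyncio.sleep(0.001)  # simulate a network round trip
        return [(symbol, 1.0)]


async def query_loop(symbols, rounds, max_in_flight):
    """Run `rounds` passes of SELECTs with at most `max_in_flight` running at once."""
    session = StubSession()  # each process builds its own session
    limit = asyncio.Semaphore(max_in_flight)
    in_flight = 0
    peak = 0
    done = 0

    async def one_query(symbol):
        nonlocal in_flight, peak, done
        async with limit:  # wait here instead of piling requests onto the server
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                await session.select_candles(symbol)
            finally:
                in_flight -= 1
            done += 1

    for _ in range(rounds):
        await asyncio.gather(*(one_query(s) for s in symbols))
    return done, peak


def worker(symbols, rounds, max_in_flight):
    """Process entry point: one event loop and one session per process."""
    return asyncio.run(query_loop(symbols, rounds, max_in_flight))


def run_all(n_procs=4, rounds=3, max_in_flight=10):
    """Split hypothetical symbols across processes; each runs its own loop."""
    symbols = [f"SYM{i}" for i in range(200)]
    chunks = [symbols[i::n_procs] for i in range(n_procs)]
    with mp.Pool(processes=n_procs) as pool:
        return pool.starmap(worker, [(c, rounds, max_in_flight) for c in chunks])
```

`run_all()` would be called from under an `if __name__ == "__main__":` guard. With a layout like this, the total number of concurrent requests hitting the database is roughly the number of processes times `max_in_flight`.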
ScyllaDB logs:
scylla_node | 2024-02-16T18:45:35.215232875Z INFO 2024-02-16 18:45:35,214 [shard 0] reader_concurrency_semaphore - (rate limiting dropped 1713 similar messages) Semaphore _read_concurrency_sem with 100/100 count and 4311954/170750115 memory resources: timed out, dumping permit diagnostics:
scylla_node | 2024-02-16T18:45:35.215255669Z permits count memory table/description/state
scylla_node | 2024-02-16T18:45:35.215258548Z 99 99 4192K catalog.candles/data-query/inactive
scylla_node | 2024-02-16T18:45:35.215260182Z 1 1 19K catalog.candles/data-query/active/used
scylla_node | 2024-02-16T18:45:35.215261757Z 13928 0 0B catalog.candles/data-query/waiting
scylla_node | 2024-02-16T18:45:35.215263195Z
scylla_node | 2024-02-16T18:45:35.215264520Z 14028 100 4211K total
scylla_node | 2024-02-16T18:45:35.215265949Z
scylla_node | 2024-02-16T18:45:35.215267281Z Total: 14028 permits with 100 count and 4211K memory resources
scylla_node | 2024-02-16T18:45:35.215268677Z
scylla_node | 2024-02-16T18:46:05.218802220Z INFO 2024-02-16 18:46:05,218 [shard 0] reader_concurrency_semaphore - (rate limiting dropped 3386 similar messages) Semaphore _read_concurrency_sem with 100/100 count and 2120008/170750115 memory resources: timed out, dumping permit diagnostics:
scylla_node | 2024-02-16T18:46:05.218820925Z permits count memory table/description/state
scylla_node | 2024-02-16T18:46:05.218823246Z 99 99 2048K catalog.candles/data-query/inactive
scylla_node | 2024-02-16T18:46:05.218824812Z 1 1 22K catalog.candles/data-query/active/used
scylla_node | 2024-02-16T18:46:05.218826303Z 7870 0 0B catalog.candles/data-query/waiting
scylla_node | 2024-02-16T18:46:05.218827761Z
scylla_node | 2024-02-16T18:46:05.218829049Z 7970 100 2070K total
scylla_node | 2024-02-16T18:46:05.218837258Z
scylla_node | 2024-02-16T18:46:05.218838734Z Total: 7970 permits with 100 count and 2070K memory resources
scylla_node | 2024-02-16T18:46:05.218840109Z
scylla_node | 2024-02-16T18:46:35.231726618Z INFO 2024-02-16 18:46:35,231 [shard 0] reader_concurrency_semaphore - (rate limiting dropped 3704 similar messages) Semaphore _read_concurrency_sem with 98/100 count and 2080320/170750115 memory resources: timed out, dumping permit diagnostics:
scylla_node | 2024-02-16T18:46:35.231762714Z permits count memory table/description/state
scylla_node | 2024-02-16T18:46:35.231764898Z 97 97 2007K catalog.candles/data-query/inactive
scylla_node | 2024-02-16T18:46:35.231766465Z 1 1 25K catalog.candles/data-query/active/used
scylla_node | 2024-02-16T18:46:35.231767911Z 541 0 0B catalog.candles/data-query/waiting
scylla_node | 2024-02-16T18:46:35.231769333Z
scylla_node | 2024-02-16T18:46:35.231770741Z 639 98 2032K total
scylla_node | 2024-02-16T18:46:35.231772107Z
scylla_node | 2024-02-16T18:46:35.231773430Z Total: 639 permits with 98 count and 2032K memory resources
scylla_node | 2024-02-16T18:46:35.231774848Z
scylla_node | 2024-02-16T18:48:39.657502001Z INFO 2024-02-16 18:48:39,656 [shard 0] reader_concurrency_semaphore - (rate limiting dropped 202 similar messages) Semaphore _read_concurrency_sem with 100/100 count and 2730530/170750115 memory resources: timed out, dumping permit diagnostics:
scylla_node | 2024-02-16T18:48:39.657517001Z permits count memory table/description/state
scylla_node | 2024-02-16T18:48:39.657519135Z 99 99 2643K catalog.candles/data-query/inactive
scylla_node | 2024-02-16T18:48:39.657521110Z 1 1 23K catalog.candles/data-query/active/used
scylla_node | 2024-02-16T18:48:39.657522614Z 13542 0 0B catalog.candles/data-query/waiting
scylla_node | 2024-02-16T18:48:39.657524048Z
scylla_node | 2024-02-16T18:48:39.657525471Z 13642 100 2667K total
scylla_node | 2024-02-16T18:48:39.657526874Z
scylla_node | 2024-02-16T18:48:39.657528218Z Total: 13642 permits with 100 count and 2667K memory resources
scylla_node | 2024-02-16T18:48:39.657529608Z
scylla_node | 2024-02-16T18:49:09.660815001Z INFO 2024-02-16 18:49:09,659 [shard 0] reader_concurrency_semaphore - (rate limiting dropped 3834 similar messages) Semaphore _read_concurrency_sem with 100/100 count and 2490677/170750115 memory resources: timed out, dumping permit diagnostics:
scylla_node | 2024-02-16T18:49:09.660830225Z permits count memory table/description/state
scylla_node | 2024-02-16T18:49:09.660832250Z 96 96 2M catalog.candles/data-query/inactive
scylla_node | 2024-02-16T18:49:09.660833827Z 4 4 446K catalog.candles/data-query/active/used
scylla_node | 2024-02-16T18:49:09.660835255Z 14723 0 0B catalog.candles/data-query/waiting
scylla_node | 2024-02-16T18:49:09.660836701Z
scylla_node | 2024-02-16T18:49:09.660838136Z 14823 100 2432K total
scylla_node | 2024-02-16T18:49:09.660839841Z
scylla_node | 2024-02-16T18:49:09.660841341Z Total: 14823 permits with 100 count and 2432K memory resources
scylla_node | 2024-02-16T18:49:09.660846879Z
scylla_node | 2024-02-16T18:49:39.696854632Z INFO 2024-02-16 18:49:39,695 [shard 0] reader_concurrency_semaphore - (rate limiting dropped 3459 similar messages) Semaphore _read_concurrency_sem with 97/100 count and 2051096/170750115 memory resources: timed out, dumping permit diagnostics:
scylla_node | 2024-02-16T18:49:39.696872216Z permits count memory table/description/state
scylla_node | 2024-02-16T18:49:39.696874342Z 96 96 2M catalog.candles/data-query/inactive
scylla_node | 2024-02-16T18:49:39.696875936Z 1 1 17K catalog.candles/data-query/active/used
scylla_node | 2024-02-16T18:49:39.696877410Z 15861 0 0B catalog.candles/data-query/waiting
scylla_node | 2024-02-16T18:49:39.696878964Z
scylla_node | 2024-02-16T18:49:39.696880298Z 15958 97 2003K total
scylla_node | 2024-02-16T18:49:39.696881662Z
scylla_node | 2024-02-16T18:49:39.696882960Z Total: 15958 permits with 97 count and 2003K memory resources
scylla_node | 2024-02-16T18:49:39.696884320Z
scylla_node | 2024-02-16T18:50:09.710129016Z INFO 2024-02-16 18:50:09,709 [shard 0] reader_concurrency_semaphore - (rate limiting dropped 3087 similar messages) Semaphore _read_concurrency_sem with 97/100 count and 2907535/170750115 memory resources: timed out, dumping permit diagnostics:
scylla_node | 2024-02-16T18:50:09.710183713Z permits count memory table/description/state
scylla_node | 2024-02-16T18:50:09.710185973Z 92 92 1903K catalog.candles/data-query/inactive
scylla_node | 2024-02-16T18:50:09.710187518Z 5 5 936K catalog.candles/data-query/active/used
scylla_node | 2024-02-16T18:50:09.710188918Z 8056 0 0B catalog.candles/data-query/waiting
scylla_node | 2024-02-16T18:50:09.710190339Z
scylla_node | 2024-02-16T18:50:09.710191737Z 8153 97 2839K total
scylla_node | 2024-02-16T18:50:09.710193132Z
scylla_node | 2024-02-16T18:50:09.710194475Z Total: 8153 permits with 97 count and 2839K memory resources
scylla_node | 2024-02-16T18:50:09.710195897Z
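To make the dumps above easier to compare, here is a small sketch that pulls the headline numbers out of one dump. It assumes the log layout shown above and reads the table columns by their header names (permits, count, memory). The function name `summarize` and the regexes are illustrative, not part of any Scylla tooling.

```python
import re

# Matches the summary line: "<used>/<limit> count and <used>/<limit> memory resources".
HEADER_RE = re.compile(
    r"Semaphore (\S+) with (\d+)/(\d+) count and (\d+)/(\d+) memory resources"
)
# Matches a table row: permits, count, memory, table/description/state.
ROW_RE = re.compile(r"^(\d+)\s+(\d+)\s+(\S+)\s+(\S+)$")
# Strips the leading "scylla_node | <timestamp>" prefix from each line.
PREFIX_RE = re.compile(r"^scylla_node \| \S+\s*")

# First dump from the logs above, copied verbatim.
SAMPLE = """\
scylla_node | 2024-02-16T18:45:35.215232875Z INFO 2024-02-16 18:45:35,214 [shard 0] reader_concurrency_semaphore - (rate limiting dropped 1713 similar messages) Semaphore _read_concurrency_sem with 100/100 count and 4311954/170750115 memory resources: timed out, dumping permit diagnostics:
scylla_node | 2024-02-16T18:45:35.215255669Z permits count memory table/description/state
scylla_node | 2024-02-16T18:45:35.215258548Z 99 99 4192K catalog.candles/data-query/inactive
scylla_node | 2024-02-16T18:45:35.215260182Z 1 1 19K catalog.candles/data-query/active/used
scylla_node | 2024-02-16T18:45:35.215261757Z 13928 0 0B catalog.candles/data-query/waiting
scylla_node | 2024-02-16T18:45:35.215263195Z
scylla_node | 2024-02-16T18:45:35.215264520Z 14028 100 4211K total
"""


def summarize(dump):
    """Return count usage, memory usage in percent, and permits per state."""
    summary = {"states": {}}
    for raw in dump.splitlines():
        line = PREFIX_RE.sub("", raw.strip())
        header = HEADER_RE.search(line)
        if header:
            used, limit, mem_used, mem_limit = map(int, header.groups()[1:])
            summary["count"] = (used, limit)
            summary["memory_pct"] = round(100 * mem_used / mem_limit, 1)
            continue
        row = ROW_RE.match(line)
        if row and row.group(4) != "total":
            # Keep only the state part, e.g. "active/used" or "waiting".
            state = row.group(4).split("/", 2)[-1]
            summary["states"][state] = int(row.group(1))
    return summary


result = summarize(SAMPLE)
print(result)
```

For the first dump, this reports the count resource at 100/100, memory at about 2.5% of its limit, and 13928 permits in the waiting state.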
Scylla runs on a single node, and the configuration is simple (--memory=8G, --smp=1). I looked at the load: the server has enough resources. The Scylla container is constantly running at 100+% CPU.
If I run my code in only one process, everything works without errors, and the CPU load is 90+%.
Do I understand correctly that the problem is the use of multiprocessing combined with an incorrect configuration? Do I need to allocate enough CPU that the load on the database does not exceed 100%?