Originally from the User Slack
@Cong_Guo: Hi there, I am trying to remove a node via curl -X POST "http://127.0.0.1:10000/storage_service/remove_node/xxx" on Scylla 6.2.2, but I see the exception below. Do you know why?
{"message": "std::bad_alloc (std::bad_alloc)", "code": 500}
@avi: Probably the node is sick, check its logs
@Cong_Guo: I once tried to decommission a node, but it suddenly triggered restarts on other nodes. Before that I saw warning logs like
...when communicating with 10.0.0.102, to read from mykeyspace.account_id: std::bad_alloc
and I do see warnings pointing at bad collection design, like
WARN 2025-04-25 07:44:19,915 [shard 10: mt] large_data large_data - Writing large collection xxx (645810 bytes)
There are some nodes in DN status, but not a majority. I have no clue whether that caused the bad_alloc when running decommission.
The status output looks something like this (running 6.2.2):
UN 10.0.8.30 138.46 GB 256 ? d79a228d-b2ad-4338-a485-46e1fe2a38a9 b
?N 10.0.8.41 107.75 GB 256 ? 50952491-1f3c-4566-aaf7-ed80d358da84 b
UN 10.0.8.57 122.21 GB 256 ? b675cff3-c8ae-4dac-b790-a50996c97ecf b
?N 10.0.8.61 ? 256 ? 207124c2-6787-4393-a5d9-e384572987ea b
UN 10.0.8.81 123.77 GB 256 ? df14e6c7-042a-471a-bf0b-f672769614d0 b
?N 10.0.8.95 ? 256 ? 0d713634-d335-4f63-9326-eed49ffffe6f b
UN 10.0.8.98 126.55 GB 256 ? 03f0bd4a-d37c-4871-81ae-a96e13bf287c b
UN 10.0.9.134 124.93 GB 256 ? 03adacdf-979c-4805-bb85-9691ffc7a61b b
UN 10.0.9.141 109.86 GB 256 ? 2b36cd1e-2342-40ca-bfff-2cadb4722831 b
UN 10.0.9.159 135.48 GB 256 ? 096107b0-94e9-4da1-bc84-92ec9a26ee07 b
UN 10.0.9.161 129.42 GB 256 ? c067abea-563d-4ddd-95c0-e0edb1b9a298 b
My rough thought is that the bad collection design (non-frozen lists) consumes too much memory, which affects the memory allocations needed for decommission.
@avi: Check if you have large collections or large rows in the system.large_* tables
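These are ordinary CQL tables, so they can be inspected from cqlsh. A minimal sketch; the exact set of large_* tables depends on the Scylla version (system.large_collections is only present on newer releases):
# Largest partitions, rows, cells and collections recorded by the large-data handler.
cqlsh -e "SELECT * FROM system.large_partitions LIMIT 10;"
cqlsh -e "SELECT * FROM system.large_rows LIMIT 10;"
cqlsh -e "SELECT * FROM system.large_cells LIMIT 10;"
cqlsh -e "SELECT * FROM system.large_collections LIMIT 10;"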
@Cong_Guo: Yes, it does.
There are some badly designed schemas, e.g. non-frozen lists.
But I am a little worried that this should not trigger so many nodes to restart, especially not a majority of them.
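For context, a short illustration of the schema pattern in question, using hypothetical keyspace, table, and column names. A non-frozen collection is materialized as a whole on reads and writes, so a list that grows to hundreds of kilobytes (as in the large_data warning above) forces large allocations; modeling the elements as clustering rows lets them be paged instead:
# Hypothetical example of the problematic pattern: an unbounded, non-frozen list per partition.
cqlsh -e "CREATE TABLE mykeyspace.account_events_list (account_id uuid PRIMARY KEY, events list<text>);"
# Sketch of the usual alternative: one clustering row per element, which can be paged and sliced.
cqlsh -e "CREATE TABLE mykeyspace.account_events (account_id uuid, event_time timeuuid, event text, PRIMARY KEY (account_id, event_time));"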
Actually, I see the log below from when decommission was triggered; that should be the cause of the node restarts.
I'm curious why gossip negotiation needs so much memory.
N7seastar12continuationINS_8internal22promise_base_with_typeIvEENS_6futureIvE12finally_bodyIZNS_3smp9submit_toIZNS_7shardedIN3gms8gossiperEE9invoke_onINS_20noncopyable_functionIFS5_RSB_EEEJES5_Qsr3stdE9invocableITL0__RT_DpOTL0_0_EEET1_jNS_21smp_submit_to_optionsEOSJ_DpOT0_EUlvE_EENS_8futurizeINSt13invoke_resultISJ_JEE4typeEE4typeEjSP_SQ_EUlvE_Lb0EEEZNS5_17then_wrapped_nrvoIS5_S12_EENSV_ISJ_E4typeEOT0_EUlOS3_RS12_ONS_12future_stateINS1_9monostateEEEE_vEE
seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::future<void>::finally_body<seastar::with_semaphore<seastar::semaphore_default_exception_factory, gms::gossiper::apply_state_locally(std::map<gms::inet_address, gms::endpoint_state, std::less<gms::inet_address>, std::allocator<std::pair<gms::inet_address const, gms::endpoint_state> > >)::$_0::operator()<gms::inet_address&>(gms::inet_address&) const::{lambda()#1}, std::chrono::_V2::steady_clock>(seastar::basic_semaphore<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>&, unsigned long, gms::gossiper::apply_state_locally(std::map<gms::inet_address, gms::endpoint_state, std::less<gms::inet_address>, std::allocator<std::pair<gms::inet_address const, gms::endpoint_state> > >)::$_0::operator()<gms::inet_address&>(gms::inet_address&) const::{lambda()#1}&&)::{lambda(auto:1)#1}::operator()<seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock> >(seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>)::{lambda()#1}, false>, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::future<void>::finally_body<seastar::with_semaphore<seastar::semaphore_default_exception_factory, gms::gossiper::apply_state_locally(std::map<gms::inet_address, gms::endpoint_state, std::less<gms::inet_address>, std::allocator<std::pair<gms::inet_address const, gms::endpoint_state> > >)::$_0::operator()<gms::inet_address&>(gms::inet_address&) const::{lambda()#1}, std::chrono::_V2::steady_clock>(seastar::basic_semaphore<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>&, unsigned long, gms::gossiper::apply_state_locally(std::map<gms::inet_address, gms::endpoint_state, std::less<gms::inet_address>, std::allocator<std::pair<gms::inet_address const, gms::endpoint_state> > >)::$_0::operator()<gms::inet_address&>(gms::inet_address&) const::{lambda()#1}&&)::{lambda(auto:1)#1}::operator()<seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock> >(seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>)::{lambda()#1}, false> >(gms::gossiper::apply_state_locally(std::map<gms::inet_address, gms::endpoint_state, std::less<gms::inet_address>, std::allocator<std::pair<gms::inet_address const, gms::endpoint_state> > >)::$_0::operator()<gms::inet_address&>(gms::inet_address&) const::{lambda()#1}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::future<void>::finally_body<seastar::with_semaphore<seastar::semaphore_default_exception_factory, gms::gossiper::apply_state_locally(std::map<gms::inet_address, gms::endpoint_state, std::less<gms::inet_address>, std::allocator<std::pair<gms::inet_address const, gms::endpoint_state> > >)::$_0::operator()<gms::inet_address&>(auto:1&&) const::{lambda()#1}, std::chrono::_V2::steady_clock>(seastar::basic_semaphore<auto:1, auto:3>&, unsigned long, auto:2&&)::{lambda(auto:1)#1}::operator()<seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock> >(auto:1)::{lambda()#1}, false>&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>
INFO 2025-04-24 00:58:19,024 [shard 0: gms] storage_service - handle_state_normal: endpoint=10.5.212.183 == current_owner=10.5.212.183 token -1274646323398053987
seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::future<void>::finally_body<seastar::with_semaphore<seastar::semaphore_default_exception_factory, gms::gossiper::apply_state_locally(std::map<gms::inet_address, gms::endpoint_state, std::less<gms::inet_address>, std::allocator<std::pair<gms::inet_address const, gms::endpoint_state> > >)::$_0::operator()<gms::inet_address&>(gms::inet_address&) const::{lambda()#1}, std::chrono::_V2::steady_clock>(seastar::basic_semaphore<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>&, unsigned long, gms::gossiper::apply_state_locally(std::map<gms::inet_address, gms::endpoint_state, std::less<gms::inet_address>, std::allocator<std::pair<gms::inet_address const, gms::endpoint_state> > >)::$_0::operator()<gms::inet_address&>(gms::inet_address&) const::{lambda()#1}&&)::{lambda(auto:1)#1}::operator()<seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock> >(seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>)::{lambda()#1}, false>, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::future<void>::finally_body<seastar::with_semaphore<seastar::semaphore_default_exception_factory, gms::gossiper::apply_state_locally(std::map<gms::inet_address, gms::endpoint_state, std::less<gms::inet_address>, std::allocator<std::pair<gms::inet_address const, gms::endpoint_state> > >)::$_0::operator()<gms::inet_address&>(gms::inet_address&) const::{lambda()#1}, std::chrono::_V2::steady_clock>(seastar::basic_semaphore<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>&, unsigned long, gms::gossiper::apply_state_locally(std::map<gms::inet_address, gms::endpoint_state, std::less<gms::inet_address>, std::allocator<std::pair<gms::inet_address const, gms::endpoint_state> > >)::$_0::operator()<gms::inet_address&>(gms::inet_address&) const::{lambda()#1}&&)::{lambda(auto:1)#1}::operator()<seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock> >(seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>)::{lambda()#1}, false> >(gms::gossiper::apply_state_locally(std::map<gms::inet_address, gms::endpoint_state, std::less<gms::inet_address>, std::allocator<std::pair<gms::inet_address const, gms::endpoint_state> > >)::$_0::operator()<gms::inet_address&>(gms::inet_address&) const::{lambda()#1}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::future<void>::finally_body<seastar::with_semaphore<seastar::semaphore_default_exception_factory, gms::gossiper::apply_state_locally(std::map<gms::inet_address, gms::endpoint_state, std::less<gms::inet_address>, std::allocator<std::pair<gms::inet_address const, gms::endpoint_state> > >)::$_0::operator()<gms::inet_address&>(auto:1&&) const::{lambda()#1}, std::chrono::_V2::steady_clock>(seastar::basic_semaphore<auto:1, auto:3>&, unsigned long, auto:2&&)::{lambda(auto:1)#1}::operator()<seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock> >(auto:1)::{lambda()#1}, false>&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>
seastar::coroutine::parallel_for_each<gms::gossiper::apply_state_locally(std::map<gms::inet_address, gms::endpoint_state, std::less<gms::inet_address>, std::allocator<std::pair<gms::inet_address const, gms::endpoint_state> > >)::$_0>
seastar::internal::coroutine_traits_base<void>::promise_type
seastar::internal::coroutine_traits_base<void>::promise_type
seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::future<void>::finally_body<seastar::internal::invoke_func_with_gate<gms::gossiper::background_msg(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::noncopyable_function<seastar::future<void> (gms::gossiper&)>)::$_0>(seastar::gate::holder&&, gms::gossiper::background_msg(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::noncopyable_function<seastar::future<void> (gms::gossiper&)>)::$_0&&)::{lambda()#1}, false>, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::future<void>::finally_body<seastar::internal::invoke_func_with_gate<gms::gossiper::background_msg(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::noncopyable_function<seastar::future<void> (gms::gossiper&)>)::$_0>(seastar::gate::holder&&, gms::gossiper::background_msg(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::noncopyable_function<seastar::future<void> (gms::gossiper&)>)::$_0&&)::{lambda()#1}, false> >(seastar::future<void>::finally_body<seastar::internal::invoke_func_with_gate<gms::gossiper::background_msg(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::noncopyable_function<seastar::future<void> (gms::gossiper&)>)::$_0>(seastar::gate::holder&&, gms::gossiper::background_msg(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::noncopyable_function<seastar::future<void> (gms::gossiper&)>)::$_0&&)::{lambda()#1}, false>&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::future<void>::finally_body<seastar::internal::invoke_func_with_gate<gms::gossiper::background_msg(seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::noncopyable_function<seastar::future<void> (gms::gossiper&)>)::$_0>(seastar::gate::holder&&, auto:1&&)::{lambda()#1}, false>&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>
seastar::internal::coroutine_traits_base<void>::promise_type
N7seastar12continuationINS_8internal22promise_base_with_typeIvEENS_6futureIvE12finally_bodyIZNS_3smp9submit_toIZNS_7shardedIN3gms8gossiperEE9invoke_onINS_20noncopyable_functionIFS5_RSB_EEEJES5_Qsr3stdE9invocableITL0__RT_DpOTL0_0_EEET1_jNS_21smp_submit_to_optionsEOSJ_DpOT0_EUlvE_EENS_8futurizeINSt13invoke_resultISJ_JEE4typeEE4typeEjSP_SQ_EUlvE_Lb0EEEZNS5_17then_wrapped_nrvoIS5_S12_EENSV_ISJ_E4typeEOT0_EUlOS3_RS12_ONS_12future_stateINS1_9monostateEEEE_vEE
N7seastar12continuationINS_8internal22promise_base_with_typeIvEEZNS_6futureIvE16handle_exceptionIZZN3gms8gossiper14background_msgENS_13basic_sstringIcjLj15ELb1EEENS_20noncopyable_functionIFS5_RS8_EEEEN3$_0clEvEUlT_E_Qoooooosr3stdE16is_invocable_r_vINS4_ISG_EETL0__NSt15__exception_ptr13exception_ptrEEaaeqsr3stdE12tuple_size_vINSt11conditionalIXsr3stdE9is_same_vINS1_18future_stored_typeIJSG_EE4typeENS1_9monostateEEESt5tupleIJEESS_IJSQ_EEE4typeEELi0Esr3stdE16is_invocable_r_vIvSK_SM_Eaaeqsr3stdE12tuple_size_vISW_ELi1Esr3stdE16is_invocable_r_vISG_SK_SM_Eaagtsr3stdE12tuple_size_vISW_ELi1Esr3stdE16is_invocable_r_vISW_SK_SM_EEES5_OSG_EUlSX_E_ZNS5_17then_wrapped_nrvoIS5_SY_EENS_8futurizeISG_E4typeEOT0_EUlOS3_RSY_ONS_12future_stateISR_EEE_vEE
seastar::internal::coroutine_traits_base<void>::promise_type
ERROR 2025-04-24 00:58:19,030 [shard 0: gms] gossip - Gossip change listener failed: std::bad_alloc (std::bad_alloc), at: 0x617db0e 0x617e120 0x617e428 0x5c4276e 0x5c42927 0x3c06ad3 0x142301a 0x5c806ff 0x5c81c7a 0x5c82e57 0x5c82208 0x5c12ff3 0x5c12353 0x13ccfe5 0x13ce9a0 0x13cb403 /opt/scylladb/libreloc/libc.so.6+0x2a087 /opt/scylladb/libreloc/libc.so.6+0x2a14a 0x13c8a84
0x142301a
Aborting on shard 0, in scheduling group gossip.
@avi: It’s likely https://github.com/scylladb/scylladb/issues/23781
GitHub: managed_bytes violates preferred contiguous allocation size · Issue #23781 · scylladb/scylladb
@Cong_Guo: I guess the problem below is caused by the same issue?
https://scylladb-users.slack.com/archives/C2NLNBXLN/p1745586119428079
raft::state_machine_error
NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: 10.5.211.160:9042 ldc1>: <Error from server: code=0000 [Server error] message="Raft instance is stopped, reason: "background error, std::_Nested_exception<raft::state_machine_error> (State machine error at raft/server.cc:1360): std::bad_alloc (std::bad_alloc)"">})
I am afraid that triggering any topology change will bring the cluster down, since the Raft module doesn't seem to work because of the bad_alloc.
[April 25th, 2025 6:01 AM] cong.guo: Hi, do you know what this error means when trying to change the replication factor of system_auth?
@avi: Very likely
@Cong_Guo: Just curious: if this error means the Raft instance is stopped, why is the process still running? Will the Raft instance recover and start running again later? If I stop reads/writes to free up more room in non-LSA memory, is there a chance the allocation will succeed for a while?
message="Raft instance is stopped, reason: "background error, std::_Nested_exception<raft::state_machine_error> (State machine error at raft/server.cc:1360): std::bad_alloc (std::bad_alloc)"">}
@avi: Don’t know enough about it