Too many in flight hints: 10493465

Hi,

I saw a “Too many in flight hints: 10493465” error today. It was triggered by a 5-minute write peak.

I was wondering if it would be possible to increase max_size_of_hints_in_progress from 10MB to 1000MB and even throttle hints replay.
After the 5 minute peak there is plenty of time to catch up on hints replay. For me it would be no problem if hints replay takes a long time. But I would like to avoid any errors during the peak.

Is there any downside to increasing max_size_of_hints_in_progress?
Since hints are stored on disk, why is there even such a low hints limit?

cheers,
Christian

I think in-flight hints are hints that are waiting to be written to disk. If so, these in-flight hints are held in memory. Correct?

Hints are kept on disk.

The obvious downside of raising max_size_of_hints_in_progress is that hints will take up more disk space. In general you should not rely on hints for consistency. It is better to run repair regularly and have hints only cover temporary hiccups.

Thanks for the response.

Are you sure that max_size_of_hints_in_progress applies to the hint size on disk? I think it really means the size of hints that are currently being persisted locally (according to my limited understanding of manager::end_point_hints_manager::store_hint). Please correct me if I have misread this code:

bool manager::end_point_hints_manager::store_hint(schema_ptr s, lw_shared_ptr<const frozen_mutation> fm, tracing::trace_state_ptr tr_state) noexcept {
    try {
        // Future is waited on indirectly in `stop()` (via `_store_gate`).
        (void)with_gate(_store_gate, [this, s = std::move(s), fm = std::move(fm), tr_state] () mutable {
            ++_hints_in_progress;
            size_t mut_size = fm->representation().size();
            shard_stats().size_of_hints_in_progress += mut_size;

            return with_shared(file_update_mutex(), [this, fm, s, tr_state] () mutable -> future<> {
                return get_or_load().then([this, fm = std::move(fm), s = std::move(s), tr_state] (hints_store_ptr log_ptr) mutable {
                    commitlog_entry_writer cew(s, *fm, db::commitlog::force_sync::no);
                    return log_ptr->add_entry(s->id(), cew, db::timeout_clock::now() + _shard_manager.hint_file_write_timeout);
                }).then([this, tr_state] (db::rp_handle rh) {
                    auto rp = rh.release();
                    if (_last_written_rp < rp) {
                        _last_written_rp = rp;
                        manager_logger.debug("[{}] Updated last written replay position to {}", end_point_key(), rp);
                    }
                    ++shard_stats().written;

                    manager_logger.trace("Hint to {} was stored", end_point_key());
                    tracing::trace(tr_state, "Hint to {} was stored", end_point_key());
                }).handle_exception([this, tr_state] (std::exception_ptr eptr) {
                    ++shard_stats().errors;

                    manager_logger.debug("store_hint(): got the exception when storing a hint to {}: {}", end_point_key(), eptr);
                    tracing::trace(tr_state, "Failed to store a hint to {}: {}", end_point_key(), eptr);
                });
            }).finally([this, mut_size, fm, s] {
                --_hints_in_progress;
                shard_stats().size_of_hints_in_progress -= mut_size;
            });
        });
    } catch (...) {
        manager_logger.trace("Failed to store a hint to {}: {}", end_point_key(), std::current_exception());
        tracing::trace(tr_state, "Failed to store a hint to {}: {}", end_point_key(), std::current_exception());

        ++shard_stats().dropped;
        return false;
    }
    return true;
}

I was getting this error on the client side of my write operations. Since hints are not reliable anyway, I would expect them to be silently dropped instead of making writes fail. The code throws an overloaded_exception, which is pretty harsh:

    if (cannot_hint(all, type)) {
        get_stats().writes_failed_due_to_too_many_in_flight_hints++;
        // avoid OOMing due to excess hints.  we need to do this check even for "live" nodes, since we can
        // still generate hints for those if it's overloaded or simply dead but not yet known-to-be-dead.
        // The idea is that if we have over maxHintsInProgress hints in flight, this is probably due to
        // a small number of nodes causing problems, so we should avoid shutting down writes completely to
        // healthy nodes.  Any node with no hintsInProgress is considered healthy.
        throw overloaded_exception(_hints_manager.size_of_hints_in_progress());
    }

Can’t the hint be simply dropped if it exceeds the threshold?
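
Roughly what I have in mind, as a minimal sketch rather than a patch: the names `hint_overflow_policy`, `overloaded_error` and `admit_hint` are made up for illustration and do not exist in the Scylla code quoted above. With a `drop_hint` policy the caller would skip the hint and let the write proceed; `fail_write` mirrors what happens today.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical policy switch; nothing like this exists in the quoted code.
enum class hint_overflow_policy { fail_write, drop_hint };

// Stand-in for the real overloaded_exception, so the sketch compiles on its own.
struct overloaded_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Returns true if the hint may be stored, false if it should be dropped.
// Throws only under fail_write, once the in-flight bytes exceed the limit.
inline bool admit_hint(std::size_t in_flight_bytes, std::size_t limit,
                       hint_overflow_policy policy) {
    if (in_flight_bytes <= limit) {
        return true;   // below the threshold: store the hint as usual
    }
    if (policy == hint_overflow_policy::drop_hint) {
        return false;  // over the threshold: drop the hint, keep the write alive
    }
    throw overloaded_error("Too many in flight hints: " +
                           std::to_string(in_flight_bytes));
}
```

A dropped hint would presumably still need to be counted somewhere (like the `dropped` counter in store_hint above), so that the condition stays visible in the metrics.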

Indeed, you are right: max_size_of_hints_in_progress seems to account only for in-memory hints that are in the process of being written to disk. This is currently a constant in the code, and changing it is not a good idea. To be honest, I’m even surprised it is a constant value instead of some percentage of memory (the usual way we express memory limits).
If you are hitting this limit, your disk might have trouble keeping up with the rate of incoming hints.
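
To make the accounting concrete, here is a simplified sketch of the idea, not the actual implementation. The class `in_flight_tracker` and its method names are hypothetical, and the strict `>` comparison and the 10 * 1024 * 1024 byte reading of "10MB" are my assumptions. The counter grows when a hint starts being written and shrinks when the write finishes, so it only ever reflects hints still held in memory.

```cpp
#include <cstddef>

// Hypothetical stand-in for the per-shard accounting; not Scylla's real class.
class in_flight_tracker {
    std::size_t _bytes = 0;  // bytes of hints currently being written to disk
    std::size_t _limit;      // plays the role of max_size_of_hints_in_progress
public:
    explicit in_flight_tracker(std::size_t limit) : _limit(limit) {}

    // Called when a hint starts being written (cf. the += in store_hint).
    void begin_write(std::size_t mut_size) { _bytes += mut_size; }

    // Called when the write finishes, successfully or not (cf. the finally block).
    void end_write(std::size_t mut_size) { _bytes -= mut_size; }

    std::size_t bytes_in_progress() const { return _bytes; }

    // Assumed comparison: true once the pending bytes exceed the limit.
    bool too_many_in_flight() const { return _bytes > _limit; }
};
```

Under these assumptions the 10493465 bytes from the error message are just above a 10485760-byte limit, and the condition clears again as soon as the pending writes complete.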

I was getting this error on the client side of my write-operations. Since hints are not reliable anyway, I would expect them to be silently dropped, instead of making writes fail.

I agree, it doesn’t make sense. I seem to remember this being heatedly discussed in the past. I suggest opening an issue about it and discussing it further there.

Thanks, ticket created: Too many in_flight_hints should not cause overloaded-exception · Issue #13383 · scylladb/scylladb · GitHub

Let’s hope for a heated discussion :wink: