Hi, we ran into a problem where the consistency level is unexpectedly downgraded under stress tests.
Problem: our client code never sets CL = Serial, yet we get the error: Cassandra.InvalidQueryException: SERIAL is not supported as conditional update commit consistency. Use ANY if you mean "make sure it is accepted but I don't care how many replicas commit it for non-SERIAL reads"
Initial setup:
Cluster configuration:
6 nodes, RF = 3, CL = LocalQuorum
    var builder = Cluster
        .Builder()
        ...
        .WithRetryPolicy(new DefaultRetryPolicyWithConfiguredWriteTimeoutAttempts(3))
        .WithExecutionProfiles("default",
            profile => profile.WithConsistencyLevel(
                ConsistencyLevel.LocalQuorum));
DefaultRetryPolicyWithConfiguredWriteTimeoutAttempts code:
    public sealed class DefaultRetryPolicyWithConfiguredWriteTimeoutAttempts : IExtendedRetryPolicy, IRetryPolicy
    {
        private readonly DefaultRetryPolicy m_defaultRetryPolicy = new();
        private readonly int m_maxWriteAttemptsCount;

        public DefaultRetryPolicyWithConfiguredWriteTimeoutAttempts(
            int maxWriteAttemptsCount)
        {
            Require.Positive(maxWriteAttemptsCount, nameof(maxWriteAttemptsCount));
            m_maxWriteAttemptsCount = maxWriteAttemptsCount;
        }

        public RetryDecision OnReadTimeout(
            IStatement query,
            ConsistencyLevel cl,
            int requiredResponses,
            int receivedResponses,
            bool dataRetrieved,
            int nbRetry)
        {
            return m_defaultRetryPolicy.OnReadTimeout(query, cl, requiredResponses, receivedResponses, dataRetrieved, nbRetry);
        }

        public RetryDecision OnWriteTimeout(
            IStatement query,
            ConsistencyLevel cl,
            string writeType,
            int requiredAcks,
            int receivedAcks,
            int nbRetry)
        {
            return nbRetry >= m_maxWriteAttemptsCount
                ? RetryDecision.Rethrow()
                : RetryDecision.Retry(cl);
        }

        public RetryDecision OnUnavailable(
            IStatement query,
            ConsistencyLevel cl,
            int requiredReplica,
            int aliveReplica,
            int nbRetry)
        {
            return m_defaultRetryPolicy.OnUnavailable(query, cl, requiredReplica, aliveReplica, nbRetry);
        }

        public RetryDecision OnRequestError(
            IStatement statement,
            Configuration config,
            Exception ex,
            int nbRetry)
        {
            return m_defaultRetryPolicy.OnRequestError(statement, config, ex, nbRetry);
        }
    }
Driver: Cassandra C# driver v3.18.0
Problem details: after 2 minutes of stress testing, we saw errors like:
Server failure during write query at consistency SERIAL (1 responses were required but only 0 replica responded, 1 failed)
Cassandra.InvalidQueryException: SERIAL is not supported as conditional update commit consistency. Use ANY if you mean "make sure it is accepted but I don't care how many replicas commit it for non-SERIAL reads"
Known issue: all nodes except one were unavailable because of network connectivity errors:
dbug: Cassandra.Connections.Connection[0]
Disposing Connection #59109011 to 10.51.0.6:9042.
info: Cassandra.Connections.HostConnectionPool[0]
Connection to 10.51.0.6:9042 could not be created: System.Net.Sockets.SocketException (0x80004005): A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Cassandra.Connections.TcpSocket.<Connect>d__34.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Cassandra.Connections.Connection.<DoOpen>d__85.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Cassandra.Connections.Connection.<Open>d__84.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Cassandra.Connections.HostConnectionPool.<DoCreateAndOpen>d__46.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Cassandra.Connections.HostConnectionPool.<CreateOpenConnection>d__60.MoveNext()
warn: Cassandra.Host[0]
Host 10.51.0.6:9042 considered as DOWN.
Could you help us understand why the Scylla/Cassandra driver changed the consistency level to SERIAL in this situation?