Originally from the User Slack
@Marcondes_Viana_de_Oliveira_Junior: Hey, one question. I’m connecting to a 3-node cluster, with a keyspace using a replication factor of 3 and a consistency level of LOCAL_QUORUM. I intentionally killed one node to test and got some logs: Cannot achieve consistency level for cl LOCAL_ONE. Requires 1, alive 0
The other two nodes seem alive, though. When I connect to the cluster I pass all the IPs.
@Karol_Baryła: How are nodes distributed between DCs and what is your exact replication strategy?
@Marcondes_Viana_de_Oliveira_Junior: Only one DC
NetworkTopologyStrategy
I’m using Go (golang) and I set this on the client:
fallback := gocql.RoundRobinHostPolicy()
c.PoolConfig.HostSelectionPolicy = gocql.TokenAwareHostPolicy(fallback)
Can it lock onto a single host, so that it doesn’t try the other hosts?
Also, I just tried to recreate the connection and it fails.
It fails to start even though I pass 3 hosts and only one of them is down; it doesn’t connect.
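For context, a minimal gocql setup along these lines looks roughly like the sketch below; the contact points, keyspace name, and consistency level are placeholders standing in for the cluster described above, not the actual application code:

package main

import (
	"log"

	"github.com/gocql/gocql"
)

func main() {
	// Placeholder contact points for the three nodes mentioned above.
	cluster := gocql.NewCluster("x.x.x.1", "x.x.x.2", "x.x.x.3")
	cluster.Keyspace = "my_keyspace" // placeholder keyspace name
	cluster.Consistency = gocql.LocalQuorum

	// Token-aware routing with a round-robin fallback, as in the snippet above.
	fallback := gocql.RoundRobinHostPolicy()
	cluster.PoolConfig.HostSelectionPolicy = gocql.TokenAwareHostPolicy(fallback)

	session, err := cluster.CreateSession()
	if err != nil {
		log.Fatalf("failed to create session: %v", err)
	}
	defer session.Close()
}

With multiple contact points like this, the driver can establish its control connection through any live node, so a single dead node should not by itself prevent startup.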
@avi: Check if you misspelled the datacenter name on the client side
@Marcondes_Viana_de_Oliveira_Junior:
unable to discover protocol version: dial tcp x.x.x.x:9042: connect: connection refused
IP is correct.
It’s the IP of the node I killed, even though I pass 3 IPs to the client connection.
The other 2 IPs are live.
DC is correct
unable to discover protocol version: Cannot achieve consistency level for cl LOCAL_ONE. Requires 1, alive 0
We have 2 alive and one dead, and the IPs are correct.
The DC is correct too.
I added some logging to the library, and it fails to get the protocol version:
host ["x.x.x.2"]: Cannot achieve consistency level for cl LOCAL_ONE. Requires 1, alive 0
host ["x.x.x.1"]: dial tcp x.x.x.1:9042: connect: connection refused
host ["x.x.x.3"]: Cannot achieve consistency level for cl LOCAL_ONE. Requires 1, alive 0
It only logs the latest error; that’s why I was seeing different errors. Now it’s clearer, but I still don’t understand it.
@avi: Double-check the replication factor
@Marcondes_Viana_de_Oliveira_Junior: It’s 3
@avi: I don’t have an explanation then. If a node was alive enough to respond, it is alive enough to be a replica.
Are you using tablets?
@Marcondes_Viana_de_Oliveira_Junior: Not sure. I asked the person responsible for the deployment.
$ nodetool status -h x.x.x.2 --keyspace my_keyspace
Datacenter: DTC3
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load       Tokens  Owns (effective)  Host ID  Rack
DN  x.x.x.1  793.81 MB  256     100.0%            <UUID>   rack-01
UN  x.x.x.2  787.60 MB  256     100.0%            <UUID>   rack-01
UN  x.x.x.3  759.51 MB  256     100.0%            <UUID>   rack-01
I can perform queries using cqlsh.
But the Go client can’t even connect; it fails on this protocol version request.
@avi: Then the problem is somewhere in the client or networking
@Marcondes_Viana_de_Oliveira_Junior: Ok, I will keep digging
@avi: Wireshark may help
@Marcondes_Viana_de_Oliveira_Junior: One user can log in, the other can’t. When I try to list the roles from cqlsh I get:
my_user@cqlsh> list roles;
NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: x.x.x.2:9042 DC1>: Unavailable('Error from server: code=1000 [Unavailable exception] message="Cannot achieve consistency level for cl QUORUM. Requires 1, alive 0" info={\'consistency\': \'QUORUM\', \'required_replicas\': 1, \'alive_replicas\': 0}')})
The only difference between the users is that one can only read and the other can read/modify.
@avi: What version are you running?
@Marcondes_Viana_de_Oliveira_Junior:
I guess this is the problem: the auth data is spread across the nodes
Version: 6.0.2-0.20240703.c9cd171f426e
@avi: But in 6.0 auth was moved to raft
@Marcondes_Viana_de_Oliveira_Junior: I got the version from nodetool status --version
Wrong place?
@avi: no, it’s the right place
You can fix the problem by increasing the replication factor of system_auth and running repair (this is documented), but it should be in raft in 6.0
@Marcondes_Viana_de_Oliveira_Junior: Is there a place where raft is configurable, so I can check whether it’s on or off?
@avi: It’s automatic
https://github.com/scylladb/scylladb/commit/19b816bb68292b2a5ff7d8e8ec374ceb0d5ed85e
GitHub: Merge ‘Migrate system_auth to raft group0’ from Marcin Maliszkiewicz · scylladb/scylladb@19b816b
@Marcondes_Viana_de_Oliveira_Junior: 
authenticator: PasswordAuthenticator
authorizer: CassandraAuthorizer
This is the current configuration
@avi: Was this cluster upgraded, or did it begin life as 6.0?
@Marcondes_Viana_de_Oliveira_Junior: I need to ask the DevOps guy about this; he’s not available now. But my bet is that it was not upgraded.
@avi: Well you can fix it with ALTER KEYSPACE + repair
@Marcondes_Viana_de_Oliveira_Junior: Sure
Appreciate you. Tomorrow I will try the fix and let you know!
nodetool repair system_auth -full
Right?
@avi: Yes
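For reference, the documented fix looks roughly like this; the DC name DTC3 and replication factor 3 are taken from the nodetool status output above and should be adjusted to the actual cluster. Since system_auth apparently still had a replication factor of 1, the credentials for a given user could live only on the downed node, which would explain the "Requires 1, alive 0" errors during authentication:

cqlsh> ALTER KEYSPACE system_auth WITH replication = {'class': 'NetworkTopologyStrategy', 'DTC3': 3};

Then run a full repair of the keyspace on each node:

$ nodetool repair -full system_auth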
@Marcondes_Viana_de_Oliveira_Junior: Thanks again
You were right. It was born in 5 and migrated to 6. Is there a tool/tutorial to migrate auth from system_auth to raft?
We found the docs, Thanks!
Worked, thank you!!