Long connection times, Connection timeout using Rust driver, connection ports with Docker

Guy · May 6, 2025, 8:04am

Originally from the User Slack

@Mattia: Hello everyone
I am trying ScyllaDB using the official rust driver. I have spawned 3 nodes in a cluster using docker compose and I build the Session from rust code.
The problem I’m having is that the function that builds the connection blocks execution until the timeout specified with .connection_timeout(Duration::from_secs(..)) runs out, even if the connection is fine. For example, if I set the timeout to 30 seconds, I have to wait exactly 30 seconds before getting a successful connection to the db.

Is this the expected behavior? Thanks

@Karol_Baryła: Definitely not expected behavior. Could you share what your OS is, how exactly you are setting up the cluster, and how you are connecting to it?

@Mattia: Sure, thanks for the reply.

System:
Linux 6.14.3-zen1-1-zen x86-64 (Arch Linux)
docker compose:
services:
  scylla1:
    image: scylladb/scylla
    container_name: scylla1
    command: --smp 1
    ports:
      - "9042:9042"

  scylla2:
    image: scylladb/scylla
    container_name: scylla2
    command: --smp 1 --seeds=scylla1

  scylla3:
    image: scylladb/scylla
    container_name: scylla3
    command: --smp 1 --seeds=scylla1

I’m using docker-compose up -d with podman as the backend.

In Rust:
async fn db_connect(conf: DBConf) -> Result<Session> {
    let known_nodes = conf.nodes();

    let session: Session = SessionBuilder::new()
        .known_nodes(known_nodes)
        .connection_timeout(Duration::from_secs(30))
        .compression(Some(Compression::Lz4))
        .build()
        .await?;

    Ok(session)
}

With tracing I see that it waits for x seconds before exiting the function, where x is the value passed to connection_timeout.
Also, it is a debug build.

@Karol_Baryła: The problem is most likely that scylla2 and scylla3 are listening on localhost (in their own network namespaces) and broadcasting this address, so the driver can’t connect to them. For such a Docker setup you need to set the rpc address and listen address (and remove the “ports” section from scylla1). See for example the docker compose file from the Rust Driver tests: https://github.com/scylladb/scylla-rust-driver/blob/main/test/cluster/docker-compose.yml


@Mattia: Thank you, I will look into it. I got the docker compose file from the Docker Hub page, but what you are saying makes sense.
Just as a note: I’m only supplying “127.0.0.1:9042” as an address to the SessionBuilder, so I thought it was sufficient to expose the port of one of the containers and everything would work.

@Karol_Baryła: It would indeed allow the driver to connect to this one node. However, the driver tries to connect to each node (each shard, in fact). The connection process is roughly as follows:
• One of the provided contact points is chosen for the control connection
• The driver tries to connect to it
• If successful, the driver then fetches information about the cluster from system tables using this connection.
• This allows the driver to learn about other nodes, to which it then tries to connect (this is the step that is hanging in your case)
• After connecting, the Session is returned.
If you want the driver to really connect only to this single node (which is usually a very bad idea, and only makes sense for tools like cqlsh) you could use https://docs.rs/scylla/latest/scylla/policies/host_filter/trait.HostFilter.html ( https://docs.rs/scylla/latest/scylla/client/session_builder/type.SessionBuilder.html#method.host_filter ) to prevent the driver from connecting to other nodes.
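
Editor's note: the hanging step can be checked independently of the driver. The sketch below, using only the Rust standard library, attempts a plain TCP connection to each address with a short timeout, which is a rough stand-in for whether the driver could reach a node at the address it advertises. The helper name `probe` and the second address are illustrative assumptions, not taken from the driver or from this setup.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Returns true if a TCP connection to `addr` succeeds within `timeout`.
/// `probe` is a hypothetical helper, not part of the ScyllaDB driver.
fn probe(addr: &str, timeout: Duration) -> bool {
    match addr.parse::<SocketAddr>() {
        Ok(sock) => TcpStream::connect_timeout(&sock, timeout).is_ok(),
        Err(_) => false, // not an "ip:port" literal
    }
}

fn main() {
    // Example addresses (assumption): replace them with the addresses
    // your nodes actually advertise.
    let nodes = ["127.0.0.1:9042", "172.42.0.3:9042"];
    for node in nodes {
        let ok = probe(node, Duration::from_secs(2));
        println!("{node}: {}", if ok { "reachable" } else { "unreachable" });
    }
}
```

Run it from the same host as the application. An address that shows as unreachable here is one the driver would presumably also fail to reach, which would be consistent with the connect step waiting for the full connection_timeout.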


@Mattia: Thank you so much for the explanation and your time. I will later try with the compose you have linked and see how it goes.
For some reason the scylla1 node is reported as unhealthy when I try to bring up the cluster. Any idea why?

@Karol_Baryła: Did you remove the ports section from scylla1?

@Mattia: Yes, I have completely removed the first file

@Karol_Baryła: Can you show me the full docker compose?

@Mattia: Sure:
networks:
  public:
    name: scylla_node
    driver: bridge
    ipam:
      driver: default
      config:
        - subnet: 172.42.0.0/16
services:
  node-scylla1:
    image: scylladb/scylla
    networks:
      public:
        ipv4_address: 172.42.0.2
    container_name: node-scylla1
    command: |
      --rpc-address 172.42.0.2
      --listen-address 172.42.0.2
      --seeds 172.42.0.2
      --skip-wait-for-gossip-to-settle 0
      --ring-delay-ms 0
      --smp 2
      --memory 1G
    healthcheck:
      test: [ "CMD", "cqlsh", "node-scylla1", "-e", "select * from system.local WHERE key='local'" ]
      interval: 5s
      timeout: 5s
      retries: 60
  node-scylla2:
    image: scylladb/scylla
    networks:
      public:
        ipv4_address: 172.42.0.3
    container_name: node-scylla2
    command: |
      --rpc-address 172.42.0.3
      --listen-address 172.42.0.3
      --seeds 172.42.0.2
      --skip-wait-for-gossip-to-settle 0
      --ring-delay-ms 0
      --smp 2
      --memory 1G
    healthcheck:
      test: [ "CMD", "cqlsh", "node-scylla2", "-e", "select * from system.local WHERE key='local'" ]
      interval: 5s
      timeout: 5s
      retries: 60
    depends_on:
      node-scylla1:
        condition: service_healthy
  node-scylla3:
    image: scylladb/scylla
    networks:
      public:
        ipv4_address: 172.42.0.4
    container_name: node-scylla3
    command: |
      --rpc-address 172.42.0.4
      --listen-address 172.42.0.4
      --seeds 172.42.0.2,172.42.0.3
      --skip-wait-for-gossip-to-settle 0
      --ring-delay-ms 0
      --smp 2
      --memory 1G
    healthcheck:
      test: [ "CMD", "cqlsh", "node-scylla3", "-e", "select * from system.local WHERE key='local'" ]
      interval: 5s
      timeout: 5s
      retries: 60
    depends_on:
      node-scylla2:
        condition: service_healthy

Guy · May 7, 2025, 3:25am

@Mattia: Hello, I am trying to make a contribution to the ScyllaDB Rust driver, but I am unable to start the cluster to test the changes I have made.
I am using the Makefile and running make test or make ci, but both docker and podman hang indefinitely (I think on the healthcheck of the first container).

Can anyone help me with this? (this is the docker compose used)

GitHub: scylla-rust-driver/test/cluster/docker-compose.yml at 00856129603b39676c9a31c26626d0ad5ec19c68 · scylladb/scylla-rust-driver

@Karol_Baryła: Sorry for leaving you without a response in the previous thread! Tbh this cluster sometimes (but rarely) also fails to start for me. In that case it is usually enough to try again (make down && make up). I never really investigated why that happens. You can check the logs of this node with docker logs cluster-scylla1-1 and link them here (use pastebin / gist, or send a file); maybe we’ll find something there.

@Mattia: Unfortunately I have tried restarting it many times. I will try to share the logs now.
Here are the log files from the container:
stdout.txt (2.9 KB)
stderr.txt (109.1 KB)

Even though the job stops after 355 seconds because scylla1 is reported as unhealthy, running nodetool status gives the following:
docker exec -it cluster-scylla1-1 nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address    Load      Tokens Owns Host ID                              Rack
UN 172.42.0.2 269.25 KB 256    ?    5efe6b87-70b0-46a8-a92f-978e7067d027 rack1

@Karol_Baryła: I have no idea why it doesn’t start. Scylla logs look fine I think. What happens if you execute the healthcheck command inside the container manually?

@Mattia: Doing

docker exec -it cluster-scylla-1 /bin/bash

cqlsh -e "select * from system.local WHERE key='local'"
Returns a row correctly.
Do you think I should run make test and the other targets with root privileges?

@Karol_Baryła: Definitely without root. docker inspect --format "{{json .State.Health }}" cluster-scylla-1 | jq should show the output of healthcheck commands executed by docker - maybe it will give us some clue.

@Mattia: Thank you, this is rather telling:
{"Status":"starting","FailingStreak":4,"Log":[{"Start":"2025-05-04T18:53:58.152812491+02:00","End":"2025-05-04T18:53:59.241185859+02:00","ExitCode":1,"Output":"Usage: cqlsh [options] [host [port]]\n\ncqlsh: error: '*' is not a valid port number.\n"},{"Start":"2025-05-04T18:54:04.877333293+02:00","End":"2025-05-04T18:54:05.386606825+02:00","ExitCode":1,"Output":"Usage: cqlsh [options] [host [port]]\n\ncqlsh: error: '*' is not a valid port number.\n"},{"Start":"2025-05-04T18:54:10.852202477+02:00","End":"2025-05-04T18:54:11.263149637+02:00","ExitCode":1,"Output":"Usage: cqlsh [options] [host [port]]\n\ncqlsh: error: '*' is not a valid port number.\n"},{"Start":"2025-05-04T18:54:16.874847316+02:00","End":"2025-05-04T18:54:17.332078002+02:00","ExitCode":1,"Output":"Usage: cqlsh [options] [host [port]]\n\ncqlsh: error: '*' is not a valid port number.\n"}]}
@Karol_Baryła: I don’t see how it could possibly interpret * as a port number in our compose file. Did you modify the compose file? I see that your container name is different from mine; that may have something to do with it.
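
Editor's note: one possible reading of this error, not confirmed anywhere in the thread, is that the healthcheck's argument array was joined into a single string and split again on whitespace before cqlsh saw it. The sketch below illustrates only that hypothesis. `positionals` is a hypothetical helper that mimics a `[options] [host [port]]` command line in which `-e` consumes one value; it is not cqlsh's actual parser.

```rust
/// Mimics a "[options] [host [port]]" command line where `-e` takes one value.
/// Returns the positional arguments left over after option parsing.
/// Hypothetical helper for illustration, not cqlsh's real parser.
fn positionals(args: &[&str]) -> Vec<String> {
    let mut out = Vec::new();
    let mut iter = args.iter();
    while let Some(arg) = iter.next() {
        if *arg == "-e" {
            iter.next(); // the option's value is not a positional argument
        } else {
            out.push(arg.to_string());
        }
    }
    out
}

fn main() {
    // Arguments as written in the healthcheck array (after "cqlsh").
    let exec_form = ["scylla1", "-e", "select * from system.local WHERE key='local'"];
    println!("{:?}", positionals(&exec_form));

    // Assumption: something joins the array and re-splits it on whitespace.
    let joined = exec_form.join(" ");
    let resplit: Vec<&str> = joined.split_whitespace().collect();
    println!("{:?}", positionals(&resplit));
}
```

With the array kept intact, the only positional argument is the host, scylla1. After re-splitting, `-e` consumes just `select`, and the second positional argument, the slot the usage message assigns to the port, becomes `*`, which matches the reported message.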

@Mattia:
version: "3.7"

networks:
  public:
    name: scylla_rust_driver_public
    driver: bridge
    ipam:
      driver: default
      config:
        - subnet: 172.42.0.0/16
services:
  scylla1:
    image: scylladb/scylla
    networks:
      public:
        ipv4_address: 172.42.0.2
    command: |
      --rpc-address 172.42.0.2
      --listen-address 172.42.0.2
      --seeds 172.42.0.2
      --skip-wait-for-gossip-to-settle 0
      --ring-delay-ms 0
      --smp 2
      --memory 1G
    healthcheck:
      test: [ "CMD", "cqlsh", "scylla1", "-e", "select * from system.local WHERE key='local'" ]
      interval: 5s
      timeout: 5s
      retries: 60
  scylla2:
    image: scylladb/scylla
    networks:
      public:
        ipv4_address: 172.42.0.3
    command: |
      --rpc-address 172.42.0.3
      --listen-address 172.42.0.3
      --seeds 172.42.0.2
      --skip-wait-for-gossip-to-settle 0
      --ring-delay-ms 0
      --smp 2
      --memory 1G
    healthcheck:
      test: [ "CMD", "cqlsh", "scylla2", "-e", "select * from system.local WHERE key='local'" ]
      interval: 5s
      timeout: 5s
      retries: 60
    depends_on:
      scylla1:
        condition: service_healthy
  scylla3:
    image: scylladb/scylla
    networks:
      public:
        ipv4_address: 172.42.0.4
    command: |
      --rpc-address 172.42.0.4
      --listen-address 172.42.0.4
      --seeds 172.42.0.2,172.42.0.3
      --skip-wait-for-gossip-to-settle 0
      --ring-delay-ms 0
      --smp 2
      --memory 1G
    healthcheck:
      test: [ "CMD", "cqlsh", "scylla3", "-e", "select * from system.local WHERE key='local'" ]
      interval: 5s
      timeout: 5s
      retries: 60
    depends_on:
      scylla2:
        condition: service_healthy

This is the one I am using, straight from the GitHub repo.

@Karol_Baryła: What is the output of docker --version?

@Mattia: Docker version 28.1.1, build 4eba377327

I also tried with podman with the same problem (podman version 5.4.2)

@Karol_Baryła: In that case I’m unfortunately out of ideas, sorry

@Mattia: Thank you for the help anyway! I think I will try to remove the docker-compose/podman layer, since that might be creating some problems.
It seems I was still using podman; for some reason it took precedence over docker when running the commands. With docker it works (at least the first container is healthy; now the second seems to be hanging).
