Scylla Manager backup dry run fails with: giving up after 2 attempts: after 30s: context deadline exceeded

We are setting up a ScyllaDB cluster to be backed up to AWS S3-compatible storage, and have set access_key, secret, and endpoint correctly in scylla-manager-agent.yaml. From each node the following command also works fine:

scylla-manager-agent check-location --debug --location s3:
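
For reference, the s3 section of our scylla-manager-agent.yaml looks roughly like this (placeholder values; key names follow the agent's documented S3 options):

# /etc/scylla-manager-agent/scylla-manager-agent.yaml (s3 section, placeholders)
s3:
  access_key_id: <ACCESS_KEY>
  secret_access_key: <SECRET_KEY>
  provider: Other                        # non-AWS, S3-compatible storage
  endpoint: https://s3.example.internal  # placeholder for our endpoint URL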

But when we try to test the backup from the scylla-manager node, it fails with:

Error: get backup target: location is not accessible

sctool backup -c 'Scylla Prod 01' -L 's3:' --dry-run

NOTICE: this may take a while, we are performing disk size calculations on the nodes

2a0b:b580:0:0:xx:xxxx:xx:xx: giving up after 2 attempts: after 30s: context deadline exceeded
2a0b:b580:0:0:xx:xxxx:xx:xx: giving up after 2 attempts: after 30s: context deadline exceeded

May 08 13:08:47 prd-scylladbmanager scylla-manager[954628]: {"L":"INFO","T":"2023-05-08T13:08:47.365Z","N":"backup","M":"Generating backup target","cluster_id":"fe769e49-2639-4426-8829-77d5ccc20ae2","_trace_id":"L9hVLdVmQBCl40aj9f0zkQ"}
May 08 13:08:47 prd-scylladbmanager scylla-manager[954628]: {"L":"INFO","T":"2023-05-08T13:08:47.594Z","N":"cluster.client","M":"Checking hosts connectivity","hosts":["2a0b:b580:0:0:xx:xxxx:xx:xx","2a0b:b580:0:0:xx:xxxx:xx:xx","2a0b:b580:0:0:xx:xxxx:xx:xx","2a0b:b580:0:0:xx:xxxx:xx:xx","2a0b:b580:0:0:xx:xxxx:xx:xx"],"_trace_id":"L9hVLdVmQBCl40aj9f0zkQ"}
May 08 13:08:47 prd-scylladbmanager scylla-manager[954628]: {"L":"INFO","T":"2023-05-08T13:08:47.594Z","N":"cluster.client","M":"Host check OK","host":"2a0b:b580:0:0:xx:xxxx:xx:xx","_trace_id":"L9hVLdVmQBCl40aj9f0zkQ"}
May 08 13:08:47 prd-scylladbmanager scylla-manager[954628]: {"L":"INFO","T":"2023-05-08T13:08:47.594Z","N":"cluster.client","M":"Host check OK","host":"2a0b:b580:0:0:xx:xxxx:xx:xx","_trace_id":"L9hVLdVmQBCl40aj9f0zkQ"}
May 08 13:08:47 prd-scylladbmanager scylla-manager[954628]: {"L":"INFO","T":"2023-05-08T13:08:47.594Z","N":"cluster.client","M":"Host check OK","host":"2a0b:b580:0:0:xx:xxxx:xx:xx","_trace_id":"L9hVLdVmQBCl40aj9f0zkQ"}
May 08 13:08:47 prd-scylladbmanager scylla-manager[954628]: {"L":"INFO","T":"2023-05-08T13:08:47.594Z","N":"cluster.client","M":"Host check OK","host":"2a0b:b580:0:0:xx:xxxx:xx:xx","_trace_id":"L9hVLdVmQBCl40aj9f0zkQ"}
May 08 13:08:47 prd-scylladbmanager scylla-manager[954628]: {"L":"INFO","T":"2023-05-08T13:08:47.594Z","N":"cluster.client","M":"Host check OK","host":"2a0b:b580:0:0:xx:xxxx:xx:xx","_trace_id":"L9hVLdVmQBCl40aj9f0zkQ"}
May 08 13:08:47 prd-scylladbmanager scylla-manager[954628]: {"L":"INFO","T":"2023-05-08T13:08:47.594Z","N":"cluster.client","M":"Done checking hosts connectivity","_trace_id":"L9hVLdVmQBCl40aj9f0zkQ"}
May 08 13:08:47 prd-scylladbmanager scylla-manager[954628]: {"L":"INFO","T":"2023-05-08T13:08:47.594Z","N":"backup","M":"Checking accessibility of remote locations","_trace_id":"L9hVLdVmQBCl40aj9f0zkQ"}
May 08 13:08:59 prd-scylladbmanager scylla-manager[954628]: {"L":"INFO","T":"2023-05-08T13:08:59.832Z","N":"cluster","M":"Creating new Scylla REST client","cluster_id":"fe769e49-2639-4426-8829-77d5ccc20ae2"}
May 08 13:08:59 prd-scylladbmanager scylla-manager[954628]: {"L":"INFO","T":"2023-05-08T13:08:59.884Z","N":"cluster.client","M":"Measuring datacenter latencies","dcs":["SCY-PRD1"]}
May 08 13:09:17 prd-scylladbmanager scylla-manager[954628]: {"L":"INFO","T":"2023-05-08T13:09:17.596Z","N":"cluster.client","M":"HTTP retry backoff","operation":"OperationsCheckPermissions","wait":"1s","error":"after 30s: context deadline exceeded","_trace_id":"L9hVLdVmQBCl40aj9f0zkQ"}
May 08 13:09:29 prd-scylladbmanager scylla-manager[954628]: {"L":"INFO","T":"2023-05-08T13:09:29.832Z","N":"cluster","M":"Creating new Scylla REST client","cluster_id":"fe769e49-2639-4426-8829-77d5ccc20ae2"}
May 08 13:09:29 prd-scylladbmanager scylla-manager[954628]: {"L":"INFO","T":"2023-05-08T13:09:29.890Z","N":"cluster.client","M":"Measuring datacenter latencies","dcs":["SCY-PRD1"]}
May 08 13:09:48 prd-scylladbmanager scylla-manager[954628]: {"L":"INFO","T":"2023-05-08T13:09:48.597Z","N":"backup","M":"Location check FAILED","host":"2a0b:b580:0:0:xx:xxxx:xx:xx","location":"s3:<Bucket Name>","error":"giving up after 2 attempts: after 30s: context deadline exceeded","_trace_id":"L9hVLdVmQBCl40aj9f0zkQ"}
May 08 13:09:48 prd-scylladbmanager scylla-manager[954628]: {"L":"INFO","T":"2023-05-08T13:09:48.597Z","N":"backup","M":"Location check FAILED","host":"2a0b:b580:0:0:xx:xxxx:xx:xx","location":"s3:<Bucket Name>","error":"giving up after 2 attempts: after 30s: context deadline exceeded","_trace_id":"L9hVLdVmQBCl40aj9f0zkQ"}
May 08 13:09:48 prd-scylladbmanager scylla-manager[954628]: {"L":"INFO","T":"2023-05-08T13:09:48.597Z","N":"backup","M":"Location check FAILED","host":"2a0b:b580:0:0:xx:xxxx:xx:xx","location":"s3:<Bucket Name>","error":"giving up after 2 attempts: after 30s: context deadline exceeded","_trace_id":"L9hVLdVmQBCl40aj9f0zkQ"}
May 08 13:09:48 prd-scylladbmanager scylla-manager[954628]: {"L":"INFO","T":"2023-05-08T13:09:48.597Z","N":"backup","M":"Location check FAILED","host":"2a0b:b580:0:0:xx:xxxx:xx:xx","location":"s3:<Bucket Name>","error":"giving up after 2 attempts: after 30s: context deadline exceeded","_trace_id":"L9hVLdVmQBCl40aj9f0zkQ"}
May 08 13:09:48 prd-scylladbmanager scylla-manager[954628]: {"L":"INFO","T":"2023-05-08T13:09:48.597Z","N":"backup","M":"Location check FAILED","host":"2a0b:b580:0:0:xx:xxxx:xx:xx","location":"s3:<Bucket Name>","error":"giving up after 2 attempts: after 30s: context deadline exceeded","_trace_id":"L9hVLdVmQBCl40aj9f0zkQ"}
May 08 13:09:48 prd-scylladbmanager scylla-manager[954628]: {"L":"INFO","T":"2023-05-08T13:09:48.597Z","N":"backup","M":"Done checking accessibility of remote locations","_trace_id":"L9hVLdVmQBCl40aj9f0zkQ"}
May 08 13:09:48 prd-scylladbmanager scylla-manager[954628]: {"L":"INFO","T":"2023-05-08T13:09:48.597Z","N":"http","M":"GET /api/v1/cluster/Scylla%20Prod%2001/tasks/backup/target","from":"127.0.0.1:59140","status":400,"bytes":566,"duration":"61235ms","error":"get backup target: location is not accessible: 2a0b:b580:0:0:xx:xxxx:xx:xx: giving up after 2 attempts: after 30s: context deadline exceeded; 2a0b:b580:0:0:xx:xxxx:xx:xx: giving up after 2 attempts: after 30s: context deadline exceeded; 2a0b:b580:0:0:xx:xxxx:xx:xx: giving up after 2 attempts: after 30s: context deadline exceeded; 2a0b:b580:0:0:xx:xxxx:xx:xx: giving up after 2 attempts: after 30s: context deadline exceeded; 2a0b:b580:0:0:xx:xxxx:xx:xx: giving up after 2 attempts: after 30s: context deadline exceeded","_trace_id":"L9hVLdVmQBCl40aj9f0zkQ"}
May 08 13:09:59 prd-scylladbmanager scylla-manager[954628]: {"L":"INFO","T":"2023-05-08T13:09:59.832Z","N":"cluster","M":"Creating new Scylla REST client","cluster_id":"fe769e49-2639-4426-8829-77d5ccc20ae2"}

The scylla-manager-agent on the ScyllaDB node logs these errors:

May 08 13:09:17 prd-scylladb1 scylla-manager-agent[3135]: {"L":"ERROR","T":"2023-05-08T13:09:17.596Z","N":"rclone","M":"Location check: error=no put permission: context canceled","S":"github.com/scylladb/go-log.Logger.log\n\tgithub.com/scylladb/go-log@v0.0.7/logger.go:101\ngithub.com/scylladb/go-log.Logger.Error\n\tgithub.com/scylladb/go-log@v0.0.7/logger.go:84\ngithub.com/scylladb/scylla-manager/v3/pkg/rclone.RedirectLogPrint.func1\n\tgithub.com/scylladb/scylla-manager/v3/pkg/rclone/logger.go:19\ngithub.com/rclone/rclone/fs.LogPrintf\n\tgithub.com/rclone/rclone@v1.51.0/fs/log.go:152\ngithub.com/rclone/rclone/fs.Errorf\n\tgithub.com/rclone/rclone@v1.51.0/fs/log.go:167\ngithub.com/scylladb/scylla-manager/v3/pkg/rclone/rcserver.rcCheckPermissions\n\tgithub.com/scylladb/scylla-manager/v3/pkg/rclone/rcserver/rc.go:497\ngithub.com/scylladb/scylla-manager/v3/pkg/rclone/rcserver.Server.ServeHTTP\n\tgithub.com/scylladb/scylla-manager/v3/pkg/rclone/rcserver/rcserver.go:258\nnet/http.StripPrefix.func1\n\tnet/http/server.go:2152\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2109\ngithub.com/go-chi/chi/v5.(*Mux).Mount.func1\n\tgithub.com/go-chi/chi/v5@v5.0.0/mux.go:311\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2109\ngithub.com/go-chi/chi/v5.(*Mux).routeHTTP\n\tgithub.com/go-chi/chi/v5@v5.0.0/mux.go:436\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2109\ngithub.com/go-chi/chi/v5.(*Mux).ServeHTTP\n\tgithub.com/go-chi/chi/v5@v5.0.0/mux.go:70\ngithub.com/go-chi/chi/v5.(*Mux).Mount.func1\n\tgithub.com/go-chi/chi/v5@v5.0.0/mux.go:311\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2109\ngithub.com/scylladb/scylla-manager/v3/pkg/auth.ValidateToken.func1.1\n\tgithub.com/scylladb/scylla-manager/v3/pkg/auth/auth.go:49\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2109\ngithub.com/go-chi/chi/v5.(*ChainHandler).ServeHTTP\n\tgithub.com/go-chi/chi/v5@v5.0.0/chain.go:31\ngithub.com/go-chi/chi/v5.(*Mux).routeHTTP\n\tgithub.com/go-chi/chi/v5@v5.0.0/mux.go:436\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2109\ngithub.com/go-chi/chi/v5/middleware.RequestLogger.func1.1\n\tgithub.com/go-chi/chi/v5@v5.0.0/middleware/logger.go:57\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2109\ngithub.com/go-chi/chi/v5.(*Mux).ServeHTTP\n\tgithub.com/go-chi/chi/v5@v5.0.0/mux.go:87\nnet/http.serverHandler.ServeHTTP\n\tnet/http/server.go:2947\nnet/http.(*conn).serve\n\tnet/http/server.go:1991"}
May 08 13:09:17 prd-scylladb1 scylla-manager-agent[3135]: {"L":"ERROR","T":"2023-05-08T13:09:17.596Z","N":"rclone","M":"rc: \"operations/check-permissions\": error: no put permission: context canceled","S":"github.com/scylladb/go-log.Logger.log\n\tgithub.com/scylladb/go-log@v0.0.7/logger.go:101\ngithub.com/scylladb/go-log.Logger.Error\n\tgithub.com/scylladb/go-log@v0.0.7/logger.go:84\ngithub.com/scylladb/scylla-manager/v3/pkg/rclone.RedirectLogPrint.func1\n\tgithub.com/scylladb/scylla-manager/v3/pkg/rclone/logger.go:19\ngithub.com/rclone/rclone/fs.LogPrintf\n\tgithub.com/rclone/rclone@v1.51.0/fs/log.go:152\ngithub.com/rclone/rclone/fs.Errorf\n\tgithub.com/rclone/rclone@v1.51.0/fs/log.go:167\ngithub.com/scylladb/scylla-manager/v3/pkg/rclone/rcserver.Server.writeError\n\tgithub.com/scylladb/scylla-manager/v3/pkg/rclone/rcserver/rcserver.go:81\ngithub.com/scylladb/scylla-manager/v3/pkg/rclone/rcserver.Server.ServeHTTP\n\tgithub.com/scylladb/scylla-manager/v3/pkg/rclone/rcserver/rcserver.go:265\nnet/http.StripPrefix.func1\n\tnet/http/server.go:2152\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2109\ngithub.com/go-chi/chi/v5.(*Mux).Mount.func1\n\tgithub.com/go-chi/chi/v5@v5.0.0/mux.go:311\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2109\ngithub.com/go-chi/chi/v5.(*Mux).routeHTTP\n\tgithub.com/go-chi/chi/v5@v5.0.0/mux.go:436\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2109\ngithub.com/go-chi/chi/v5.(*Mux).ServeHTTP\n\tgithub.com/go-chi/chi/v5@v5.0.0/mux.go:70\ngithub.com/go-chi/chi/v5.(*Mux).Mount.func1\n\tgithub.com/go-chi/chi/v5@v5.0.0/mux.go:311\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2109\ngithub.com/scylladb/scylla-manager/v3/pkg/auth.ValidateToken.func1.1\n\tgithub.com/scylladb/scylla-manager/v3/pkg/auth/auth.go:49\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2109\ngithub.com/go-chi/chi/v5.(*ChainHandler).ServeHTTP\n\tgithub.com/go-chi/chi/v5@v5.0.0/chain.go:31\ngithub.com/go-chi/chi/v5.(*Mux).routeHTTP\n\tgithub.com/go-chi/chi/v5@v5.0.0/mux.go:436\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2109\ngithub.com/go-chi/chi/v5/middleware.RequestLogger.func1.1\n\tgithub.com/go-chi/chi/v5@v5.0.0/middleware/logger.go:57\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2109\ngithub.com/go-chi/chi/v5.(*Mux).ServeHTTP\n\tgithub.com/go-chi/chi/v5@v5.0.0/mux.go:87\nnet/http.serverHandler.ServeHTTP\n\tnet/http/server.go:2947\nnet/http.(*conn).serve\n\tnet/http/server.go:1991"}
May 08 13:09:17 prd-scylladb1 scylla-manager-agent[3135]: {"L":"INFO","T":"2023-05-08T13:09:17.596Z","N":"http","M":"POST /agent/rclone/operations/check-permissions","from":"[2a0b:b580::10:121:64:10]:59866","status":404,"bytes":150,"duration":"30001ms"}

@roy.susmit Thank you for reporting this issue.

The scylla-manager Git repository is at https://github.com/scylladb/scylla-manager.
That's the best place to create issues related to Scylla Manager.

I'm missing some information here, such as:

  • the version of scylla-manager you are using (see the version commands sketched below)
  • the output of scylla-manager-agent check-location -L s3:<location>
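
If it helps, both binaries have a version subcommand:

# On the scylla-manager host
sctool version

# On each ScyllaDB node
scylla-manager-agent version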

The error message you attached, error: no put permission: context canceled, is misleading in the context of the issue reported here.

The real root cause is a timeout that comes from scylla-manager. When you run sctool backup -c 'Scylla Prod 01' -L 's3:' --dry-run, it is scylla-manager that calls the scylla-manager-agent REST API to check the permissions.
Scylla-manager calls to the scylla-manager-agent REST API time out after 30s.
The scylla-manager-agent tried to use the S3 API to put a file into the bucket, but the call took more than 30s and was therefore cancelled.
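
You can see the deadline in the agent log above: the POST /agent/rclone/operations/check-permissions request ended after 30001ms. A rough way to replay that call directly against one agent, without the manager's 30s deadline, is sketched below. The default agent port 10001, the Bearer token header, and the JSON body with an "fs" field are assumptions based on the agent defaults and rclone's rc conventions; adjust to your setup:

# Hypothetical replay of the manager's permission check against one agent.
# Assumptions: default agent port 10001, self-signed TLS cert (-k), the
# auth_token from scylla-manager-agent.yaml, and an rclone-rc-style body
# with an "fs" field naming the remote.
curl -k -s -X POST 'https://[2a0b:b580::xx]:10001/agent/rclone/operations/check-permissions' \
  -H 'Authorization: Bearer <auth_token>' \
  -H 'Content-Type: application/json' \
  -d '{"fs":"s3:<bucket>"}' \
  --max-time 300   # generous client-side timeout so a slow S3 endpoint can still answer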

scylla-manager-agent check-location --debug --location s3: calls are not subject to that timeout. They are performed directly on the agent nodes, with nothing in between the caller and the agent.
Please let us know how much time the location check takes when run directly from a node.
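
For example, wrapping the check in time gives a quick measurement (s3:<bucket> is a placeholder for your location):

# Run directly on a ScyllaDB node; "real" is the wall-clock time the check took
time scylla-manager-agent check-location --debug --location s3:<bucket>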

I suspect that the S3-compatible storage response time is > 30s.
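
A quick way to sanity-check that is to time a plain request from a node against the storage endpoint (the URL below is a placeholder for your endpoint):

# Rough latency probe against the S3-compatible endpoint (placeholder URL);
# curl's -w timers split out connection time vs. total response time
curl -o /dev/null -s -w 'connect=%{time_connect}s total=%{time_total}s\n' \
  https://s3.example.internal/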