[RELEASE] Scylla Manager 3.2.0

The ScyllaDB team announces the release of ScyllaDB Manager 3.2.0, a production-ready minor release of the stable ScyllaDB Manager 3.2 branch.

ScyllaDB Manager is a centralized cluster administration and recurrent tasks automation tool.

Scylla Manager 3.2 includes performance and safety improvements to recurrent repair and automates previously manual parts of the restore procedure.

ScyllaDB Enterprise customers are encouraged to upgrade to ScyllaDB Manager 3.2 in coordination with the ScyllaDB support team.

The new release includes upgrades of both ScyllaDB Manager Server and Agent.


Scylla Manager 3.2 supports the following Scylla Enterprise releases:

  • 2023.1
  • 2022.1
  • 2022.2

And the following Open Source releases (limited to 5 nodes, see license):

  • 5.2
  • 5.1

You can install and run Scylla Manager on Kubernetes using Scylla Operator; see the Scylla Operator documentation for details.

Restore Automation

The Scylla Manager restore command allows you to restore backed-up data into a cluster.

In the Manager 3.1 release, the following steps had to be performed manually:

  • Restore tables with sctool restore --restore-tables
  • Repair the cluster with sctool repair
  • Restore the tables' tombstone_gc option to its original value
  • Drop and recreate all Materialized Views and Secondary Indexes

These steps are now automated as part of the sctool restore command.
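For example, a single restore invocation now covers all of the steps above. The cluster name, backup location, and snapshot tag below are illustrative placeholders; adjust them to your deployment:

```shell
# Restore tables from a backup snapshot. In Manager 3.2, the follow-up repair,
# tombstone_gc restoration, and view/index recreation run automatically.
# prod-cluster, my-backup-bucket, and the snapshot tag are placeholders.
sctool restore -c prod-cluster \
  -L s3:my-backup-bucket \
  -T sm_20230815123456UTC \
  --restore-tables
```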

See the Scylla Manager 3.2 documentation for details.

Additional restore metrics

  • scylla_manager_restore_progress - The current percentage progress of the restore process. Labeled by cluster and snapshot_tag.
  • scylla_manager_restore_view_build_status - The current status of a view recreated for a restored table. Labeled by cluster, keyspace, and view. Values map to the following enumeration:
    • 0 - “UNKNOWN”
    • 1 - “STARTED”
    • 2 - “SUCCESS”
    • 3 - “ERROR”
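Assuming the Manager server exposes its Prometheus endpoint on the default port (5090; check the prometheus setting in scylla-manager.yaml for your deployment), the new metrics can be inspected directly:

```shell
# Scrape the Manager server's metrics endpoint and filter the restore metrics.
# Host and port are assumptions; adjust to your scylla-manager.yaml settings.
curl -s http://localhost:5090/metrics | grep -E '^scylla_manager_restore'
```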

Repair Improvements

Repair is a recurrent offline task that synchronizes data across all data replicas.

Scylla Manager automates the repair process and allows you to configure how and when repair occurs.

When you create a cluster, a repair task is automatically scheduled. This task runs weekly by default, but you can reschedule it, change its parameters, or add additional repair tasks as needed.
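For instance, an additional repair task can be scheduled alongside the default weekly one. The cluster name and schedule below are illustrative; see sctool repair --help for the exact flags available in your version:

```shell
# Schedule an extra repair task every Saturday at 02:00 (illustrative values).
sctool repair -c prod-cluster --cron '0 2 * * 6'
```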

This release changes the parallelism and order of the repair job for better performance and stability. The following changes have been made:

  • Only one repair job is running on any host at any given time.
  • A repair job always starts on the host with the smallest shard (core) count.
  • Repair waits for the previous table to be fully repaired before advancing to the next one. Tables are ordered as follows:
    • repair internal (system prefix) tables before regular tables
    • repair smaller keyspaces and tables first
    • repair base tables before views
  • The internal system keyspace system_traces is not repaired. Traces are short-lived, and repairing them is wasteful.

Sctool repair flag updates

  • The default value of the --keyspace flag has been changed to *,!system_traces (see above).
  • --intensity values between 0 and 1 (exclusive) are no longer supported and are treated as 1.
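Put together, a repair invocation matching the new defaults, spelled out explicitly, might look like this (the cluster name is a placeholder):

```shell
# Repair everything except system_traces at intensity 1 (the new defaults).
# Quoting the keyspace filter prevents shell glob expansion of '*'.
sctool repair -c prod-cluster --keyspace '*,!system_traces' --intensity 1
```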

Configuration updates

The following parameters in scylla-manager.yaml have changed:

  • Deprecated repair options poll_interval and long_polling_timeout_seconds - Scylla Manager now uses a synchronous API endpoint to wait for repair results.
  • Deprecated repair options force_repair_type and murmur3_partitioner_ignore_msb_bits - the only supported repair type is row_level.
  • Updated backup option: the default age_max has been changed from 12h to 24h.

age_max is the maximum age at which a backup run is still considered fresh and can be continued from the same snapshot. If exceeded, a new run with a new snapshot is created.
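If you prefer the previous behavior, the old value can be pinned explicitly in scylla-manager.yaml. A minimal sketch, showing only the relevant key:

```yaml
# scylla-manager.yaml (fragment): revert age_max to the pre-3.2 default.
backup:
  age_max: 12h
```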


You can use Scylla Monitoring releases 4.4.4 and later to monitor Scylla Manager 3.2.

Known Issue

When upgrading from Manager 3.1 to Manager 3.2, the progress percentages of a repair task created before the upgrade can show a value greater than 100% - only for the first post-upgrade run. The problem is limited to the progress metrics, not the actual repair. #3534
