After a backup is restored with Velero, the ClickHouse operator pod may start before all required access control and dependent resources are fully restored. In this condition, the operator can log temporary authorization errors when attempting to read or manage Kubernetes resources such as StatefulSets, ClickHouseInstallation (CHI) objects, and PersistentVolumeClaims. In some environments, the operator does not resume normal reconciliation automatically after those resources become available, and the CHI remains in InProgress until the operator pod is restarted.
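Typical symptoms include temporary authorization errors in the operator log and a CHI that remains in InProgress.
This issue can be observed in environments that use Velero to restore workloads managed by the ClickHouse operator.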
Velero restores resources one at a time and, by default, restores resources in a predefined order that includes CustomResourceDefinitions, Namespaces, PersistentVolumes, PersistentVolumeClaims, Secrets, ConfigMaps, and ServiceAccounts. Resources not explicitly listed in the restore priority sequence are appended afterward in alphabetical order unless the Velero server is configured with a custom --restore-resource-priorities value.
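The ordering rule described above can be sketched as a small simulation. This is a simplified model, not Velero's implementation: the priority list contains only the kinds named in this article, and the kinds in backed_up are hypothetical examples.

```python
# Simplified model of the ordering rule described above (not Velero's code).
# Assumption: only the kinds named in this article are in the priority list.
PRIORITIES = [
    "customresourcedefinitions",
    "namespaces",
    "persistentvolumes",
    "persistentvolumeclaims",
    "secrets",
    "configmaps",
    "serviceaccounts",
]


def restore_order(priorities, backed_up):
    """Prioritized kinds first (in list order), then the rest alphabetically."""
    present = set(backed_up)
    prioritized = [kind for kind in priorities if kind in present]
    remainder = sorted(present - set(priorities))
    return prioritized + remainder


# Hypothetical contents of a backup that includes the operator and a CHI.
backed_up = [
    "statefulsets",
    "rolebindings",
    "deployments",
    "serviceaccounts",
    "clickhouseinstallations",
    "roles",
    "namespaces",
]

order = restore_order(PRIORITIES, backed_up)
for position, kind in enumerate(order, start=1):
    print(position, kind)
```

In this model, deployments sorts ahead of rolebindings and roles, which illustrates how an operator Deployment could be recreated before the RBAC objects it needs.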
The ClickHouse operator requires a ServiceAccount with privileges to create and destroy multiple Kubernetes objects, and it relies on RBAC permissions to watch and manage them. The operator can therefore start during the restore window before all required permissions and related resources are fully available.
If the operator starts too early, it may encounter temporary authorization failures while Velero is still restoring the required objects. In some cases, the operator does not recover cleanly after those objects become available, and reconciliation remains stalled until the operator pod is restarted. This behavior is consistent with a restore sequencing problem affecting operator startup and reconciliation timing.
If the issue has already occurred and the CHI remains in InProgress, restart the ClickHouse operator pod or perform a rollout restart of its Deployment. In affected environments, this reinitializes the operator after all restored resources are present and allows reconciliation to continue.
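A rollout restart could be scripted along the following lines. This is a minimal sketch: the Deployment name and namespace are placeholder assumptions, not values taken from any specific installation, and the script only prints the kubectl commands unless dry_run is turned off.

```python
import shlex
import subprocess

# Assumptions: the operator runs as a Deployment with this name in this
# namespace. Both values are placeholders; substitute your own.
OPERATOR_DEPLOYMENT = "clickhouse-operator"
OPERATOR_NAMESPACE = "kube-system"


def restart_commands(deployment, namespace):
    """Build kubectl commands to restart a Deployment and wait for the rollout."""
    target = f"deployment/{deployment}"
    return [
        ["kubectl", "rollout", "restart", target, "-n", namespace],
        ["kubectl", "rollout", "status", target, "-n", namespace],
    ]


def restart_operator(deployment, namespace, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in restart_commands(deployment, namespace):
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


# Dry run by default so nothing is changed until the values are reviewed.
restart_operator(OPERATOR_DEPLOYMENT, OPERATOR_NAMESPACE)
```

After the rollout completes, check that the CHI leaves InProgress before treating the restore as finished.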
As a permanent solution, you can customize the Velero restore order. Velero supports custom restore ordering through the --restore-resource-priorities flag on the Velero server. This setting applies to future restores only. Resources not included in the custom list are appended afterward in alphabetical order.
Where operationally appropriate, adjust restore priorities so that the operator does not become runnable before its required resources, especially the ServiceAccount and RBAC objects it depends on.
Because restore order is configured globally on the Velero server, evaluate the impact on other restored applications before changing it. Velero recommends using the default restore order unless customization is required.
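As a rough illustration, the value for --restore-resource-priorities could be assembled as a comma-separated list. The list below is an assumption made for this example: it places RBAC kinds ahead of Deployments, and it is neither a recommended nor a complete priority list. Confirm the resource names and the default list for your Velero version before changing the server configuration.

```python
# Illustrative only: this is NOT a recommended or complete priority list.
# Assumption: RBAC kinds and Deployments are named as shown; confirm the
# resource names and the default list for your Velero version first.
CUSTOM_ORDER = [
    "customresourcedefinitions",
    "namespaces",
    "persistentvolumes",
    "persistentvolumeclaims",
    "secrets",
    "configmaps",
    "serviceaccounts",
    "clusterroles",
    "clusterrolebindings",
    "roles",
    "rolebindings",
    "deployments",
]


def priorities_flag(resources):
    """Join resource kinds into a --restore-resource-priorities argument."""
    ordered = list(dict.fromkeys(resources))  # drop duplicates, keep order
    return "--restore-resource-priorities=" + ",".join(ordered)


flag = priorities_flag(CUSTOM_ORDER)
print(flag)
```

Any kind left out of a custom list is restored afterward in alphabetical order, so review the whole list rather than only the entries added for the operator.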
Velero documents the default restore order in its restore reference: https://velero.io/docs/main/restore-reference/#restore-order