During a Zero Downtime Upgrade (ZDU), the upgrade process fails with a timeout error, which prevents the ZDU from completing.
Product: Automic Automation Kubernetes Edition (AAKE)
Version: 26.0.0
Configuration: Agents connected to the Java Communication Process (JCP) from inside the cluster using the internal jcp-ws service.
Note: Agents connected externally through an ingress controller are not impacted by this issue.
This is caused by a known defect in version 26.0.0. During the ZDU, if an agent is connected internally via jcp-ws and is not manually reconnected, the forced disconnection fails to drop the agent's connection after the standard timeout expires. As a result, the agent stays connected to the base version of the JCP, which stalls the upgrade and ultimately causes the ZDU to fail with a timeout.
To bypass this issue and allow the ZDU to complete successfully, you must manually interrupt the agent connection using one of the following methods:
Method 1: Kubernetes CLI (Recommended)
Manually scale the affected agent deployment(s) down to 0 replicas, and then scale them back up to their desired replica count.
kubectl scale deployment <agent-deployment-name> --replicas=0 -n <namespace>
kubectl scale deployment <agent-deployment-name> --replicas=1 -n <namespace>
Adjust the replica count in the second command as needed.
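When several agent deployments are affected, the two scale commands can be wrapped in a small helper that restores each deployment's previous replica count instead of a hard-coded 1. The following is a minimal sketch, not a vendor-supplied script: the function name bounce_agent, the fallback to 1 replica, and the 300-second rollout timeout are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: restart an internally connected agent by scaling its deployment
# to 0 and back to its previous replica count.
# bounce_agent is a hypothetical helper name; it is not part of AAKE.
bounce_agent() {
  deployment="$1"
  namespace="$2"

  # Remember the current desired replica count so it can be restored.
  replicas="$(kubectl get deployment "$deployment" -n "$namespace" \
    -o jsonpath='{.spec.replicas}')" || return 1

  # Assumption: fall back to 1 if the deployment is already at 0
  # or the query returns nothing.
  case "$replicas" in
    ''|0) replicas=1 ;;
  esac

  # Scaling to 0 terminates the agent pods, which drops the connection
  # to the base-version JCP.
  kubectl scale deployment "$deployment" --replicas=0 -n "$namespace" || return 1
  kubectl scale deployment "$deployment" --replicas="$replicas" -n "$namespace" || return 1

  # Wait until the new pods are ready (timeout value is an assumption).
  kubectl rollout status "deployment/$deployment" -n "$namespace" --timeout=300s
}
```

Call it once per affected deployment, for example: bounce_agent <agent-deployment-name> <namespace>. The helper only confirms that the new pods are ready; whether the agent has connected to the upgraded JCP still has to be checked in the AWI.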
Method 2: Automic Web Interface (AWI)
Log in to the AWI.
Navigate to the Administration perspective.
Go to Agents & Groups > Agents.
Locate the internally connected agents, right-click them, and select Disconnect.
Once the agents are disconnected or restarted, they will connect to the upgraded JCP, allowing the ZDU to proceed.