CA7AAI Task Fails During Server Patching Due to Hardcoded SFTP Retries

Article ID: 440576

Products

Automation Analytics & Intelligence

Issue/Introduction

The CA7AAI z/OS data provider task (AIS7SRVR) fails with a hard error when the target AAI server is rebooted for patching or maintenance. The task terminates and requires a manual restart once the server is back online.

Symptoms

The task fails with the following messages in the job log:

  • AI7.SXP2E: Call to AISZXSFT failed RC=357 - Check STDERR for messages
  • AI7.TM02S: CA 7 Server for AAI has failed at line 818 - Review LOG messages
  • AIS7SRVR FAILED

The SSH STDERR may show connection errors such as:

  • EDC5140I Broken pipe
  • kex_exchange_identification: write: EDC5140I Broken pipe

Cause

The failure is caused by a hardcoded retry limit in the AISZXSFT REXX module. The CA7AAI task attempts to deliver report data every 30 seconds (by default). If the SFTP connection fails, the product follows a fixed retry sequence:

  1. Initial Attempt: Fails immediately if the SSH daemon is unavailable.
  2. Retry 1: Performed after a 30-second sleep.
  3. Retry 2: Performed after a 60-second sleep.

If the server is still unavailable after the second retry (a total window of approximately 90 seconds), the task reaches a "retries exhausted" state and terminates. A server reboot during patching often takes longer than 90 seconds, so the task fails with a fatal error.
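The retry sequence above can be modeled with a short Python sketch. It is only an illustration of the timing described in this article, not the AISZXSFT REXX source. The function names are hypothetical, and the model assumes that the outage begins at the moment of the initial attempt and that each failed attempt returns immediately.

```python
# Illustrative model of the retry timing described in this article.
# This is NOT the AISZXSFT source; names and structure are hypothetical.

# Sleep intervals (seconds) before Retry 1 and Retry 2.
RETRY_SLEEPS = (30, 60)


def attempt_times(sleeps=RETRY_SLEEPS):
    """Return the offsets, in seconds after the initial attempt,
    at which each connection attempt is made."""
    times = [0]  # the initial attempt
    for sleep in sleeps:
        times.append(times[-1] + sleep)
    return times


def survives_outage(outage_seconds, sleeps=RETRY_SLEEPS):
    """Return True if at least one attempt is made once the server is
    reachable again.

    Assumption: the server becomes unreachable at offset 0 and is
    reachable again from offset `outage_seconds` onward.
    """
    return any(t >= outage_seconds for t in attempt_times(sleeps))


if __name__ == "__main__":
    print("Attempt offsets (s):", attempt_times())
    for outage in (20, 60, 120, 300):
        outcome = "survives" if survives_outage(outage) else "retries exhausted"
        print(f"Outage of {outage:>3}s: {outcome}")
```

Under these assumptions the last attempt is made 90 seconds after the first, so any outage that outlasts that window ends in the "retries exhausted" state.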

Resolution

There is currently no configuration parameter in the ISPF panels or configuration files to adjust the number of retries or the wait intervals, as these values are hardcoded in the AISZXSFT source code.

Workarounds

  • Lengthen the Delivery Interval: Increase the interval at which the CA7 Data Provider sends event files to AAI (e.g., from 30 seconds to 60 or 90 seconds). This can give the server more time to recover during a reboot before a delivery attempt exhausts the 90-second retry window.
  • Automated Restart: Configure an automated operations tool (such as CA OPS/MVS) to detect the AI7.TM02S failure message and automatically restart the AIS7SRVR task.
  • Scheduled Maintenance: Coordinate reboots during windows of low activity or manually stop/start the task around the maintenance window.
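For the Automated Restart workaround, the detection step amounts to matching the failure message IDs listed under Symptoms. The Python sketch below illustrates that matching logic only. It is not CA OPS/MVS rule syntax, the function names are hypothetical, and how the job log lines are obtained and how the restart is issued are site-specific and left out.

```python
import re

# Message IDs taken from the Symptoms section of this article.
FAILURE_IDS = ("AI7.SXP2E", "AI7.TM02S")
_PATTERN = re.compile("|".join(re.escape(msg_id) for msg_id in FAILURE_IDS))


def find_failures(log_lines):
    """Return (message_id, line) pairs for lines that contain a known
    failure message ID."""
    hits = []
    for line in log_lines:
        match = _PATTERN.search(line)
        if match:
            hits.append((match.group(0), line.strip()))
    return hits


def needs_restart(log_lines):
    """Return True if the terminal failure message AI7.TM02S is present."""
    return any(msg_id == "AI7.TM02S" for msg_id, _ in find_failures(log_lines))


if __name__ == "__main__":
    sample = [
        "AI7.SXP2E: Call to AISZXSFT failed RC=357 - Check STDERR for messages",
        "AI7.TM02S: CA 7 Server for AAI has failed at line 818 - Review LOG messages",
        "AIS7SRVR FAILED",
    ]
    print(needs_restart(sample))
```

Keying the restart on AI7.TM02S rather than AI7.SXP2E is a design choice in this sketch: AI7.TM02S is the message this article names as the trigger for the automated restart.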

Product Enhancement

A Product Enhancement Request has been submitted to make these retry intervals and counts configurable in a future release.