Antrea Egress Intermittent Network Connectivity Loss Due to Multiple Matching Policies

Article ID: 416739

Updated On:

Products

VMware Container Networking with Antrea

Issue/Introduction

An OpenShift cluster running the Antrea CNI (version 2.3.1) with the Antrea Egress feature experiences intermittent, unpredictable loss of external network connectivity for specific containers.

  • Containers lose the ability to send traffic outside the cluster. Connectivity is restored only after the affected application/Pod is restarted.
  • Antrea logs show that the IP assigned to a Pod's Egress configuration is unexpectedly torn down and immediately recreated. These events coincide with the connectivity loss.
  • The Egress configuration is intended to enforce a static Source Network Address Translation (SNAT) IP for the selected Pods' outbound traffic.

Environment

VMware Container Networking with Antrea

Cause

The root cause is the by-design behavior of the Antrea Egress feature when multiple Egress policies overlap on the same Pod.

  • The affected Pod's labels match the Pod/Namespace selectors of multiple Antrea Egress objects, and these policies specify different egressIP addresses.
  • Antrea does not apply a priority or weighting system to the Egress resource. The official documentation confirms this behavior (Referenced: https://github.com/antrea-io/antrea/blob/main/docs/egress.md#egressip)
    • "If more than one Egress applies to a Pod and they specify different egressIP, the effective egress IP will be selected randomly."
  • The random selection process leads to unpredictable, spontaneous changes in the assigned Egress IP (the "effective egress IP"). This results in the observed tear-down and re-creation event in the logs, causing the application-level network disruption.
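
The overlap described above can be illustrated with a minimal Python sketch. It is a simplified model, not Antrea's implementation: it assumes matchLabels-only Pod selectors, and the Egress class, the Egress names, the labels, and the IP addresses are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Egress:
    """Simplified, hypothetical stand-in for an Antrea Egress object."""
    name: str
    egress_ip: str
    pod_match_labels: dict = field(default_factory=dict)


def selects(match_labels: dict, pod_labels: dict) -> bool:
    # A matchLabels selector matches when every key/value pair is present on the Pod.
    return all(pod_labels.get(k) == v for k, v in match_labels.items())


def candidate_egress_ips(pod_labels: dict, egresses: list) -> set:
    # Collect the distinct egress IPs of every Egress whose selector matches the Pod.
    return {e.egress_ip for e in egresses if selects(e.pod_match_labels, pod_labels)}


# Hypothetical policies: one selects by team, the other by application.
egresses = [
    Egress("egress-team-a", "203.0.113.10", {"team": "a"}),
    Egress("egress-web", "203.0.113.20", {"app": "web"}),
]

# This Pod carries both labels, so both policies apply to it.
pod_labels = {"team": "a", "app": "web"}

ips = candidate_egress_ips(pod_labels, egresses)
if len(ips) > 1:
    print(f"ambiguous: Pod matches Egresses with different IPs: {sorted(ips)}")
```

A Pod whose candidate set contains more than one IP is in the condition where, per the documentation quoted above, the effective egress IP is selected randomly.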

Resolution

Reconfigure the Egress policies so that no Pod is selected by Egress objects that specify different egressIP values. This eliminates the condition that triggers the random selection behavior.

(Choose One)

  • Enforce a Single Policy Match:
    Modify the Pod labels or the Egress resource selectors (podSelector/namespaceSelector) so that each Pod matches exactly one Antrea Egress policy.

  • Ensure Egress IP Consistency:
    If a single Pod must match multiple Egress policies for organizational reasons, ensure that all of those matching Egress policies are configured to use the exact same egressIP address.
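
As a rough illustration of the second option, the Python sketch below builds two Egress manifests that select Pods by different labels but share one egressIP, and prints them as JSON. The apiVersion (crd.antrea.io/v1beta1) is an assumption and should be checked against the CRDs installed in the cluster. The Egress names, labels, and IP address are hypothetical, and fields such as externalIPPool are omitted for brevity.

```python
import json


def egress_manifest(name: str, egress_ip: str, pod_labels: dict) -> dict:
    # Assumed Egress layout; verify apiVersion and fields against your cluster's CRD.
    return {
        "apiVersion": "crd.antrea.io/v1beta1",
        "kind": "Egress",
        "metadata": {"name": name},
        "spec": {
            "appliedTo": {"podSelector": {"matchLabels": pod_labels}},
            "egressIP": egress_ip,
        },
    }


SHARED_IP = "203.0.113.10"  # hypothetical address used by every overlapping policy

manifests = [
    egress_manifest("egress-team-a", SHARED_IP, {"team": "a"}),
    egress_manifest("egress-web", SHARED_IP, {"app": "web"}),
]

# Wrap the manifests in a v1 List so the output can be reviewed as one document.
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2))
```

Because both policies carry the same egressIP, a Pod labeled with both team=a and app=web has only one possible effective egress IP. The printed JSON can be reviewed and, if appropriate, applied with kubectl apply -f.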

Note: The lack of an explicit priority system for the Egress resource is a current design limitation. It is recommended to submit a feature request to the Antrea community for the implementation of an egress priority or weighting system to allow for explicit control over policy precedence in the future.