Cluster VIP Flagged as Rogue IP by Cisco ACI Fabric

Article ID: 402694

Products

VCF Operations/Automation (formerly VMware Aria Suite)

Issue/Introduction

  • Pings to the Aria Operations for Logs cluster VIP show approximately 5% packet loss.
  • Packet captures confirm that the individual Aria Logs nodes send ARP replies for the VIP. This conflicts with the network's Direct Server Return (DSR) configuration, which expects only the load balancer to answer ARP for the VIP.
  • The Cisco ACI fabric flags the VIP as a "rogue endpoint" because the IP address appears to move rapidly between the different cluster nodes.
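
To check for the second symptom in your own capture, the following minimal Python sketch groups ARP replies for the VIP by sender MAC. It assumes the capture has already been exported into simple records; the field names (op, sender_ip, sender_mac) and the sample addresses are hypothetical and not taken from any particular capture tool.

```python
from collections import Counter


def arp_repliers_for_vip(records, vip):
    """Count ARP replies that claim `vip`, grouped by sender MAC.

    `records` is an iterable of dicts with the hypothetical keys
    'op' ('request' or 'reply'), 'sender_ip', and 'sender_mac'.
    """
    counts = Counter()
    for rec in records:
        if rec["op"] == "reply" and rec["sender_ip"] == vip:
            # Lower-case the MAC so the same address is not counted twice.
            counts[rec["sender_mac"].lower()] += 1
    return dict(counts)


# Illustrative data only (documentation IP range, made-up MACs).
sample = [
    {"op": "reply", "sender_ip": "192.0.2.10", "sender_mac": "00:50:56:AA:00:01"},
    {"op": "reply", "sender_ip": "192.0.2.10", "sender_mac": "00:50:56:aa:00:02"},
    {"op": "request", "sender_ip": "192.0.2.10", "sender_mac": "00:50:56:aa:00:03"},
    {"op": "reply", "sender_ip": "192.0.2.11", "sender_mac": "00:50:56:aa:00:04"},
    {"op": "reply", "sender_ip": "192.0.2.10", "sender_mac": "00:50:56:AA:00:01"},
]
repliers = arp_repliers_for_vip(sample, "192.0.2.10")
```

If the result contains more than one MAC address, more than one device is answering ARP for the VIP, which matches the behavior described above.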

Environment

Aria Operations for Logs 8.x

Cause

The ACI fabric was not configured in advance to handle the normal Gratuitous ARP (GARP) announcements that multiple Aria Logs nodes send for the shared VIP. As a result, ACI's default security feature interpreted this standard cluster behavior as problematic "rogue IP" movement. The root cause appears to be a configuration mismatch between the Cisco ACI network and the expected behavior of the active-active Aria Logs cluster.
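
The general idea behind this kind of flagging can be illustrated with a simplified Python sketch that counts how often one IP changes MAC address within a time window. This is not ACI's actual algorithm. The function names, the window length, and the move threshold are assumptions chosen only for illustration.

```python
def max_moves_in_window(observations, window_seconds):
    """Return the largest number of MAC changes seen inside any time window.

    `observations` is a list of (timestamp_seconds, mac) tuples for a single
    IP address, sorted by timestamp.
    """
    # Record the time of every observation whose MAC differs from the previous one.
    move_times = []
    previous_mac = None
    for timestamp, mac in observations:
        if previous_mac is not None and mac != previous_mac:
            move_times.append(timestamp)
        previous_mac = mac

    # Slide a window over the move times and keep the highest count.
    best = 0
    start = 0
    for end in range(len(move_times)):
        while move_times[end] - move_times[start] > window_seconds:
            start += 1
        best = max(best, end - start + 1)
    return best


def would_flag(observations, window_seconds=60, move_threshold=3):
    """Hypothetical rule: flag the IP when moves in a window reach the threshold."""
    return max_moves_in_window(observations, window_seconds) >= move_threshold


# The VIP is announced by three different nodes within 30 seconds.
vip_sightings = [(0, "node-a"), (10, "node-b"), (20, "node-a"), (30, "node-c"), (200, "node-a")]
flagged = would_flag(vip_sightings)
```

With several nodes announcing the same VIP, the move count climbs quickly even though each announcement is legitimate, which is why the fabric needs to be told about the VIP in advance.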

Resolution

The solution is to configure the Cisco ACI network to correctly identify and tolerate the Aria Logs VIP behavior.
Note: Disabling ARP on the Aria Logs nodes is not the correct or supported solution.
 
Procedure:
  1. Configure the VIP in ACI: Proactively define the Aria Logs VIP as an L4-L7 Virtual IP within the ACI tenant's Application Profile EPG.
  2. Create a MAC Exception List: Add the MAC addresses of all Aria Logs cluster nodes to the ACI System -> System Settings -> Endpoint Controls -> Rogue/COOP Exception List. This increases the network's tolerance for IP moves from these specific MACs.
  3. Enable GARP-based endpoint (EP) move detection on the ACI Bridge Domain.
  4. Order is critical: Apply these ACI configurations before the VIP is used and begins receiving production traffic. If the VIP goes live first, the IP may be flagged as rogue, and the configuration may not take effect until the rogue entry ages out.
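
When gathering the node MAC addresses for step 2, a small helper such as the following Python sketch can normalize and de-duplicate them before they are entered in the exception list. The colon-separated uppercase output format and the function name are assumptions. Confirm the format your ACI release expects before using the output.

```python
import re


def normalize_macs(raw_macs):
    """Return unique MAC addresses as colon-separated uppercase strings.

    Accepts common notations such as 00:50:56:aa:00:01, 00-50-56-aa-00-01,
    and 0050.56aa.0001. Raises ValueError for anything else.
    """
    normalized = []
    for raw in raw_macs:
        # Drop the usual separators, then require exactly 12 hex digits.
        cleaned = re.sub(r"[.:\-]", "", raw.strip())
        if not re.fullmatch(r"[0-9A-Fa-f]{12}", cleaned):
            raise ValueError(f"not a valid MAC address: {raw!r}")
        mac = ":".join(cleaned[i:i + 2] for i in range(0, 12, 2)).upper()
        if mac not in normalized:
            normalized.append(mac)
    return normalized


# Made-up addresses in three notations; the first two are the same MAC.
node_macs = normalize_macs(["00:50:56:aa:00:01", "0050.56AA.0001", "00-50-56-aa-00-02"])
```

Validating the list up front helps avoid a mistyped entry that would leave one node outside the exception list.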