NSX ALB Adapter Triggers ‘Adapter Instance is Not Receiving Data’ alert once every 24 hours



Article ID: 439796



Products

VCF Operations/Automation (formerly VMware Aria Suite)

Issue/Introduction

  • The alert recurs every 24 hours, offset by approximately 5 minutes, and resolves automatically during the next collection cycle.

  • In the Aria Operations logs, a 500 Internal Server Error is observed, along with the message: "Unable to find cluster."

    YYYY-MM-DDTHH:MM:SS,XXX+0000 ERROR [Collector worker thread X] (XXX) com.vmware.vcops.NSXAdvancedLBAdapter.getCluster - Unable to fetch cluster details
    com.vmware.vcops.exception.AviApiException: org.springframework.web.client.HttpServerErrorException$InternalServerError: 500 INTERNAL SERVER ERROR: "<h1>Server Error (500)</h1>"
            at com.vmware.vcops.client.AviApi.get(AviApi.java:209) ~[nsx-alb.jar:?]
            at com.vmware.vcops.client.AviApi.get(AviApi.java:125) ~[nsx-alb.jar:?]
            at com.vmware.vcops.client.AviClient.getCluster(AviClient.java:50) ~[nsx-alb.jar:?]
            ...
    YYYY-MM-DDTHH:MM:SS,XXX+0000 ERROR [Collector worker thread X] (XXX) com.vmware.vcops.NSXAdvancedLBAdapter.collect - Unable to find cluster

  • In the NSX Advanced Load Balancer (NSX ALB) portal.access.log, requests to /api/cluster normally succeed with HTTP 200. Intermittently, HTTP 500 and 401 responses appear before the requests return to HTTP 200.

    XX.XX.XX.XX [cache:-] 127.0.0.1:6000 [-] - T-ID=XXXXXXXXXXXXXXXXXXXXXXXX - [DD/MMM/YYYY:HH:MM:SS +0000] [-] [-] "GET /api//cluster/ HTTP/1.1" 200 ....
    XX.XX.XX.XX [cache:-] 127.0.0.1:6000 [-] - T-ID=XXXXXXXXXXXXXXXXXXXXXXXX - [DD/MMM/YYYY:HH:MM:SS +0000] [-] [-] "GET /api//cluster/ HTTP/1.1" 200 ....
    XX.XX.XX.XX [cache:-] 127.0.0.1:6000 [-] - T-ID=XXXXXXXXXXXXXXXXXXXXXXXX - [DD/MMM/YYYY:HH:MM:SS +0000] [-] [-] "GET /api//cluster HTTP/1.1" 500 ....
    XX.XX.XX.XX [cache:-] 127.0.0.1:6000 [-] - T-ID=XXXXXXXXXXXXXXXXXXXXXXXX - [DD/MMM/YYYY:HH:MM:SS +0000] [-] [-] "GET /api//cluster HTTP/1.1" 401 ....
    XX.XX.XX.XX [cache:-] 127.0.0.1:6000 [-] - T-ID=XXXXXXXXXXXXXXXXXXXXXXXX - [DD/MMM/YYYY:HH:MM:SS +0000] [-] [-] "GET /api//cluster HTTP/1.1" 200 ....
    XX.XX.XX.XX [cache:-] 127.0.0.1:6000 [-] - T-ID=XXXXXXXXXXXXXXXXXXXXXXXX - [DD/MMM/YYYY:HH:MM:SS +0000] [-] [-] "GET /api//cluster HTTP/1.1" 200 ....
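
The pattern in portal.access.log can be summarized with a short script. The following is a minimal Python sketch, not a supported tool: it assumes the line layout shown in the excerpt above, and the function names (is_cluster_request, tally_cluster_statuses) are illustrative. It counts the HTTP status codes returned for GET requests to the cluster endpoint, treating /api//cluster, /api/cluster and /api//cluster/ as the same path.

```python
import re
from collections import Counter

# Matches the quoted request and the status code that follows it, for example:
#   "GET /api//cluster HTTP/1.1" 401
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})'
)


def is_cluster_request(path):
    """Return True for /api/cluster, ignoring doubled and trailing slashes."""
    normalized = re.sub(r"/+", "/", path).rstrip("/")
    return normalized == "/api/cluster"


def tally_cluster_statuses(lines):
    """Count HTTP status codes for GET requests to the cluster endpoint."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and match["method"] == "GET" and is_cluster_request(match["path"]):
            counts[match["status"]] += 1
    return counts


if __name__ == "__main__":
    # Placeholder lines in the same shape as the excerpt above.
    prefix = "XX.XX.XX.XX [cache:-] 127.0.0.1:6000 [-] - T-ID=X - [DD/MMM/YYYY:HH:MM:SS +0000] [-] [-] "
    sample = [
        prefix + '"GET /api//cluster/ HTTP/1.1" 200 ....',
        prefix + '"GET /api//cluster HTTP/1.1" 500 ....',
        prefix + '"GET /api//cluster HTTP/1.1" 401 ....',
        prefix + '"GET /api//cluster HTTP/1.1" 200 ....',
    ]
    for status, count in sorted(tally_cluster_statuses(sample).items()):
        print(status, count)
```

To run it against a real log, pass an open file object instead of the sample list, since the function only iterates over lines.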

Environment

Aria Operations 8.18.x
NSX Advanced Load Balancer Adapter 1.3
NSX Advanced Load Balancer 

Cause

By default, the Avi API is configured to forcibly close all API sessions every 24 hours through the following avi_config setting:

"api_force_timeout": 24,

The NSX ALB API portal logs show that, once the API session is terminated, Aria Operations continues sending API requests to NSX ALB. These requests, which span multiple API endpoints, then fail with HTTP 401 "Unauthorized" errors.

The timestamps in the NSX ALB portal logs correlate with the errors observed in Aria Operations.

XX.XX.XX.XX [cache:-] 127.0.0.1:6000 [-] - T-ID=XXXXXXXXXXXXXXXXXXXXXXXX - [30/MMM/YYYY:HH:MM:SS +0000] [-] [-] "GET /api//cluster HTTP/1.1" 401 ....
XX.XX.XX.XX [cache:-] 127.0.0.1:6000 [-] - T-ID=XXXXXXXXXXXXXXXXXXXXXXXX - [29/MMM/YYYY:HH:MM:SS +0000] [-] [-] "GET /api//cluster HTTP/1.1" 401 ....
XX.XX.XX.XX [cache:-] 127.0.0.1:6000 [-] - T-ID=XXXXXXXXXXXXXXXXXXXXXXXX - [28/MMM/YYYY:HH:MM:SS +0000] [-] [-] "GET /api//cluster HTTP/1.1" 401 ....
XX.XX.XX.XX [cache:-] 127.0.0.1:6000 [-] - T-ID=XXXXXXXXXXXXXXXXXXXXXXXX - [27/MMM/YYYY:HH:MM:SS +0000] [-] [-] "GET /api//cluster HTTP/1.1" 401 ....
XX.XX.XX.XX [cache:-] 127.0.0.1:6000 [-] - T-ID=XXXXXXXXXXXXXXXXXXXXXXXX - [26/MMM/YYYY:HH:MM:SS +0000] [-] [-] "GET /api//cluster HTTP/1.1" 401 ....
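
One way to confirm the daily cadence is to extract the timestamps of the 401 responses and check that they fall roughly 24 hours apart. The following is a minimal Python sketch under stated assumptions: the bracketed timestamp layout is taken from the excerpt above, while the function names, the 1-hour burst window and the 10-minute tolerance are illustrative choices, not values from the product. Because several endpoints can fail at once, 401 responses that arrive close together are collapsed into a single event before the gaps are measured.

```python
import re
from datetime import datetime, timedelta

# Access-log timestamp, for example [30/Jan/2025:10:20:00 +0000].
# A missing space before the bracket does not affect the match.
TIMESTAMP_RE = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")
# Status 401 immediately after the closing quote of the request.
STATUS_401_RE = re.compile(r'" 401(\s|$)')


def unauthorized_times(lines):
    """Return the sorted timestamps of all lines that ended in HTTP 401."""
    times = []
    for line in lines:
        stamp = TIMESTAMP_RE.search(line)
        if stamp and STATUS_401_RE.search(line):
            # %b expects English month abbreviations in the default C locale.
            times.append(datetime.strptime(stamp.group(1), "%d/%b/%Y:%H:%M:%S %z"))
    return sorted(times)


def burst_starts(times, burst_window=timedelta(hours=1)):
    """Collapse 401s that occur close together into one event per burst."""
    starts = []
    for moment in sorted(times):
        if not starts or moment - starts[-1] > burst_window:
            starts.append(moment)
    return starts


def roughly_daily(times, tolerance=timedelta(minutes=10)):
    """True if every gap between consecutive events is within tolerance of 24 hours."""
    day = timedelta(hours=24)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return bool(gaps) and all(abs(gap - day) <= tolerance for gap in gaps)
```

A typical call chain is roughly_daily(burst_starts(unauthorized_times(log_lines))). A result of True is consistent with sessions being closed on a 24-hour schedule, but it does not by itself prove the cause.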

Resolution

VMware Engineering is aware of this issue and is currently working on a solution.
There is no workaround.

Additional Information

NSX ALB adapter triggers the 'Adapter Instance is Not Receiving Data' alert approximately once every 24 hours