NSX-T Upgrade via SDDC Manager cannot be started - Error: "NSX-T Manager is in error state due to audit failure. Please run upgrade pre-checks before proceeding with upgrade."

Article ID: 382103

Products

VMware SDDC Manager

Issue/Introduction

  • When applying an NSX-T upgrade via the SDDC Manager UI, the wizard cannot progress past the NSX Edge Cluster section and reports the error:
    NSX-T Manager is in error state due to audit failure. Please run upgrade pre-checks before proceeding with upgrade.

     

  • Error from SDDC Manager UI:


  • Error in /var/log/vmware/vcf/lcm/lcm-debug.log
    DEBUG [vcf_lcm,66f12f9ce151188fb26054d17a3f827c,de45,auditId=1c1306b3-a288-468b-94e7-489ccbed20b5,resourceType=NSX_T_MANAGER,resourceId='nsxt-example.com',name='nsxt-example.com'] [c.v.e.s.l.p.i.n.NsxtInventoryLoader,vac-scheduler-1] Loading NSX Transport Nodes Details for 'nsxt-example.com'
    
    ERROR [vcf_lcm,66f12f9ce151188fb26054d17a3f827c,de45,auditId=1c1306b3-a288-468b-94e7-489ccbed20b5,resourceType=NSX_T_MANAGER,resourceId='nsxt-example.com',name='nsxt-example.com'] [c.v.v.c.n.s.c.c.ComplexHelpers,vac-scheduler-1] Exception occurred during NSX API invocation
    java.util.concurrent.ExecutionException: com.vmware.vapi.std.errors.InternalServerError: InternalServerError (com.vmware.vapi.std.errors.internal_server_error) => {
        messages = [],
        data = struct => {error_message=General error has occurred., details=java.lang.NullPointerException, error_code=100, module_name=common-services},
        errorType = INTERNAL_SERVER_ERROR
    }

     

    ERROR [vcf_lcm,880e16bf7d1c492c,fab4] [c.v.e.s.l.s.impl.UpgradeServiceImpl,http-nio-127.0.0.1-7400-exec-1] NSX Audit on the resource nsxt-example.com is in failed state, setting resourceHealth as ERROR
    [vcf_lcm,880e16bf7d1c492c,fab4] [c.v.e.s.l.s.impl.UpgradeServiceImpl,http-nio-127.0.0.1-7400-exec-1] NSX Audit on the resource nsxt-example.com is in failed state, setting upgradeStatus as ERROR
    [vcf_lcm,880e16bf7d1c492c,fab4] [c.v.v.l.r.a.c.v.u.UpgradableController,http-nio-127.0.0.1-7400-exec-1] Upgrade Objects {"bundleId":"1c145a20-1f19-4393-bec3-b424d12245b5","domainId":"yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyy342c","nsxtManagerCluster":{"id":"nsxt-example.com","name":"nsxt-example.com","resourceHealth":"ERROR","version":"4.2.0.0.0-24105817","upgradeStatus":"ERROR"}}

 

  • In the SDDC Manager database, the nsxt table entry for the cluster shows an audit error:
    "auditError": {
      "errorCode": "Failed to load NSX Cluster from the Inventory",
      "errorDetails": "error_message : Failed to load NSX Cluster from the Inventory, httpStatus : , error_code : 0"
    }


    Command to check the NSX-T details in the SDDC Manager platform DB:

    /usr/pgsql/13/bin/psql -h localhost -U postgres -c "\x" -c "select id,status,version,cluster_fqdn,configuration from nsxt where cluster_fqdn='nsxt-example.com'"

    Sample output

    id            | xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx5e4c
    status        | ACTIVE
    version       | 4.2.0.0.0-24105817
    cluster_fqdn  | nsxt-example.com
    configuration | {                                                                                                                 +
                  |   "domainIds": [                                                                                                  +
                  |     "yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyy342c"                                                                        +
                  |   ],                                                                                                              +
                  |   "vcfId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx5e4c",                                                                +
                  |   "nsxtComputeManagers": {},                                                                                      +
                  |   "nsxtHostClusters": {},                                                                                         +
                  |   "nsxtEdgeClusters": {},                                                                                         +
                  |   "managerIpsFqdnMap": {                                                                                          +
                  |     "<nsxt manager ip>": "nsxt1.example.com",                                                                +
                  |     "<nsxt manager ip>": "nsxt2.example.com",                                                                +
                  |     "<nsxt manager ip>": "nsxt3.example.com"                                                                 +
                  |   },                                                                                                              +
                  |   "resourceMapper": {},                                                                                           +
                  |   "auditSucceeded": false,                                                                                        +
                  |   "auditError": {                                                                                                 +
                  |     "errorCode": "Failed to load NSX Cluster from the Inventory",                                                 +
                  |     "errorDetails": "error_message : Failed to load NSX Cluster from the Inventory, httpStatus : , error_code : 0"+
                  |   },                                                                                                              +
                  |   "resourceId": "nsxt-example.com",                                                                     +
                  |   "resourceName": "nsxt-example.com",                                                                   +
                  |   "version": {                                                                                                    +
                  |     "version": "4.2.0.0.0-24105817"                                                                               +
                  |   },                                                                                                              +
                  |   "upgradeAvailable": false                                                                                       +
                  | }
    

     

  • The audit user on the NSX-T Manager shows "Password must be changed":
    root@nsx01:~# chage -l audit
    Last password change                                    : Password must be changed
    Password expires                                        : Password must be changed
    Password inactive                                       : Password must be changed
    Account expires                                         : never
    Minimum number of days between password change          : 0
    Maximum number of days between password change          : 9990
    Number of days of warning before password expires       : 7
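
The three checks above (log entry, database audit flag, audit user expiry) can be scripted for quicker triage. The following is a minimal sketch, not VMware tooling: the function names are hypothetical, and it assumes the log messages and the chage output look like the samples in this article.

```shell
#!/bin/sh
# Triage helpers. All function names are illustrative assumptions.

# Print lcm-debug.log lines where LCM marked the NSX audit as failed.
# $1: log path (defaults to the path quoted in this article).
nsx_audit_failures() {
    log_file="${1:-/var/log/vmware/vcf/lcm/lcm-debug.log}"
    grep -F "NSX Audit on the resource" "$log_file" | grep -F "is in failed state"
}

# Read the psql output on stdin and print the auditSucceeded value.
audit_succeeded_value() {
    sed -nE 's/.*"auditSucceeded":[[:space:]]*(true|false).*/\1/p'
}

# Read `chage -l audit` output on stdin. Exit status 0 means the
# "Last password change" field reports that the password must be changed.
audit_password_expired() {
    grep -Eq '^[[:space:]]*Last password change[[:space:]]*:[[:space:]]*[Pp]assword must be changed'
}

# Example usage:
#   nsx_audit_failures                      (on SDDC Manager)
#   <psql command from above> | audit_succeeded_value
#   chage -l audit | audit_password_expired && echo "audit password expired"
```

The first two helpers are meant for the SDDC Manager appliance and the third for the NSX-T Manager root shell. Each only reads output and changes nothing.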

Environment

VMware Cloud Foundation 5.x

Cause

The password of the audit user on the NSX-T Manager has expired.

Resolution

  1. SSH to the NSX-T Manager VIP address as the admin user
  2. Switch to root
    st e
  3. Check the audit user's password expiry details; the output should show "Password must be changed"
    chage -l audit

    Sample output
    root@nsx01:~# chage -l audit
    Last password change                                    : Password must be changed
    Password expires                                        : Password must be changed
    Password inactive                                       : Password must be changed
    Account expires                                         : never
    Minimum number of days between password change          : 0
    Maximum number of days between password change          : 9990
    Number of days of warning before password expires       : 7


  4. Stop the nsx-mp-api-server service
    /etc/init.d/nsx-mp-api-server stop
  5. Create the reset_cluster_credentials file
    touch /var/vmware/nsx/reset_cluster_credentials
  6. Start the nsx-mp-api-server service
    /etc/init.d/nsx-mp-api-server start
  7. If the audit user still shows as expired, reboot the NSX-T Manager appliances and recheck the audit user's expiry details as described in step 3
  8. Retry the NSX-T Manager upgrade from the SDDC Manager UI
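
Steps 4 to 6 can be wrapped in a small script so the stop, touch, and start commands always run in that order. This is a hedged sketch: the `run` and `reset_cluster_credentials` function names and the `DRY_RUN` variable are assumptions introduced for illustration, and only the three commands themselves come from this article. By default it prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch of steps 4-6, intended for the root shell of an NSX-T Manager.
# DRY_RUN and the function names are illustrative, not VMware tooling.

# Print the command when DRY_RUN=1 (the default); otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Stop the API server, create the reset file, then start the server.
# The && chain aborts the sequence as soon as one command fails.
reset_cluster_credentials() {
    run /etc/init.d/nsx-mp-api-server stop &&
    run touch /var/vmware/nsx/reset_cluster_credentials &&
    run /etc/init.d/nsx-mp-api-server start
}
```

Calling `reset_cluster_credentials` as is lists the three commands. Setting `DRY_RUN=0` first makes it execute them, after which the `chage -l audit` check from step 3 can be repeated.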