NSX Edge fails to connect to the broker after upgrade

Article ID: 321102

Products

VMware NSX

Issue/Introduction

Symptoms:

  • NSX Edge fails to connect to the broker after upgrade.
  • In the /var/log/rabbitmq/[email protected] file, when the NSX Edge tries to connect to the broker, you see entries similar to:

    =ERROR REPORT==== 16-Nov-2017::02:42:11 ===
    Error on AMQP connection <0.24620.25> (50.6.0.39:16053 -> 10.114.12.246:5671, user: 'vse_50394af8-####-####-####-########6fe', state: opening):
    access to vhost 'vshield' refused for user 'vse_50394af8-####-####-####-########6fe'

     
  • In the VSFWD logs on the ESXi host, you see entries similar to:

    2017-11-16T13:18:02Z vsfwd: [INFO] Connected to 23DB6AF8:36985.
    2017-11-16T13:18:02Z vsfwd: [INFO] Edge VMCI: 36 <--> Broker NET: 38:Port 29477
    2017-11-16T13:18:02Z vsfwd: [ERROR] Queuing as Unable to Write 197 Bytes for Edge Client Type VMCI fd 38 error ret -1 errno 107
    2017-11-16T13:18:02Z vsfwd: [INFO] Edge (VMCI:NET) Socket 36 hung up


    Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
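
    To see which Edge users the broker is refusing, the broker log can be filtered for the error shown above. The following is a minimal sketch and not part of the original procedure: the function name refused_vse_users is hypothetical, and it assumes the log lines follow the same format as the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: reads RabbitMQ broker log text on stdin and prints
# each vse_ user that was refused access to the 'vshield' vhost, once.
# Assumes the message format matches the log excerpt in this article.
refused_vse_users() {
    grep "access to vhost 'vshield' refused for user" \
        | sed -n "s/.*refused for user '\(vse_[^']*\)'.*/\1/p" \
        | sort -u
}
```

    To use it, redirect the broker log file into the function on standard input. Each user it prints is a candidate for the permission check described in the Resolution section.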



Environment

VMware NSX for vSphere 6.3.x

Cause

This issue occurs because, during the upgrade of the NSX Manager (and hence of the RabbitMQ (RMQ) broker), the permissions for the NSX Edge users created on the broker are deleted.

Resolution

To resolve this issue, if some of the Edges cannot connect to the broker after the NSX Manager upgrade, do the following:

  1. If hosts are not able to connect, resync them through the REST API:

    POST https://NSXMGR_IP/api/2.0/nwfabric/configure?action=synchronize
     
    <nwFabricFeatureConfig>
        <featureId>com.vmware.vshield.vsm.messagingInfra</featureId>
        <resourceConfig>
            <resourceId>host-15</resourceId>
        </resourceConfig>
    </nwFabricFeatureConfig>

     
  2. If Edges are not able to connect to the broker, confirm that the Edge users exist on the broker:

    a. Run this command on the NSX Manager to list the Edge users on the broker:

         rabbitmqctl list_users | grep -i vse_

         vse_50394d8b-####-####-####-########068
         vse_5039f49a-####-####-####-########ae30


    b. Compare this list against all deployed Edges and verify that the broker has a user for each Edge. If a user is missing from the broker, redeploy that Edge.
  3. If all the users are present in step 2, check whether permissions are set on the broker for each Edge:

    a. Run this command on the NSX Manager to list the Edge users that have permissions set on the broker:

    rabbitmqctl list_permissions -p vshield | grep -i vse_

    vse_########-####-####-####-########9593 ^vse.*|^amq\.gen.*|^amq\.default|^vsm.* ^vse.*|^amq\.gen.*|^amq\.default ^vse.*|^amq\.gen.*|^amq\.default|^vsm.*
    vse_########-####-####-####-########9d58 ^vse.*|^amq\.gen.*|^amq\.default|^vsm.* ^vse.*|^amq\.gen.*|^amq\.default ^vse.*|^amq\.gen.*|^amq\.default|^vsm.*


    b. Compare this list with the user list from step 2 and check whether the permissions for any Edge user are missing. Run this command to create the permissions for an Edge user whose permissions are missing on the broker:

    curl -H "Content-Type: application/json" -k -u vsm:KwXoVpTX64 -X PUT -d '{"configure":"^vse.*|^amq\.gen.*|^amq\.default|^vsm.*", "write":"^vse.*|^amq\.gen.*|^amq\.default", "read":"^vse.*|^amq\.gen.*|^amq\.default|^vsm.*"}' https://localhost:15671/api/permissions/vshield/vse_502cdddf-####-####-####-########ec5

    Where:

    vsm:KwXoVpTX64 is the username:password of the admin account on the broker. You can obtain the password by running this command:

    /home/secureall/secureall/sem/WEB-INF/classes/GetVsmRabbitPassword.sh
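
The comparison in steps 2 and 3 can be tedious when many Edges are deployed. The following is a minimal sketch of one way to automate it, not part of the original procedure. It assumes the outputs of the two rabbitmqctl commands above have been saved to files, and that the user name is the first whitespace-separated field on each line. The function names users_missing_permissions and permission_fix_command are hypothetical. The second function only prints the curl command from step 3b with a PASSWORD placeholder, so nothing is changed on the broker until you review and run the command yourself.

```shell
#!/bin/sh
# Hypothetical helper: prints every vse_ user found in the user list
# ($1, saved output of `rabbitmqctl list_users`) that has no entry in the
# permission list ($2, saved output of `rabbitmqctl list_permissions -p vshield`).
users_missing_permissions() {
    users_file=$1
    perms_file=$2
    # The user name is assumed to be the first field in both outputs.
    awk '$1 ~ /^vse_/ { print $1 }' "$users_file" | sort -u |
    while read -r user; do
        # -x matches the whole line, -F treats the name as a literal string.
        if ! awk '{ print $1 }' "$perms_file" | grep -qxF "$user"; then
            echo "$user"
        fi
    done
}

# Hypothetical helper: prints (does not run) the curl command from step 3b
# for the Edge user given in $1. Replace PASSWORD before running the output.
permission_fix_command() {
    body='{"configure":"^vse.*|^amq\.gen.*|^amq\.default|^vsm.*", "write":"^vse.*|^amq\.gen.*|^amq\.default", "read":"^vse.*|^amq\.gen.*|^amq\.default|^vsm.*"}'
    printf '%s\n' "curl -H \"Content-Type: application/json\" -k -u vsm:PASSWORD -X PUT -d '$body' https://localhost:15671/api/permissions/vshield/$1"
}
```

For example, after saving the two command outputs to users.txt and perms.txt (file names chosen here for illustration), running users_missing_permissions users.txt perms.txt lists the Edge users that still need permissions, and permission_fix_command with one of those names prints the matching curl command for review.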