NSX-T GUI for appliances may show incorrect information regarding RAM utilization

Article ID: 322411

Products

VMware NSX Networking

Issue/Introduction

Symptoms:
  • You are running NSX-T 2.4.x.
  • You recently changed the NSX-T Manager resources from medium to large appliance values, e.g. from 24GB RAM and 6 CPU to 48GB RAM and 12 CPU.
  • In the appliances section of the NSX-T Manager GUI (System - Overview), the values displayed for system RAM are incorrect:
[Screenshot: RAM incorrect values]
  • In the screenshot, the first and second NSX-T Manager nodes still show the old value of 24GB, and the utilization percentage is calculated against that value.
  • In the log for the management messaging service on the impacted NSX-T Manager node (the RabbitMQ log, see the tail command at the end of this article), error messages like the following appear:
=ERROR REPORT==== 19-Apr-2021::17:56:49 ===
Error on AMQP connection <0.25206.1> (192.168.120.1:43504 -> 192.168.120.1:5671, state: starting):
PLAIN login refused: user 'cvn-mp-mpa-7552d960-a9e3-46c1-87ac-4be07a855b34' - invalid credentials
  • On the impacted NSX-T Manager node, the log /var/log/syslog shows error messages like the following:
    • The following implies the credentials (RabbitMQ account) are incorrect:
<182>1 2021-04-20T08:25:36.855522Z nsx-ctl-1 NSX 3809 - [nsx@6876 comp="nsx-manager" subcomp="mpa" tid="3809" level="INFO"] Trying to connect to broker 192.168.120.1:5671
<179>1 2021-04-20T08:25:39.868351Z nsx-ctl-1 NSX 3809 - [nsx@6876 comp="nsx-manager" subcomp="mpa" tid="3809" level="ERROR" errorCode="MPA1012"] Unable to log on to broker using supplied credentials:192.168.120.1 port:5671 error:Logging in: Input/output error#012
<179>1 2021-04-20T08:25:39.868413Z nsx-ctl-1 NSX 3809 - [nsx@6876 comp="nsx-manager" subcomp="mpa" tid="3809" level="ERROR" errorCode="MPA1030"] Unable to close connection after cleaning amqp_login
<179>1 2021-04-20T08:25:39.868501Z nsx-ctl-1 NSX 3809 - [nsx@6876 comp="nsx-manager" subcomp="mpa" tid="3809" level="ERROR" errorCode="MPA1009"] Unable to make 1st connection to broker 192.168.120.1:5671. Inbound Connection returned -1012.
    • The following implies the thumbprint is incorrect:
<182>1 2021-04-22T11:19:40.767151Z nsx-ctl-3 NSX 29080 - [nsx@6876 comp="nsx-manager" subcomp="mpa" tid="29080" level="INFO"] Trying to connect to broker 127.0.0.1:5671
<179>1 2021-04-22T11:19:40.780078Z nsx-ctl-3 NSX 29080 - [nsx@6876 comp="nsx-manager" subcomp="mpa" tid="29080" level="ERROR" errorCode="MPA1011"] Unable to Connect to 127.0.0.1 Port 5671 Error Invalid certificate fingerprint
<179>1 2021-04-22T11:19:40.780149Z nsx-ctl-3 NSX 29080 - [nsx@6876 comp="nsx-manager" subcomp="mpa" tid="29080" level="ERROR" errorCode="MPA1009"] Unable to make 1st connection to broker 127.0.0.1:5671. Inbound Connection returned -1011.
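The two failure modes above can be told apart by their error codes. The following is a minimal sketch, not a VMware-provided tool: the function name classify_mpa_errors is hypothetical, and the mapping (MPA1012 for incorrect credentials, MPA1011 for an incorrect thumbprint) is taken only from the log samples in this article.

```shell
# classify_mpa_errors: read syslog lines on stdin and print which of the
# two failure modes from this article they contain.
# Assumption: MPA1012 = credential failure, MPA1011 = thumbprint failure,
# as seen in the log samples above.
classify_mpa_errors() {
  awk '
    /errorCode="MPA1012"/ { cred = 1 }
    /errorCode="MPA1011"/ { thumb = 1 }
    END {
      if (cred)  print "credentials"
      if (thumb) print "thumbprint"
      if (!cred && !thumb) print "none"
    }'
}
```

For example, on the impacted node: grep 'subcomp="mpa"' /var/log/syslog | classify_mpa_errors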


Environment

VMware NSX-T

Cause

This has been seen to happen when the credentials and/or thumbprint are out of sync between the management plane nodes. These values are stored in the mpaconfig.json file (/etc/vmware/nsx-mpa/mpaconfig.json) on the impacted NSX-T Manager node, for example:
    "SharedSecret": "DZTST4UweBbnq903NkRcSs9LvshH27omrsBoKlzLQT",
    "RmqClientType": "cvn-mp-mpa",
    "AccountName": "cvn-mp-mpa-7552d960-a9e3-46c1-87ac-4be07a855b34",
    "RmqBrokerCluster": [
        {
            "BrokerFqdn": "",
            "BrokerIpAddress": "192.168.120.1",
            "BrokerPort": "5671",
            "BrokerVirtualHost": "nsx",
            "BrokerSslCertThumbprint": "8129718A7E555A2717CF61E0637130475A7805F899766364EB5AEC8A020E4BFE",
            "BrokerIsMaster": "TRUE"

Resolution

This issue is resolved in NSX-T 2.5.x.

Workaround:
To resolve the first issue, where the credentials are incorrect, run the following commands.
Log in to the impacted NSX-T Manager node as root. The command below restores the MPA configuration file (mpaconfig.json) to default values:
sh /opt/vmware/nsx-mpa/mpaconfigrestore.sh

Then restart the cluster boot manager service. This causes the correct credentials to be pulled down afresh from Corfu:
/etc/init.d/nsx-cluster-boot-manager restart

After this step, issues may persist. This is usually because the thumbprint is incorrect, as in the log sample shown earlier:
<182>1 2021-04-22T11:19:40.767151Z nsx-ctl-3 NSX 29080 - [nsx@6876 comp="nsx-manager" subcomp="mpa" tid="29080" level="INFO"] Trying to connect to broker 127.0.0.1:5671
<179>1 2021-04-22T11:19:40.780078Z nsx-ctl-3 NSX 29080 - [nsx@6876 comp="nsx-manager" subcomp="mpa" tid="29080" level="ERROR" errorCode="MPA1011"] Unable to Connect to 127.0.0.1 Port 5671 Error Invalid certificate fingerprint
<179>1 2021-04-22T11:19:40.780149Z nsx-ctl-3 NSX 29080 - [nsx@6876 comp="nsx-manager" subcomp="mpa" tid="29080" level="ERROR" errorCode="MPA1009"] Unable to make 1st connection to broker 127.0.0.1:5671. Inbound Connection returned -1011.

To resolve this, log in as root on the impacted NSX-T Manager node and run the following command:
curl -u 'admin:default' -X GET -H 'X-NSX-Username:admin' -ik http://localhost:7440/nsxapi/api/v1/cluster/nodes > /tmp/cluster_node.out

Search the resulting file, /tmp/cluster_node.out, for the account name and thumbprint of the impacted node.
Note: The REST API returns data for all 3 manager nodes.

For the account name, look for this manager's IP address. Under it you should see a mgmt_cluster_listen_addr entry with port 0, followed by the certificate and then the account name:
 "mgmt_cluster_listen_addr" : {
        "ip_address" : "192.168.120.1",
        "port" : 0,
...cert here...
      "mpa_msg_client_info" : {
        "account_name" : "cvn-mp-mpa-64f03c82-ad1a-4c28-b9a3-8bfc105073b7"

In the example above, manager 192.168.120.1 has the account name cvn-mp-mpa-64f03c82-ad1a-4c28-b9a3-8bfc105073b7.
Take note of this account name for later.

Next, find the thumbprint for the impacted NSX-T Manager node. Look for the IP address of the impacted manager with port 5671 under it. This is followed by a PEM certificate, and the thumbprint appears below the certificate, as in this example:
"mgmt_plane_listen_addr" : {
        "ip_address" : "192.168.120.1",
        "port" : 5671,  
...certificate here...
        "certificate_sha256_thumbprint" : "6091c64202237523d38038a31d75b4c87d89b7878ff351020a0c7a898e78bf3d"

From the above sample, we need the thumbprint: 6091c64202237523d38038a31d75b4c87d89b7878ff351020a0c7a898e78bf3d
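Searching the file by hand can be error-prone because it contains entries for all three managers. The lookup described above can be sketched as a small awk filter. This is only an illustration under stated assumptions: node_values is a hypothetical helper name, and the filter assumes the pretty-printed, one-key-per-line layout and the key order (ip_address, port, certificate, thumbprint) shown in the excerpts above.

```shell
# node_values IP FILE: print the account name and the port-5671 thumbprint
# that follow the given manager IP in the saved REST output.
# Assumption: one JSON key per line, ordered as in the excerpts above.
node_values() {
  awk -v ip="$1" '
    # Extract the quoted value from a line of the form  "key" : "value"
    function val(s) { sub(/^[^:]*:[ \t]*"/, "", s); sub(/".*$/, "", s); return s }
    # Remember whether the most recent ip_address belongs to our manager
    /"ip_address"/ { match_ip = (index($0, "\"" ip "\"") > 0) }
    # Remember the most recent port number
    /"port"/ { p = $0; gsub(/[^0-9]/, "", p); port = p + 0 }
    /"account_name"/ && match_ip { print "account_name=" val($0) }
    /"certificate_sha256_thumbprint"/ && match_ip && port == 5671 {
      print "thumbprint=" val($0)
    }
  ' "$2"
}
```

For example: node_values 192.168.120.1 /tmp/cluster_node.out. Check the printed values against the file itself before using them.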

Now back up the current mpaconfig.json file:
cp /etc/vmware/nsx-mpa/mpaconfig.json /etc/vmware/nsx-mpa/mpaconfig.json.bak

Next, edit the mpaconfig.json file and correct the account name and thumbprint to match the ones collected in the steps above:
vi /etc/vmware/nsx-mpa/mpaconfig.json

When complete, save and quit:
:wq
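If a non-interactive edit is preferred over vi, the same change can be sketched with sed. This is a hedged illustration, not a documented NSX-T procedure: json_string and set_json_string are hypothetical helper names, and the sketch assumes each field sits on its own line as "Key": "value", as in the mpaconfig.json sample in this article. Keep the backup copy in place before trying it.

```shell
# json_string FILE KEY: print the value of a line of the form "KEY": "value".
json_string() {
  sed -n 's/.*"'"$2"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# set_json_string FILE KEY VALUE: replace the quoted value of KEY in FILE.
# Assumes VALUE contains no characters special to sed (/, &, \), which holds
# for the account names and hex thumbprints shown in this article.
set_json_string() {
  _tmp=$(mktemp) || return 1
  if sed 's/\("'"$2"'"[[:space:]]*:[[:space:]]*"\)[^"]*"/\1'"$3"'"/' "$1" > "$_tmp"; then
    cat "$_tmp" > "$1"   # overwrite in place so ownership and mode are kept
  fi
  rm -f "$_tmp"
}
```

For example: set_json_string /etc/vmware/nsx-mpa/mpaconfig.json AccountName cvn-mp-mpa-64f03c82-ad1a-4c28-b9a3-8bfc105073b7, then confirm with json_string /etc/vmware/nsx-mpa/mpaconfig.json AccountName. Note that the sample mpaconfig.json stores the thumbprint in upper case while the REST output shows it in lower case; the sketch writes the value exactly as given.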

Then restart the nsx-mpa service so the changes take effect:
/etc/init.d/nsx-mpa restart

To validate that the changes have worked, tail the RabbitMQ log and check that the errors have stopped:
tail -f /var/log/rabbitmq/rabbitmq\@localhost.log
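As an alternative to watching the log interactively, the most recent entries can be counted. A minimal sketch, assuming the "PLAIN login refused" message text from the RabbitMQ sample earlier in this article; recent_login_refusals is a hypothetical helper name and the default of 200 lines is an arbitrary choice.

```shell
# recent_login_refusals FILE [N]: count "PLAIN login refused" entries in the
# last N lines (default 200) of the given RabbitMQ log.
recent_login_refusals() {
  # grep -c exits non-zero when the count is 0, so mask that with "|| true"
  tail -n "${2:-200}" "$1" | grep -c 'PLAIN login refused' || true
}
```

For example: recent_login_refusals /var/log/rabbitmq/rabbitmq\@localhost.log. The count covers older entries too, so run it again after a few minutes and confirm that it is not increasing.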