Compute Manager shows "Invalid credentials" and remains DOWN after a CM edit

Article ID: 435118

Updated On:

Products

VMware NSX

Issue/Introduction

After a Compute Manager (CM) edit operation—typically triggered by VCF SDDC Manager updating vCenter credentials or configuration—the Compute Manager in the NSX UI (System > Fabric > Compute Managers) displays the following symptoms:

Connection Status: DOWN
Error Message: Invalid credentials provided or extension not valid for compute manager <vc-hostname> with id <cm-id>
The issue persists indefinitely without auto-recovery (observed for 37+ hours).
No configuration change was made to vCenter credentials after the edit.
Other Compute Managers in the same NSX cluster may remain unaffected.

Log Indicators

In /var/log/cm-inventory/cm-inventory.log on the owner NSX node, the following patterns appear:

"Error occurred during login by certificate as extension for cm <cm-id>"
"VC Version is below 9, hence skipping service account consistency checker" (Note: This message is misleading. The login failure prevents the vCenter version from being determined, so the checker is skipped even though the vCenter version is not actually below 9.)

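The two log indicators above can be checked together with a small helper. This is a minimal sketch, assuming the messages appear in the log exactly as quoted in this article; check_cm_login_failure is a hypothetical function name, not an NSX command.

```shell
#!/bin/sh
# Sketch: count the two log indicators for one Compute Manager.
# check_cm_login_failure is a hypothetical helper, not part of NSX.
# Usage: check_cm_login_failure <CM_ID> [log_file]
check_cm_login_failure() {
    cm_id="$1"
    log_file="${2:-/var/log/cm-inventory/cm-inventory.log}"

    if [ ! -r "$log_file" ]; then
        echo "cannot read ${log_file}" >&2
        return 2
    fi

    # -F matches fixed strings; -c prints the number of matching lines.
    # grep exits non-zero on zero matches, so "|| true" keeps the count usable.
    login_errors=$(grep -cF "Error occurred during login by certificate as extension for cm ${cm_id}" "$log_file" || true)
    skipped_checks=$(grep -cF "VC Version is below 9, hence skipping service account consistency checker" "$log_file" || true)

    echo "login_errors=${login_errors} skipped_checks=${skipped_checks}"

    # Succeed only when the certificate login failure is present for this CM.
    [ "$login_errors" -gt 0 ]
}
```

Run on the owner node as check_cm_login_failure <CM_ID>. A non-zero login_errors count for the affected CM is consistent with the symptoms described in this article.
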
Environment

VMware NSX 9.0.X

Cause

This issue is caused by a race condition and a secondary version-detection bug during the NSX extension re-registration process:

Certificate Mismatch: When a CM edit is processed on a non-owner NSX node, that node generates a new extension certificate in vCenter and updates the shared database (CorfuDB). However, the owner node, which is responsible for the live connection, continues to use a stale in-memory keystore that still holds the old certificate.

Signal Overwrite: A race condition causes the "restart" signal sent to the owner node to be overwritten by the owner's own status update before the restart is triggered. Consequently, the owner node never refreshes its connection with the new certificate.

Recovery Failure: When the login fails, the ServiceAccountConsistencyChecker incorrectly assumes the vCenter version is below 9.0, causing it to skip the automatic service account recovery path.
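
The signal overwrite described above is a lost-update pattern. The following toy model is not NSX code: a temporary file stands in for the shared record, and the field names status and restart_requested are hypothetical. It only illustrates how a write based on a stale read can erase a signal written in between.

```shell
#!/bin/sh
# Toy model of a lost update. The temp file stands in for a shared record;
# the field names are illustrative assumptions, not NSX internals.
simulate_signal_overwrite() {
    record=$(mktemp)
    echo "status=UP restart_requested=false" > "$record"

    # 1. The owner reads the record before preparing its status update.
    owner_view=$(cat "$record")

    # 2. A non-owner node writes the restart signal.
    echo "status=UP restart_requested=true" > "$record"

    # 3. The owner writes its status from the stale read, erasing the signal.
    echo "$owner_view" | sed 's/status=UP/status=DOWN/' > "$record"

    cat "$record"
    rm -f "$record"
}

simulate_signal_overwrite   # prints: status=DOWN restart_requested=false
```

In this model the restart request never reaches the owner, which mirrors why the owner keeps its old connection until the service is restarted manually.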

Resolution

Restart the cm-inventory service on the OWNER NSX node only.

This is a deterministic fix. No retry is needed.

Identify the owner node
Restart cm-inventory on the owner node

ssh root@<OWNER_NSX_NODE_IP>
/etc/init.d/cm-inventory restart

Note: The restart temporarily pauses inventory collection from vCenter (approximately 2 to 3 minutes) while the service restarts.

Wait approximately 2 to 3 minutes for recovery

The cm-inventory service starts up, detects no live connection for the CM, and automatically establishes a fresh connection using the correct certificate from the database.

Verify recovery

In NSX UI: System → Fabric → Compute Managers
The affected CM should show Connection Status: UP with no errors.

To confirm from logs on the owner node:
grep "Plugin Info.*<CM_ID>" /var/log/cm-inventory/cm-inventory.log | tail -5

Expected output:
Plugin Info: ... pluginStatus='STARTED', cmConnectionStatus='UP', errors='[]'
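
The wait and the log check can be combined into a polling loop. This is a sketch under the assumption that the most recent "Plugin Info" line for the CM reflects its current state and contains cmConnectionStatus='UP' once recovery completes, as in the expected output above. wait_for_cm_up and its timeout and interval parameters are hypothetical.

```shell
#!/bin/sh
# Sketch: poll the log until the latest Plugin Info line for the CM reports UP.
# wait_for_cm_up is a hypothetical helper, not part of NSX.
# Usage: wait_for_cm_up <CM_ID> [log_file] [timeout_seconds] [interval_seconds]
# interval_seconds must be greater than zero.
wait_for_cm_up() {
    cm_id="$1"
    log_file="${2:-/var/log/cm-inventory/cm-inventory.log}"
    timeout_s="${3:-300}"
    interval_s="${4:-10}"
    elapsed=0

    while [ "$elapsed" -le "$timeout_s" ]; do
        # Same grep as the manual check, keeping only the newest line.
        last=$(grep "Plugin Info.*${cm_id}" "$log_file" 2>/dev/null | tail -n 1)
        case "$last" in
            *"cmConnectionStatus='UP'"*)
                echo "UP after ${elapsed}s"
                return 0
                ;;
        esac
        sleep "$interval_s"
        elapsed=$((elapsed + interval_s))
    done

    echo "timed out after ${timeout_s}s" >&2
    return 1
}
```

With the defaults, wait_for_cm_up <CM_ID> checks every 10 seconds for up to 5 minutes, which leaves margin beyond the 2 to 3 minutes quoted above.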
