NSX host preparation failing - Timed out waiting for host to join NSX Manager.



Article ID: 388847



Products

VMware NSX

Issue/Introduction

Host preparation stalls at 48% and, after a while, fails with: Failed to install software on host. Operation timed out. The logs show that an attempt to lock the syslog file in exclusive mode fails because another host already holds an exclusive lock on that file.

Cause

If logdir_unique=true is set in /etc/vmsyslog.conf, vmsyslogd tries to create a directory named after the host's FQDN inside the syslog directory. In this case the attempt is denied because the syslog volume is already locked by another host.

/etc/vmsyslog.conf
-------------------

[vmsyslog]
logdir_unique=true
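
As a quick way to confirm the setting on a copy of the file, the check can be sketched in Python. This is a minimal sketch, assuming the file content is plain INI text like the fragment above; the function name logdir_unique_enabled and the sample string are hypothetical and not part of any VMware tooling.

```python
import configparser


def logdir_unique_enabled(conf_text: str) -> bool:
    """Return True if logdir_unique is set to a true value under [vmsyslog]."""
    # interpolation=None so that any '%' in other values is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    # fallback=False covers a missing section or a missing option.
    return parser.getboolean("vmsyslog", "logdir_unique", fallback=False)


# Sample content copied from the fragment shown in this article.
sample = "[vmsyslog]\nlogdir_unique=true\n"
print(logdir_unique_enabled(sample))  # prints True
```

In practice the text would be read from a copy of /etc/vmsyslog.conf rather than from an inline string.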

The vmkernel logs show that an attempt to lock the syslog file in EXCLUSIVE mode failed because the file was already locked in EXCLUSIVE mode by a process or application on another ESXi host.

The process or application holding the file open in EXCLUSIVE mode was running on an ESXi host that has a NIC with MAC address 00:25:b5:11:01:0e:

...
2025-01-13T10:01:20.904Z cpu46:2097739)DLX: 4579: vol 'XXXXXXXX', lock at 91619328: [Req mode 1] Checking liveness:
2025-01-13T10:01:20.904Z cpu46:2097739)[type 10c00001 offset 91619328 v 204556, hb offset 3768320
gen 29, mode 1, owner XXXXXXXX mtime 418851
num 0 gblnum 0 gblgen 0 gblbrk 0]
2025-01-13T10:01:24.514Z cpu46:2097739)WARNING: DLX: 4524: Sleep and recheck lock completes with POLL_NO_LOCK_CHANGE_WAIT on XXXXXXXX, no lock change has happened, status: Wait interrupted
2025-01-13T10:01:25.302Z cpu46:2097739)DLX: 4579: vol 'XXXXXXXX', lock at 91619328: [Req mode 1] Checking liveness:
2025-01-13T10:01:25.302Z cpu46:2097739)[type 10c00001 offset 91619328 v 204556, hb offset 3768320
gen 29, mode 1, owner XXXXXXXX mtime 418851
num 0 gblnum 0 gblgen 0 gblbrk 0]
2025-01-13T10:01:28.341Z cpu46:2097739)WARNING: DLX: 4524: Sleep and recheck lock completes with POLL_NO_LOCK_CHANGE_WAIT on XXXXXXXX, no lock change has happened, status: Wait interrupted
2025-01-13T10:01:28.795Z cpu46:2097739)DLX: 4579: vol 'XXXXXXXX', lock at 91619328: [Req mode 1] Checking liveness:
2025-01-13T10:01:28.795Z cpu46:2097739)[type 10c00001 offset 91619328 v 204556, hb offset 3768320
gen 29, mode 1, owner XXXXXXXX mtime 418851
num 0 gblnum 0 gblgen 0 gblbrk 0]
2025-01-13T10:01:32.821Z cpu48:2097739)DLX: 5244: vol 'XXXXXXXX', lock at 91619328: Lock type: 10C00001. [Req mode: 1] Not free, pollStat 3:
2025-01-13T10:01:32.821Z cpu48:2097739)[type 10c00001 offset 91619328 v 204556, hb offset 3768320
gen 29, mode 1, owner XXXXXXXX mtime 418851
num 0 gblnum 0 gblgen 0 gblbrk 0] alloc owner 0
2025-01-13T10:01:32.821Z cpu48:2097739)DLX: 2650: vol 'XXXXXXXX', lock at 91619328: Lock type: 10C00001. Exclusive Lock(s) held on a file on volume XXXXXXXX. numHolders:0 gblNumHolders:0, volume state 10, APD s$
2025-01-13T10:01:32.821Z cpu48:2097739)[type 10c00001 offset 91619328 v 204556, hb offset 3768320
gen 29, mode 1, owner XXXXXXXX mtime 418851
num 0 gblnum 0 gblgen 0 gblbrk 0] alloc owner 0
2025-01-13T10:01:32.821Z cpu48:2097739)DLX: 2651: vol 'XXXXXXXX', lock at 91619328: Lock type: 10C00001. owner(s) MAC: XXXXXXXX:
2025-01-13T10:01:32.821Z cpu48:2097739)[type 10c00001 offset 91619328 v 204556, hb offset 3768320
gen 29, mode 1, owner XXXXXXXX mtime 418851
num 0 gblnum 0 gblgen 0 gblbrk 0] alloc owner 0
2025-01-13T10:01:32.821Z cpu48:2097739)Fil3: 5033: Lock failed on file: syslog on vol 'XXXXXXXX' with FD: <FD c36 r0>
....
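
To pick the relevant entries out of a longer vmkernel log, the two key messages in the excerpt above can be matched with regular expressions. The following is a minimal sketch, assuming the log lines follow the format shown in this article; the function name summarize_lock_failures and both patterns are illustrative assumptions, and the message format may differ between ESXi versions.

```python
import re

# Patterns derived only from the log excerpt in this article (assumed format).
LOCK_FAILED = re.compile(r"Lock failed on file: (\S+) on vol '([^']+)'")
OWNER_MAC = re.compile(r"owner\(s\) MAC: (\S+)")


def summarize_lock_failures(lines):
    """Return ([(file, volume), ...], [owner MAC, ...]) found in the lines."""
    failures = []
    owners = set()
    for line in lines:
        match = LOCK_FAILED.search(line)
        if match:
            failures.append((match.group(1), match.group(2)))
        match = OWNER_MAC.search(line)
        if match:
            # The log prints a trailing ':' after the MAC; strip it.
            owners.add(match.group(1).rstrip(":"))
    return failures, sorted(owners)


# Two lines taken from the excerpt above (values are redacted there).
sample = [
    "2025-01-13T10:01:32.821Z cpu48:2097739)DLX: 2651: vol 'XXXXXXXX', "
    "lock at 91619328: Lock type: 10C00001. owner(s) MAC: XXXXXXXX:",
    "2025-01-13T10:01:32.821Z cpu48:2097739)Fil3: 5033: Lock failed on file: "
    "syslog on vol 'XXXXXXXX' with FD: <FD c36 r0>",
]
failures, owners = summarize_lock_failures(sample)
print(failures)  # [('syslog', 'XXXXXXXX')]
print(owners)    # ['XXXXXXXX']
```

The reported owner MAC address identifies the ESXi host that holds the lock, which is the host to look at when following the resolution.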

Resolution

The exclusive lock that the other host holds on the syslog volume has to be cleared. For details on how to clear the lock, refer to this article: https://knowledge.broadcom.com/external/article/323140/device-or-resource-busy-file-is-locked-e.html