NSX host preparation failing - Timed out waiting for host to join NSX Manager.

Article ID: 388847

Products

VMware NSX

Issue/Introduction

  • Host preparation fails at 48% after a prolonged wait, with the message: Failed to install software on host. Operation timed out. The logs show that an attempt to lock the syslog file in exclusive mode fails because another host already holds an exclusive lock on that file.

  • The /etc/vmsyslog.conf file contains the following setting:

    [vmsyslog]
    logdir_unique=true

  • The /var/log/vmkernel.log suggests an attempt to lock the syslog file in exclusive mode failed because it was already exclusively locked by a process or application on another ESXi host.

  • The process or application that had this file open in exclusive mode was running on an ESXi host with a NIC whose MAC address is ##:##:##:11:01:0e:

    ...
    YYYY-MM-MMTHH:MM:SS.###Z cpu46:####739)DLX: 4579: vol 'XXXXXXXX', lock at 91619328: [Req mode 1] Checking liveness:
    YYYY-MM-MMTHH:MM:SS.###Z cpu46:####739)[type 10c00001 offset 91619328 v 204556, hb offset 3768320
    gen 29, mode 1, owner XXXXXXXX mtime 418851
    num 0 gblnum 0 gblgen 0 gblbrk 0]
    YYYY-MM-MMTHH:MM:SS.###Z cpu46:####739)WARNING: DLX: 4524: Sleep and recheck lock completes with POLL_NO_LOCK_CHANGE_WAIT on XXXXXXXX, no lock change has happened, status: Wait interrupted
    YYYY-MM-MMTHH:MM:SS.###Z cpu46:####739)DLX: 4579: vol 'XXXXXXXX', lock at 91619328: [Req mode 1] Checking liveness:
    YYYY-MM-MMTHH:MM:SS.###Z cpu46:####739)[type 10c00001 offset 91619328 v 204556, hb offset 3768320
    gen 29, mode 1, owner XXXXXXXX mtime 418851
    num 0 gblnum 0 gblgen 0 gblbrk 0]
    YYYY-MM-MMTHH:MM:SS.###Z cpu46:####739)WARNING: DLX: 4524: Sleep and recheck lock completes with POLL_NO_LOCK_CHANGE_WAIT on XXXXXXXX, no lock change has happened, status: Wait interrupted
    YYYY-MM-MMTHH:MM:SS.###Z cpu46:####739)DLX: 4579: vol 'XXXXXXXX', lock at 91619328: [Req mode 1] Checking liveness:
    YYYY-MM-MMTHH:MM:SS.###Z cpu46:####739)[type 10c00001 offset 91619328 v 204556, hb offset 3768320
    gen 29, mode 1, owner XXXXXXXX mtime 418851
    num 0 gblnum 0 gblgen 0 gblbrk 0]
    YYYY-MM-MMTHH:MM:SS.###Z cpu48:####739)DLX: 5244: vol 'XXXXXXXX', lock at 91619328: Lock type: 10C00001. [Req mode: 1] Not free, pollStat 3:
    YYYY-MM-MMTHH:MM:SS.###Z cpu48:####739)[type 10c00001 offset 91619328 v 204556, hb offset 3768320
    gen 29, mode 1, owner XXXXXXXX mtime 418851
    num 0 gblnum 0 gblgen 0 gblbrk 0] alloc owner 0
    YYYY-MM-MMTHH:MM:SS.###Z cpu48:####739)DLX: 2650: vol 'XXXXXXXX', lock at 91619328: Lock type: 10C00001. Exclusive Lock(s) held on a file on volume XXXXXXXX. numHolders:0 gblNumHolders:0, volume state 10, APD s$
    YYYY-MM-MMTHH:MM:SS.###Z cpu48:####739)[type 10c00001 offset 91619328 v 204556, hb offset 3768320
    gen 29, mode 1, owner XXXXXXXX mtime 418851
    num 0 gblnum 0 gblgen 0 gblbrk 0] alloc owner 0
    YYYY-MM-MMTHH:MM:SS.###Z cpu48:####739)DLX: 2651: vol 'XXXXXXXX', lock at 91619328: Lock type: 10C00001. owner(s) MAC: XXXXXXXX:
    YYYY-MM-MMTHH:MM:SS.###Z cpu48:####739)[type 10c00001 offset 91619328 v 204556, hb offset 3768320
    gen 29, mode 1, owner XXXXXXXX mtime 418851
    num 0 gblnum 0 gblgen 0 gblbrk 0] alloc owner 0
    YYYY-MM-MMTHH:MM:SS.###Z cpu48:####739)Fil3: 5033: Lock failed on file: syslog on vol 'XXXXXXXX' with FD: <FD c36 r0>
    ....
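When the vmkernel log is long, the two lines that matter in the excerpt above (the "Lock failed on file" entry and the "owner(s) MAC" entry) can be pulled out with a small script. The following is only a sketch: the regular expressions are modeled on the redacted excerpt in this article, the function name summarize_lock_failures is hypothetical, and real log lines may differ between ESXi builds.

```python
import re

# Patterns modeled on the redacted excerpt in this article (assumption:
# the message wording is the same on the ESXi build being examined).
LOCK_FAILED = re.compile(r"Lock failed on file: (?P<file>\S+) on vol '(?P<vol>[^']+)'")
OWNER_MAC = re.compile(r"vol '(?P<vol>[^']+)'.*owner\(s\) MAC: (?P<mac>.+)$")


def summarize_lock_failures(lines):
    """Return (failed, owners).

    failed is a list of (file, volume) tuples for each failed lock attempt.
    owners maps a volume name to the set of owner MAC strings reported for it.
    """
    failed = []
    owners = {}
    for raw in lines:
        line = raw.strip()
        match = LOCK_FAILED.search(line)
        if match:
            failed.append((match.group("file"), match.group("vol")))
            continue
        match = OWNER_MAC.search(line)
        if match:
            # The excerpt shows a trailing colon after the MAC; drop it.
            mac = match.group("mac").rstrip(": ")
            owners.setdefault(match.group("vol"), set()).add(mac)
    return failed, owners
```

The function takes any iterable of log lines, for example an open file object for a copy of /var/log/vmkernel.log taken from the affected host.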

 

Environment

VMware NSX 

Cause

If logdir_unique=true is set in /etc/vmsyslog.conf, vmsyslogd tries to create a directory named after the host's FQDN inside the syslog directory. In this case the attempt is denied because another host already holds an exclusive lock on the syslog volume.
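To confirm whether a host is affected by this setting, the contents of /etc/vmsyslog.conf can be checked programmatically. The sketch below assumes the file is INI-style, as in the fragment shown in this article, and parses a copy of its text with Python's standard configparser module. The helper names logdir_unique_enabled and unique_logdir are hypothetical, and the way the per-host path is joined is an illustration of the behavior described above, not vmsyslogd's actual implementation.

```python
import configparser
import posixpath


def logdir_unique_enabled(conf_text):
    """Return True if logdir_unique is set to a true value in [vmsyslog]."""
    # interpolation=None so that '%' characters in other values are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return parser.getboolean("vmsyslog", "logdir_unique", fallback=False)


def unique_logdir(logdir, fqdn):
    """Illustrative only: the per-host directory described in the Cause section."""
    return posixpath.join(logdir, fqdn)
```

If logdir_unique_enabled returns True for several hosts that share one syslog directory, each host is expected to write under its own FQDN-named subdirectory of that directory.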

 

Resolution

The exclusive lock that the other host holds on the syslog volume must be cleared. For details on how to clear the lock, refer to this article: Device or resource busy/ file is locked error when making changes to a file on an ESXi datastore
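Clearing the lock first requires identifying which host holds it. VMFS lock owners are commonly reported as a UUID-like string whose final 12 hexadecimal digits are the MAC address of a NIC on the owning host. This is an assumption based on typical VMFS lock output and is not stated in this article. Under that assumption, a minimal sketch for converting an owner string into a MAC address that can be compared against each host's NICs (the function name owner_to_mac is hypothetical):

```python
def owner_to_mac(owner):
    """Convert the last segment of a lock owner string to aa:bb:cc:dd:ee:ff form.

    Assumes the owner looks like 'xxxxxxxx-xxxxxxxx-xxxx-<12 hex digits>'.
    """
    last = owner.strip().split("-")[-1].lower()
    if len(last) != 12 or any(ch not in "0123456789abcdef" for ch in last):
        raise ValueError(f"unexpected owner format: {owner!r}")
    # Group the 12 hex digits into six colon-separated octets.
    return ":".join(last[i:i + 2] for i in range(0, 12, 2))
```

The resulting address can then be matched against the NIC MAC addresses of the ESXi hosts that have access to the syslog datastore.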

Additional Information

If this KB article did not resolve your issue, review the following KB article for further troubleshooting steps: Troubleshooting NSX Installation Operations