HA/FDM fails to restart a virtual machine with the error: Failed to open file /vmfs/volumes/UUID/.dvsData/ID/100 Status (bad0003)= Not found after a storage migration

Article ID: 302005

Products

VMware

Issue/Introduction

This article describes a specific issue. If you experience all of the symptoms listed below, consult the sections that follow. If you experience some, but not all, of these symptoms, your issue is not related to this article. Search the Knowledge Base for your symptoms or Open a Support Request.


Symptoms:
  • A virtual machine remains powered off after an HA event
  • A storage migration of the affected virtual machine has been performed using Storage vMotion
  • The virtual machine is powered on, but is not connected to DVS
  • You are able to manually start the virtual machine
  • You may be using a virtual distributed switch and have Storage DRS enabled
  • HA/FDM fails to restart a virtual machine, and vSphere Client shows the error:
Failed to open file /vmfs/volumes/UUID/.dvsData/ID/100 Status (bad0003)= Not found 
  • The Summary tab of the ESXi host shows the error:
vSphere HA agent for this host has an error: vSphere HA agent cannot be correctly installed or configured 
  • In the fdm.log file, you see entries similar to:
    YYYY-MM-DDTHH:MM:SS.MMMZ [FFE3BB90 error 'Execution' opID=host-6627:6-0] [FailoverAction::ReconfigureCompletionCallback]
    Failed to load Dv ports for /vmfs/volumes/UUID/VM/VM.vmx: N3Vim5Fault19PlatformConfigFault9ExceptionE(vim.fault.PlatformConfigFault)

    YYYY-MM-DDTHH:MM:SS.MMMZ [FFE3BB90 verbose 'Execution' opID=host-6627:6-0] [FailoverAction::ErrorHandler]
    Got fault while failing over vm. /vmfs/volumes/UUID/VM/VM.vmx: [N3Vim5Fault19PlatformConfigFaultE:0xecba148] (state = reconfiguring)
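To collect the virtual machines named in these log entries, the fdm.log text can be scanned for the "Failed to load Dv ports for" message. The following Python sketch is illustrative only: the function name find_failed_vmx_paths is hypothetical, and it assumes the log entries follow the format quoted above.

```python
import re

# Matches the "Failed to load Dv ports for <path>.vmx:" entry quoted above.
_DV_PORT_FAILURE = re.compile(
    r"Failed to load Dv ports for (?P<vmx>/vmfs/volumes/.+?\.vmx):"
)


def find_failed_vmx_paths(log_text):
    """Return the distinct .vmx paths named in Dv port load failures,
    in the order they first appear in the log text."""
    seen = []
    for match in _DV_PORT_FAILURE.finditer(log_text):
        path = match.group("vmx")
        if path not in seen:
            seen.append(path)
    return seen
```

Pass the function the contents of fdm.log as a single string. Each returned path identifies a virtual machine to check using the steps in the Resolution section.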


Cause

When a virtual machine is attached to a dvPortgroup, port information is stored in the virtual machine configuration file (.vmx) and on the VMFS volume on which the virtual machine is registered.
The VMFS volume has a .dvsData sub-directory for the distributed switch, which contains a file named after the port number used by the virtual machine (100 in the error message shown in the symptoms). When you use Storage vMotion to move the virtual machine to a different datastore, this data may not be recreated on the destination datastore.
For HA to successfully restart the virtual machine and connect it to the dvPortgroup, the correct data must be present in the .dvsData folder on the VMFS volume (that is, it must match the dvs.switchId entry in the .vmx file).
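The relationship between the .vmx entries and the .dvsData folder can be illustrated with a short Python sketch that builds the file path HA would look for. This is a sketch under stated assumptions, not VMware tooling: the function name expected_dvs_port_files is hypothetical, the ethernetN.dvs.portId key is assumed to accompany dvs.switchId in the .vmx file, and the path layout is inferred from the error message (/vmfs/volumes/UUID/.dvsData/ID/100).

```python
import posixpath
import re

# Assumption: each vDS-connected adapter has ethernetN.dvs.switchId and
# ethernetN.dvs.portId lines in the .vmx file.
_DVS_KEY = re.compile(
    r'^\s*(ethernet\d+)\.dvs\.(switchId|portId)\s*=\s*"([^"]*)"\s*$'
)


def expected_dvs_port_files(vmx_text, datastore_root):
    """Map each adapter name to the .dvsData path expected on the datastore.

    Adapters that lack either key are skipped.
    """
    adapters = {}
    for line in vmx_text.splitlines():
        match = _DVS_KEY.match(line)
        if match:
            adapter, key, value = match.groups()
            adapters.setdefault(adapter, {})[key] = value

    paths = {}
    for adapter, keys in sorted(adapters.items()):
        if "switchId" in keys and "portId" in keys:
            # Layout inferred from the error text: .dvsData/<switchId>/<port>
            paths[adapter] = posixpath.join(
                datastore_root, ".dvsData", keys["switchId"], keys["portId"]
            )
    return paths
```

If a returned path does not exist on the datastore where the virtual machine is registered, the virtual machine matches the condition described in this section.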

Resolution

This is a known issue in vCenter Server 5.0, vCenter Server 5.0 Update 1, ESXi 5.0, and ESXi 5.0 Update 1.
This issue is resolved with vCenter Server 5.0 Update 1b.

Notes:
  • Upgrading to vCenter Server 5.0 Update 1b prevents the issue from affecting virtual machines in the future.
  • If virtual machines are already affected, apply the workarounds below to identify and remediate them.
  • If you have applied the hot patch, engage VMware Support before applying vCenter Server 5.0 Update 1b.

To work around this issue, use one of these options:

Manually identifying and remediating affected virtual machines

To manually identify and remediate affected virtual machines:
  1. When a virtual machine is using a vDS, its .vmx file has an entry similar to:

    ethernet0.dvs.switchId = "8d 7e 0b 50 67 e4 0b 46-3f 30 0b f8 d5 6a 58 f3"

    This information is also stored in .dvsData, a hidden folder at the root of the VMFS volume where the virtual machine is registered. Run this command to list the folder; the output is similar to the listing that follows it:

    ls -la /vmfs/volumes/mydatastore/.dvsData

    drwxr-xr-x 1 root root 700 Feb 28 14:30 .
    drwxr-xr-t 1 root root 14280 May 10 10:56 ..
    drwxr-xr-x 1 root root 840 May 10 11:01 8d 7e 0b 50 67 e4 0b 46-3f 30 0b f8 d5 6a 58 f3


    If the directory that correlates with the .vmx file entry is not present on the same datastore, the virtual machine may be experiencing this issue.

  2. Connect the affected virtual machines to a different dvPortgroup temporarily, then reconnect them back to the original portgroup. When you reconnect them, the correct information is populated to the datastore.
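The check in step 1 can also be sketched in Python for a single virtual machine. This is a minimal sketch, not the script attached to this article: the function name adapters_missing_dvs_data is hypothetical, and it assumes the .vmx file and the datastore root are readable from wherever the code runs.

```python
import os
import re

# ethernetN.dvs.switchId entries, as shown in step 1.
_SWITCH_ID = re.compile(
    r'^\s*(ethernet\d+)\.dvs\.switchId\s*=\s*"([^"]+)"', re.MULTILINE
)


def adapters_missing_dvs_data(vmx_path, datastore_root):
    """Return the adapters whose dvs.switchId has no matching directory
    under <datastore_root>/.dvsData."""
    with open(vmx_path, encoding="utf-8", errors="replace") as handle:
        vmx_text = handle.read()

    dvs_data = os.path.join(datastore_root, ".dvsData")
    missing = []
    for adapter, switch_id in _SWITCH_ID.findall(vmx_text):
        if not os.path.isdir(os.path.join(dvs_data, switch_id)):
            missing.append(adapter)
    return missing
```

For example, adapters_missing_dvs_data("/vmfs/volumes/mydatastore/VM/VM.vmx", "/vmfs/volumes/mydatastore") returns an empty list when every switch ID directory is present. Any adapter it does return is a candidate for the reconnect procedure in step 2.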

Using an automated Perl script to identify and remediate affected virtual machines

The Perl SDK script querySvMotionVDSIssue.pl, attached to this article, lists all affected virtual machines. Adding the --fix true parameter remediates them.
Note: To use the script, you need a system with vSphere CLI 5.x installed. Alternatively, you can use the vMA 5.x appliance.

To run the script:

  1. Download 2013639_Perl_script.zip (attached to this article) and unzip querySvMotionVDSIssue.pl.
  2. To list affected virtual machines, run:

    ./querySvMotionVDSIssue.pl --server vc-server-address --username vc-admin-user

    Where:
    • server is the vCenter Server
    • username is the vCenter Server admin user name

  3. To remediate all affected virtual machines, run:

    ./querySvMotionVDSIssue.pl --server vc-server-address --username vc-admin-user --fix true

    Where:
    • server is the vCenter Server
    • username is the vCenter Server admin user name

Note: The optional parameter --fix can be set to true or false. Set the parameter to true to remediate the affected virtual machines.

Using an automated PowerShell script to identify and remediate affected virtual machines

As an alternative to the Perl SDK script, you can use a PowerShell script to list and fix all the affected virtual machines.

Note: To use the script, you need a system with vSphere PowerCLI 5.x installed.
To run the script:
  1. Download 2013639_PowerShell_script.zip (attached to this article) and unzip CheckForDVSIssueWithNoVDSSnapin.ps1.
  2. Launch vSphere PowerCLI 5.x.
  3. To manually import the function Test-VDSVMIssue defined in the PowerShell script, run the command:

    C:\vSphere PowerCLI> Import-Module <full-path-to-script.ps1>

    For example:

    C:\vSphere PowerCLI> Import-Module C:\scripts\CheckForDVSIssueWithNoVDSSnapin.ps1

  4. To connect to vCenter Server, run the command:

    C:\vSphere PowerCLI> Connect-VIServer -Server xxx.xxx.xxx.xxx -User user -Password password

  5. To identify affected virtual machines, run the command:

    C:\vSphere PowerCLI> Get-VM | Sort Name | Test-VDSVMIssue

    Note: Affected virtual machines are highlighted in red.

  6. To remediate any impacted virtual machines, run the command:

    C:\vSphere PowerCLI> Get-VM | Sort Name | Test-VDSVMIssue -Fix

    You see output similar to:

    Problem found with VM1 Network adapter 1
    Fixing issue...
    ..Finding free port on ESX
    ..Moving Network adapter 1 to another free port on ESX
    ..Moving Network adapter 1 back to port 1443
    ..Checking changes were completed
    VM1 Network adapter 1 is now fixed and OK


Attachments

2013639_Perl_script.zip
2013639_PowerShell_script.zip