Rubrik VM backup failing with NFC_COMPRESSION_ERROR and Internal error RBK92010002 (Vmware.UnableFetchDiskData)


Article ID: 393216

Products

VMware vSphere ESXi

Issue/Introduction

Rubrik backups fail for multiple virtual machines across different hosts. Rubrik support reports the following errors, which indicate that the backup process cannot retrieve the required disk data from the affected VMs.

Internal error RBK92010002 (Vmware.UnableFetchDiskData)

Rubrik logs:

NfcNewAuthdConnectionEx: Failed to connect: Error reading from vmware-authd socket. Reason: Resource temporarily unavailable
NBD_ClientOpen: Couldn't connect to bcn-lx-esxxxx.xx.xxx.xxxxx.com:902
DescriptorOpenNbd: Failed to open NBD extent ... NBD_ERR_NETWORK_CONNECT
Error 14009 (The server refused connection)
Open VMDK failed, err = The server refused connection(err=14009)

VDDK logs:

Failed to create a connection with server bcn-lx-esxxxx.xx.xxx.xxxxx.com: Error reading from vmware-authd socket
NBD_ClientOpen: Couldn't connect to bcn-lx-esxxxx.xx.xxx.xxxxx.com:902
DiskLib error 2338: NBD_ERR_NETWORK_CONNECT
Error 14009 (The server refused connection)
 

NFC_COMPRESSION_ERROR.

Rubrik logs:


VM W 0:11:32 2025-03-31 19:37:00.746 0:00:00 CREATE_VMWARE_SNAPSHOT_  :::0 [fetch snapshot] [NFC ERROR]NfcAio_TimedWait: The session is in a faulted state: NFC_COMPRESSION_ERROR
VM I 0:11:32 2025-03-31 19:37:00.746 0:00:00 CREATE_VMWARE_SNAPSHOT_  :::0 [fetch snapshot] VixDiskLib: VixDiskLib_Close: Close disk.
VM W 0:11:32 2025-03-31 19:37:00.746 0:00:00 CREATE_VMWARE_SNAPSHOT_ :::0 [fetch snapshot] [NFC ERROR]NfcAio_TimedWait: The session is in a faulted state: NFC_COMPRESSION_ERROR
VM W 0:11:32 2025-03-31 19:37:00.746 0:00:00 CREATE_VMWARE_SNAPSHOT_ :::0 [fetch snapshot] [NFC ERROR]NfcAio_DDBGet: The session is in a faulted state: NFC_COMPRESSION_ERROR

Hostd Log:


warning hostd[2100694] [Originator@6876 sub=Libs opID=nbdmode-0000009b67eb11a0] [NFC ERROR]NfcAioGetMessage: Srv invalid msg hdr magic # 4, was expecting -1593779590
warning hostd[2100694] [Originator@6876 sub=Libs opID=nbdmode-0000009b67eb11a0] [NFC ERROR]NfcAioLogFatalSessionErrorLocked: A fatal session error occurred. The error was: 'NFC_SESSION_ERROR' (8)
warning hostd[2100694] [Originator@6876 sub=Libs opID=nbdmode-0000009b67eb11a0] [NFC ERROR]NfcAioGetAndProcessMsg: Failed to receive an AIO message: NFC_SESSION_ERROR
warning hostd[2100694] [Originator@6876 sub=Libs opID=nbdmode-0000009b67eb11a0] [NFC ERROR]NfcAioServerProcessMain: Fatal session error. Cleaning up AIO session
error hostd[2100694] [Originator@6876 sub=Nfcsvc opID=nbdmode-0000009b67eb11a0] Read error from the nfcLib: NFC_SESSION_ERROR (done = yep)
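
When many VMs and hosts are involved, it can help to count how often each of the error signatures quoted above appears in a collected log file. The following is a minimal sketch, not a Rubrik or VMware tool: the function name `count_signatures` is hypothetical, and the signature list is taken only from the excerpts in this article.

```python
from collections import Counter

# Error signatures copied from the log excerpts in this article.
SIGNATURES = (
    "NFC_COMPRESSION_ERROR",
    "NFC_SESSION_ERROR",
    "NBD_ERR_NETWORK_CONNECT",
    "Error 14009",
)


def count_signatures(log_text: str) -> Counter:
    """Count how many log lines contain each known error signature."""
    counts = Counter()
    for line in log_text.splitlines():
        for sig in SIGNATURES:
            if sig in line:
                counts[sig] += 1
    return counts


if __name__ == "__main__":
    # Shortened sample lines based on the excerpts above.
    sample = (
        "[NFC ERROR]NfcAio_TimedWait: The session is in a faulted state: NFC_COMPRESSION_ERROR\n"
        "VixDiskLib: VixDiskLib_Close: Close disk.\n"
        "[NFC ERROR]NfcAio_DDBGet: The session is in a faulted state: NFC_COMPRESSION_ERROR\n"
        "Read error from the nfcLib: NFC_SESSION_ERROR (done = yep)\n"
    )
    for sig, n in count_signatures(sample).items():
        print(f"{sig}: {n}")
```

To use it on a real log, pass the contents of a copied log file to `count_signatures`. A host that shows connection errors (`Error 14009`) alongside the compression errors is a reasonable place to start.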

Environment

ESXi 7.x and above

Rubrik backup solution

Cause

Affected hosts may show a large number of read errors on vmhba1.

One possible cause is a bad SFP in the MDS switch that prevents communication with the storage array.

Resolution

Engage the backup vendor.

Additional Information

Common Causes of NFC Compression Errors

1. Connectivity Issues:

  • DNS Resolution Problems: The backup proxy or backup server may be unable to resolve the ESXi host's name to an IP address.
  • Firewall/Port Issues: Port 902, used for NFC, might be blocked by a firewall or network device.
  • General Network Connectivity Problems: Network issues between the vCenter and ESXi hosts can disrupt NFC communication.

2. Permissions Issues:

  • Insufficient Permissions: The account used by the backup infrastructure might not have the required permissions to access virtual machines or datastores.

3. File Locks:

  • Locked Files: The file that the backup software is attempting to read or write may be locked by another process or VM within the vSphere environment.

4. NFC Memory Limits:

  • Host Memory Exhaustion: The ESXi host may be running low on memory for NFC sessions, often due to excessive retries or stale sessions.

5. VDDK Issues:

  • VDDK Crashes: In certain versions of vSphere, VDDK (Virtual Disk Development Kit) crashes may occur after encountering NBD (Network Block Device) asynchronous I/O (AIO) errors, which lead to NFC issues.

6. NBD Transport Issues:

  • Network Buffer Size: In older versions of vSphere, larger buffer sizes on the VDDK side could result in increased memory consumption on the NFC server side, causing errors.

7. VM Configuration Errors:

  • Missing Parent VM Configuration: If the VM configuration doesn’t specify the parent VM or parentVApp, backups using NBD transport might fail.

 

Troubleshooting and Solutions

1. Check Connectivity:

  • Ping/Resolve: Ensure that the backup proxy or backup server can resolve the ESXi host's name to an IP address and can ping the host.
  • Firewall/Port Check: Verify that port 902 is not blocked by firewalls or network devices, as this port is critical for NFC communication.
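
The two connectivity checks above can be sketched as a small script run from the backup proxy or backup server. This is an illustrative sketch using only the Python standard library. The helper names `resolve_host` and `port_open` are hypothetical, and a successful TCP connection to port 902 shows only that the port is reachable, not that NFC itself is healthy.

```python
import socket

NFC_PORT = 902  # port named in this article for NFC/NBD traffic


def resolve_host(hostname: str) -> list[str]:
    """Return the IP addresses the local resolver gives for hostname.

    An empty list means name resolution failed.
    """
    try:
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def port_open(host: str, port: int = NFC_PORT, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False
```

For example, `resolve_host("esxi01.example.com")` followed by `port_open("esxi01.example.com")` (a placeholder hostname) checks resolution first and port reachability second. If the first call returns an empty list, the problem is in DNS rather than in a firewall.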

2. Review Permissions:

  • Account Permissions: Confirm that the account the backup software uses to connect to vCenter has appropriate permissions to access virtual machines and datastores.
  • Grant Necessary Permissions: If the permissions are insufficient, ensure that the backup user has at least Read-Only access to the relevant objects in vCenter.

3. Resolve File Locks:

  • Identify Locked Files: Check if any files are locked within the vSphere environment that may be preventing the backup process. If so, unlock these files.
  • Monitor VM Processes: Ensure that no other processes, such as another backup or VM task, are locking the virtual machine's files.

4. Increase NFC Memory:

  • Modify Memory Limits: If NFC memory exhaustion is the cause, increase the memory available to NFC sessions by editing the maxMemory value in the nfcsvc section of the hostd config.xml file:
    • Path: /etc/vmware/hostd/config.xml
    • Example (60 MiB, written as a byte count: 60 * 1024 * 1024 = 62914560):

      <maxMemory>62914560</maxMemory>
  • Restart Services: After modifying the configuration file, restart the hostd service to apply the changes.
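
The memory-limit change above can be sketched in code to make the arithmetic explicit and to show where the value sits. This is a hedged illustration meant to run against a copy of the file, not a supported way to edit a live host. It assumes `maxMemory` takes a plain integer byte count and that the file contains an `<nfcsvc>` element somewhere in the tree. The helper names are hypothetical, and Python's `ElementTree` drops XML comments, so the output is not a drop-in replacement for the original file.

```python
import xml.etree.ElementTree as ET


def mib_to_bytes(mib: int) -> int:
    """Convert mebibytes to bytes (1 MiB = 1024 * 1024 bytes)."""
    return mib * 1024 * 1024


def set_nfc_max_memory(config_xml: str, mib: int) -> str:
    """Return config_xml with <nfcsvc>/<maxMemory> set to `mib` MiB in bytes.

    Assumption: the value is stored as a plain integer byte count.
    """
    root = ET.fromstring(config_xml)
    # Search the whole tree so the sketch does not depend on the exact nesting.
    nfcsvc = next(root.iter("nfcsvc"), None)
    if nfcsvc is None:
        raise ValueError("no <nfcsvc> section found")
    node = nfcsvc.find("maxMemory")
    if node is None:
        node = ET.SubElement(nfcsvc, "maxMemory")
    node.text = str(mib_to_bytes(mib))
    return ET.tostring(root, encoding="unicode")
```

For instance, `mib_to_bytes(60)` gives 62914560, the byte count for the 60 MiB example above. Comparing the function's output with the original copy shows exactly which element changes before anything is edited by hand on the host.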

5. Restart Management Services:

  • Restart the management services on the ESXi host to ensure that the NFC service is functioning properly.
  • Use the following command to restart services:

    /etc/init.d/hostd restart

6. Reboot Hosts/vCenter:

  • Reboot ESXi Hosts: If the error persists after troubleshooting, consider rebooting the ESXi host to reset any stale NFC sessions.
  • Reboot vCenter: Rebooting vCenter can also help if the issue appears to be related to vCenter connectivity or service state.

7. Check VM Configuration:

  • Verify Parent VM: Ensure that the VM’s configuration in vCenter is correctly linked to the parent VM or parentVApp, especially if you are using NBD transport.
  • Move VM Out of vCLS Folder: If the VM is located in a vCLS folder, try moving it to a standard folder in vCenter. Backup of VMs in a vCLS folder may fail by default.

8. Check for VDDK Issues:

  • VDDK Version: Ensure that the Virtual Disk Development Kit (VDDK) is compatible with your version of vSphere and that the latest patches are installed.
  • Update VDDK Libraries: If necessary, update the VDDK libraries used by the backup software on the backup server.

9. Monitor Network Performance:

  • Network Errors: Monitor the network performance for errors, packet drops, or slowdowns that could impact the NFC protocol during backups.

 

 

 Troubleshooting NFS datastore connectivity issues