NVMe over TCP session to the storage is lost after a reboot of ESXi hosts configured with a host profile

Article ID: 393229

Products

VMware vSphere ESXi

Issue/Introduction

  • ESXi hosts lose access to the NVMe storage after a reboot, even though the initial NVMe over TCP connection was successful.
  • The ESXi hosts have a host profile attached.
  • Connectivity tests using vmkping to the controller IPs are successful.
  • Connectivity tests using vmkping with jumbo frames (if configured) to the controller IPs are successful.
  • The boot logs or the vmkernel logs during boot show the controllers connecting successfully at first, then report keepalive timeouts within the next minute, followed by the paths being marked as dead.

nvmetcp:nt_ConnectController:873 [ctlr 271] connected successfully

nvmetcp:nt_ConnectController:873 [ctlr 266] connected successfully

..........

WARNING: NVMEDEV:7663 Last keep alive command timeout.

NVMEDEV:9483 Request to start controller 273 recovery
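
The pattern in the excerpt above (successful connects, then a keepalive timeout and a controller recovery request) can be searched for in a saved copy of the vmkernel log. The following is a minimal sketch, assuming the log lines have the same wording as the excerpt; the function name `summarize_nvme_boot` and the regular expressions are illustrative and not part of any VMware tool.

```python
import re

# Patterns derived from the log excerpt in this article (assumed wording).
CONNECTED = re.compile(r"nvmetcp:nt_ConnectController:\d+ \[ctlr (\d+)\] connected successfully")
KEEPALIVE = re.compile(r"NVMEDEV:\d+ Last keep alive command timeout")
RECOVERY = re.compile(r"NVMEDEV:\d+ Request to start controller (\d+) recovery")


def summarize_nvme_boot(lines):
    """Collect connected controllers, keepalive timeouts, and recovery requests."""
    connected, recovering, timeouts = [], [], 0
    for line in lines:
        match = CONNECTED.search(line)
        if match:
            connected.append(int(match.group(1)))
            continue
        if KEEPALIVE.search(line):
            timeouts += 1
            continue
        match = RECOVERY.search(line)
        if match:
            recovering.append(int(match.group(1)))
    return {"connected": connected, "keepalive_timeouts": timeouts, "recovering": recovering}


sample = [
    "nvmetcp:nt_ConnectController:873 [ctlr 271] connected successfully",
    "nvmetcp:nt_ConnectController:873 [ctlr 266] connected successfully",
    "WARNING: NVMEDEV:7663 Last keep alive command timeout.",
    "NVMEDEV:9483 Request to start controller 273 recovery",
]
summary = summarize_nvme_boot(sample)
print(summary)
```

A result with connected controllers followed by a non-zero keepalive timeout count matches the symptom described in this article.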

  • The vobd logs show the nvmetcp ruleset being added and enabled by default, and later show the ruleset being disabled by the host profile configuration.

[vob.net.firewall.config.changed] Firewall configuration has changed. Operation 'add' for rule set nvmetcp succeeded.
[esx.audit.net.firewall.config.changed] Firewall configuration has changed. Operation 'add' for rule set nvmetcp succeeded.

[vob.net.firewall.config.changed] Firewall configuration has changed. Operation 'enable' for rule set nvmetcp succeeded.

[vob.net.firewall.config.changed] Firewall configuration has changed. Operation 'disable' for rule set nvmetcp succeeded.
[esx.audit.net.firewall.config.changed] Firewall configuration has changed. Operation 'disable' for rule set nvmetcp succeeded.
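
The sequence of firewall events above can be reduced to the last enable or disable operation recorded for the ruleset. The sketch below assumes the vobd messages have the same wording as the excerpt; `final_ruleset_state` is a hypothetical helper name, and the 'add' operation is deliberately ignored because only 'enable' and 'disable' state the outcome explicitly.

```python
import re

# Matches both the vob.* and esx.audit.* variants shown in the excerpt.
EVENT = re.compile(
    r"\[(?:vob|esx\.audit)\.net\.firewall\.config\.changed\].*"
    r"Operation '(\w+)' for rule set (\S+) succeeded"
)


def final_ruleset_state(lines, ruleset="nvmetcp"):
    """Return 'enabled', 'disabled', or None based on the last matching event."""
    state = None
    for line in lines:
        match = EVENT.search(line)
        if not match or match.group(2) != ruleset:
            continue
        operation = match.group(1)
        if operation == "enable":
            state = "enabled"
        elif operation == "disable":
            state = "disabled"
    return state


sample = [
    "[vob.net.firewall.config.changed] Firewall configuration has changed. Operation 'add' for rule set nvmetcp succeeded.",
    "[vob.net.firewall.config.changed] Firewall configuration has changed. Operation 'enable' for rule set nvmetcp succeeded.",
    "[esx.audit.net.firewall.config.changed] Firewall configuration has changed. Operation 'disable' for rule set nvmetcp succeeded.",
]
state = final_ruleset_state(sample)
print(state)  # disabled
```

A final state of "disabled" after boot is consistent with the host profile overriding the default ruleset setting.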

Environment

VMware vSphere ESXi 7.x 

VMware vSphere ESXi 8.x

Cause

This scenario applies only to ESXi hosts that are configured to use a host profile. The firewall ruleset for nvmetcp is not enabled in the firewall configuration of the host profile attached to the ESXi host, so traffic for the service is dropped.

Resolution

Ensure there are no connectivity issues between the ESXi hosts and the NVMe storage. Refer to KB 344313 to check VMkernel port network connectivity.
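
When jumbo frames are configured, the vmkping test is normally run with the don't-fragment flag and a payload sized to fill the MTU. The sketch below shows the arithmetic for IPv4 (MTU minus 20 bytes of IP header and 8 bytes of ICMP header) and builds the corresponding command string. The VMkernel interface name `vmk1` and the target address `192.0.2.10` are placeholders, not values from this article.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def icmp_payload_size(mtu):
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def vmkping_command(vmk, target_ip, mtu=1500):
    """Build a vmkping command line: -I interface, -d don't fragment, -s payload size."""
    return f"vmkping -I {vmk} -d -s {icmp_payload_size(mtu)} {target_ip}"


command = vmkping_command("vmk1", "192.0.2.10", mtu=9000)
print(command)  # vmkping -I vmk1 -d -s 8972 192.0.2.10
```

If this test succeeds while the NVMe paths are still marked dead, the network path itself is unlikely to be the cause.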

  1. Edit the host profile attached to the ESXi host.
  2. Navigate to Settings > Security and Services > Firewall Configuration > Ruleset Configuration.
  3. Locate nvmetcp and ensure that "Flag indicating whether the ruleset should be enabled" is checked.
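
To confirm the state of the ruleset on a host, the firewall rulesets can be inspected programmatically. The sketch below is a minimal illustration, assuming ruleset objects that expose `key` and `enabled` attributes, as the entries under `host.configManager.firewallSystem.firewallInfo.ruleset` do in the vSphere API when accessed through pyVmomi. The helper name `ruleset_enabled` and the demo data are hypothetical, and no connection to a host is made here.

```python
from types import SimpleNamespace


def ruleset_enabled(rulesets, key="nvmetcp"):
    """Return True/False for the named ruleset, or None if it is not present."""
    for ruleset in rulesets:
        if ruleset.key == key:
            return bool(ruleset.enabled)
    return None


# Stand-in objects for illustration; on a real host these would come from the API.
demo_rulesets = [
    SimpleNamespace(key="sshServer", enabled=True),
    SimpleNamespace(key="nvmetcp", enabled=False),
]
result = ruleset_enabled(demo_rulesets)
print(result)  # False
```

A value of False for nvmetcp on a host with a host profile attached points to the profile setting described in the steps above.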