VMware vSphere Replication: No host can be used to access datastore path '[Target Datastore] VM_Name/VM_Name.vmdk'
In the "/var/log/vmkernel.log" on source ESXi host -
2017-06-07T16:31:07.042Z cpu12:42983549)WARNING: Hbr: 2997: Command INIT_SESSION failed (result=Failed) (isFatal=FALSE) (Id=0) (GroupID=GID-0e2b094e-####-####-####-c51d387ef645)
2017-06-07T16:31:07.042Z cpu12:42983549)WARNING: Hbr: 4521: Failed to establish connection to [x.x.x.x]:31031(groupID=GID-0e2b094e-####-####-####-c51d387ef645): Failure
2017-06-07T16:32:37.061Z cpu12:42983549)Hbr: 2196: Wire compression supported by server x.x.x.x: FastLZ
2017-06-07T16:32:37.074Z cpu16:42983549)Hbr: 2988: Command: INIT_SESSION: error result=Failed gen=-1: Error for (datastoreUUID: "5db8069b-########-####-20677ce134a4"), (diskId: "RDID-05e97824-####-####-####-44fb4e734abe"), (flags: on-disk-open): No accessible host for da$
In "/var/log/vmware/hbrsrv.log" on the destination vSphere Replication appliance -
2017-06-07T16:50:05.224Z info hbrsrv[7F96E4409700] [Originator@6876 sub=Main opID=hs-24bdaa72] HbrError for (datastoreUUID: "5db8069b-########-####-20677ce134a4"), (diskId: "RDID-05e97824-####-####-####-44fb4e734abe") stack:
2017-06-07T16:50:05.224Z info hbrsrv[7F96E4409700] [Originator@6876 sub=Main opID=hs-24bdaa72] [0] No accessible host for datastore 0fedfa9a-########
2017-06-07T16:50:05.224Z info hbrsrv[7F96E4409700] [Originator@6876 sub=Main opID=hs-24bdaa72] [1] Code set to: Storage was not accessible.
2017-06-07T16:50:05.224Z info hbrsrv[7F96E4409700] [Originator@6876 sub=Main opID=hs-24bdaa72] [2] Failed to find host to get disk type
2017-06-07T16:50:05.224Z info hbrsrv[7F96E4409700] [Originator@6876 sub=Main opID=hs-24bdaa72] [3] While getting host capabilities for disk.
2017-06-07T16:50:05.224Z info hbrsrv[7F96E4409700] [Originator@6876 sub=Main opID=hs-24bdaa72] [4] Refreshing disk usage.
2017-06-07T16:50:05.224Z info hbrsrv[7F96E4409700] [Originator@6876 sub=Main opID=hs-24bdaa72] [5] Ignored error.
2017-06-07T16:50:05.274Z verbose hbrsrv[7F96E710D700] [Originator@6876 sub=SessionManager] hbr.replica.Task.GetInfo: authorized
2017-06-07T16:50:05.981Z info hbrsrv[7F96E7150760] [Originator@6876 sub=Delta] ClientConnection (client=[x.x.x.x]:1139) allowing client with different minor version: Client 3 vs Server 5
2017-06-07T16:50:05.991Z info hbrsrv[7F96E7150760] [Originator@6876 sub=Delta] Configured disks for group GID-0e2b094e-####-####-####-c51d387ef645:
2017-06-07T16:50:05.991Z info hbrsrv[7F96E7150760] [Originator@6876 sub=Delta] RDID-05e97824-####-####-####-44fb4e734abe
2017-06-07T16:50:05.992Z info hbrsrv[7F96E7150760] [Originator@6876 sub=Main] HbrError for (datastoreUUID: "5db8069b-########-####-20677ce134a4") stack:
2017-06-07T16:50:05.992Z info hbrsrv[7F96E7150760] [Originator@6876 sub=Main] [0] No accessible host for datastore 5db8069b-########-####-20677ce134a4
2017-06-07T16:50:05.992Z info hbrsrv[7F96E7150760] [Originator@6876 sub=Main] [1] Code set to: Storage was not accessible.
2017-06-07T16:50:05.992Z info hbrsrv[7F96E7150760] [Originator@6876 sub=Main] [2] Failed to find host to get disk type
2017-06-07T16:50:05.992Z info hbrsrv[7F96E7150760] [Originator@6876 sub=Main] [3] Updating disk type for diskID=RDID-05e97824-####-####-####-44fb4e734abe
2017-06-07T16:50:05.992Z info hbrsrv[7F96E7150760] [Originator@6876 sub=Main] [4] Ignored error.
2017-06-07T16:50:05.992Z info hbrsrv[7F96E7150760] [Originator@6876 sub=Setup] Created ActiveDisk for DiskID: RDID-05e97824-####-####-####-44fb4e734abe Base path: /vmfs/volumes/0fedfa9a-########/Main Content Svr/Main Content Svr.vmdk curPath: /vmfs/volumes/5db8069b-########-####-20677ce134a4/Main Content Svr/Main Content Svr.vmdk diskHostReq: (UNKNOWN)
The destination VR appliance is unable to connect to the source host over port 902:
2017-06-07T17:34:27.912+08:00 info hbrsrv[15748] [Originator@6876 sub=Libs groupID= opID=hsl-60639d50] CnxOpenTCPSocket: Timed out connecting to server 10.#.#.134:902: Operation now in progress GID-0e2b094e-####-####-####-c51d387ef645
2017-06-07T17:34:27.912+08:00 info hbrsrv[15748] [Originator@6876 sub=Libs groupID= opID=hsl-60639d50] CnxAuthdConnect: Returning false because CnxAuthdConnectTCP failed GID-0e2b094e-####-####-####-c51d387ef645
2017-06-07T17:34:27.912+08:00 info hbrsrv[15748] [Originator@6876 sub=Libs groupID= opID=hsl-60639d50] CnxConnectAuthd: Returning false because CnxAuthdConnect failed GID-0e2b094e-####-####-####-c51d387ef645
2017-06-07T17:34:27.912+08:00 info hbrsrv[15748] [Originator@6876 sub=Libs groupID= opID=hsl-60639d50] Cnx_Connect: Returning false because CnxConnectAuthd failed GID-0e2b094e-####-####-####-c51d387ef645
2017-06-07T17:34:27.912+08:00 info hbrsrv[15748] [Originator@6876 sub=Libs groupID= opID=hsl-60639d50] Cnx_Connect: Error message: Failed to connect to server 10.#.#.134:902 GID-0e2b094e-####-####-####-c51d387ef645
2017-06-07T17:34:27.912+08:00 warning hbrsrv[15748] [Originator@6876 sub=Libs groupID= opID=hsl-60639d50] [NFC ERROR]NfcNewAuthdConnectionEx: Failed to connect: Failed to connect to server 10.#.#.134:902 GID-0e2b094e-####-####-####-c51d387ef645
2017-06-07T17:34:27.912+08:00 warning hbrsrv[15748] [Originator@6876 sub=Libs groupID= opID=hsl-60639d50] [NFC ERROR]NfcNewAuthdConnectionEx: Failed to connect to peer. Error: Failed to connect to server 10.#.#.134:902 GID-0e2b094e-####-####-####-c51d387ef645
2017-06-07T17:34:27.912+08:00 warning hbrsrv[15748] [Originator@6876 sub=Libs groupID= opID=hsl-60639d50] [NFC ERROR]NfcEstablishAuthCnxToServer: Failed to create new AuthD connection: Failed to connect to server 10.#.#.134:902 GID-0e2b094e-####-####-####-c51d387ef645
2017-06-07T17:34:27.912+08:00 warning hbrsrv[15748] [Originator@6876 sub=Libs groupID= opID=hsl-60639d50] [NFC ERROR]Nfc_BindAndEstablishAuthdCnx3: Failed to create a connection with server 10.#.#.134: Failed to connect to server 10.#.#.134:902 GID-0e2b094e-####-####-####-c51d387ef645
In "/opt/vmware/hms/logs/hms.log" on the VR appliance, we see the following entries -
2017-06-07T17:34:27.912 INFO hms.i18n.class com.vmware.hms.response.filter.I18nActivationResponseFilter [tcweb-15] (..response.filter.I18nActivationResponseFilter) [operationID=87751547-db0a-4c3a-8a41-d0bf8ab9894f-HMS-88032,sessionID=600955AE] | The localized message is: A replication error occurred at the vSphere Replication Server for replication <VM_Name>. Details: 'Error for (diskId: "RDID-05e97824-####-####-####-44fb4e734abe"), (hostIP: "10.#.#.134"), (flags: on-disk-open, retriable): Error connecting to host.; Set error flag: retriable; Failed to create NFC connection to 10.#.#.134, 902 via ip Any: Failed to connect to server 10.#.#.134:902; establish nfc connection on host-27; Tried operation 4 times, giving up.; Failed to open disk, couldn't create NFC session; Set error flag: on-disk-open; Tried operation 4 times, giving up.; Failed to open replica (/vmfs/volumes/5db8069b-########-####-20677ce134a4/VM_Name/hbrdisk.RDID-05e97824-####-####-####-44fb4e734abe.2717517.217894942880274.vmdk); Failed to open activeDisk (GroupID=GID-0e2b094e-####-####-####-c51d387ef645) (DiskID=RDID-05e97824-####-####-####-44fb4e734abe); Can't create replica state (GroupID=GID-0e2b094e-####-####-####-c51d387ef645) (DiskID=RDID-05e97824-####-####-####-44fb4e734abe); Cannot activate group. Loading disks from database (GroupID=GID-0e2b094e-####-####-####-c51d387ef645) ; Connecting to group GID-0e2b094e-####-####-####-c51d387ef645'.
NOTE: This error can also be caused by compatibility issues between vSphere Replication and vCenter Server.
Refer - https://interopmatrix.broadcom.com/Interoperability
If you are upgrading to vCenter Server 8.0 U1/U2, check the interoperability matrix to ensure that your current vSphere Replication appliance is compatible with vCenter Server and your ESXi hosts.
1. Verify that ports 80, 443, and 902 are open, using curl commands. Some organizations block these ports as part of their security hardening rules. Check the ESXi firewall on the source and target ESXi hosts and add exceptions under "Allowed IP addresses". Exceptions must be added to the vSphere Web Client and vSphere Web Access firewall rules in the format X.X.X.0/24; this IP range covers the vSphere Replication appliance IP address used for incoming storage traffic at the local site (see the esxcli sketch after the port table below).
Example: curl -v telnet://<ESXi-FQDN-or-IP>:80 ; curl -v telnet://<ESXi-FQDN-or-IP>:902
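The same check can be scripted. A minimal sketch, assuming a Linux shell on the VR appliance; the host name esxi01.example.com is a placeholder. `curl -v telnet://...` only attempts the TCP handshake, so a "Connected to" line in the verbose output means the port is reachable:

```sh
#!/bin/sh
# Check TCP reachability of the ports vSphere Replication depends on.
# esxi01.example.com is a placeholder; substitute your ESXi FQDN or IP.
HOST="esxi01.example.com"
for PORT in 80 443 902; do
  # --connect-timeout bounds the wait; "Connected to" in -v output means success.
  if curl -sv --connect-timeout 5 "telnet://${HOST}:${PORT}" </dev/null 2>&1 | grep -q "Connected to"; then
    echo "Port ${PORT} on ${HOST}: reachable"
  else
    echo "Port ${PORT} on ${HOST}: blocked or filtered"
  fi
done
```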
NOTE: Port 80 is not used by vSphere Replication 8.8 for communication; refer to the documentation for your specific VR version when troubleshooting. The table below, from the VMware vSphere Replication Security Guide, summarizes the relevant ports:
| Source | Target | Port | Protocol | Description |
|---|---|---|---|---|
| vSphere Replication appliance | Local vCenter Server | 80 | TCP | All management traffic to the local vCenter Server proxy system. vSphere Replication opens an SSL tunnel to connect to the vCenter Server services. |
| vSphere Replication server in the vSphere Replication appliance | Local ESXi host (intra-site) | 80 | HTTP | Traffic between the vSphere Replication server and the ESXi hosts on the same site. vSphere Replication opens an SSL tunnel to the ESXi services. |
| vSphere Replication server | ESXi host at the target site (intra-site traffic only) | 902 | TCP and UDP | Traffic between the vSphere Replication server and the ESXi hosts on the same site; specifically, the NFC service traffic to the destination ESXi servers. |
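As a sketch of the "Allowed IP addresses" exception described in step 1, run from an ESXi shell. The subnet 192.168.10.0/24 is a placeholder for the VR appliance's network, and the exact ruleset IDs can vary by ESXi version; confirm which ruleset covers port 902 with `esxcli network firewall ruleset rule list`:

```sh
# Inspect the current rulesets and any allowed-IP exceptions.
esxcli network firewall ruleset list
esxcli network firewall ruleset allowedip list

# Stop allowing all IPs on the ruleset, then add the VR appliance
# subnet (placeholder value) as an explicit exception.
esxcli network firewall ruleset set --ruleset-id vSphereClient --allowed-all false
esxcli network firewall ruleset allowedip add --ruleset-id vSphereClient --ip-address 192.168.10.0/24
```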
2. Ensure that NTP is in sync on all ESXi hosts, vSphere Replication appliances, and vCenter Servers.
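A quick way to spot clock drift is to compare UTC time across the endpoints. A minimal sketch, assuming SSH is enabled on the hosts; all host names are placeholders:

```sh
#!/bin/sh
# Print UTC time locally and on each endpoint; the timestamps should
# agree to within a few seconds. Host names below are placeholders.
date -u
for H in esxi-src.example.com esxi-tgt.example.com vr-appliance.example.com; do
  ssh "root@${H}" 'hostname; date -u'
done
```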
3. Check and disable IDS/IPS or other firewall packet-filtering rules. A packet-filtering firewall is a network security technique that regulates data flow to and from a network. Packet filters examine each TCP/IP packet, looking at the source and destination IP addresses and ports. You can create rules that allow only known, established IP addresses while blocking all unknown or untrusted ones.
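To rule out filtering on the ESXi side itself, you can inspect the host firewall and, strictly for isolation testing, disable it briefly. A sketch, run from an ESXi shell:

```sh
# Show whether the ESXi firewall is enabled and its default policy.
esxcli network firewall get

# For isolation testing only: disable, retest replication, then re-enable.
esxcli network firewall set --enabled false
# ... retest the replication connection here ...
esxcli network firewall set --enabled true
```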
4. Ensure that the MTU is configured uniformly across all networking devices between the sites that support it, including vSphere switches, ESXi hosts, and the vSphere Replication appliance.
Refer - Testing VMkernel network connectivity with the vmkping command (1003728)
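A sketch of the vmkping test from that article: `-d` sets the don't-fragment bit and `-s` the ICMP payload size, so a 1472-byte payload plus 28 bytes of IP/ICMP headers exercises a full 1500-byte packet. The interface name and target IP below are placeholders:

```sh
# Standard-MTU path check: 1472-byte payload + 28-byte headers = 1500 bytes.
vmkping -I vmk0 -d -s 1472 10.0.0.50

# Jumbo-frame path check (only if MTU 9000 is configured end to end).
vmkping -I vmk0 -d -s 8972 10.0.0.50
```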
NOTE:
vSphere Replication uses an MTU (maximum transmission unit) of 1500 by default. An end-to-end MTU of 1500 may not be achievable on a WAN that uses VPN tunnels, IPsec encryption, overlay protocols, or firewalls set to an MTU that does not match the MTU inside the datacenter. Therefore, the vmkping test may pass or fail, but it should not be treated as a direct indicator of this problem until you have explored all other possibilities. Try changing the MTU to a different size between 1500 and 9000 and check whether you can communicate with the target VR.
Jumbo frames are Ethernet frames with a payload much larger than the typical 1500-byte Ethernet MTU; anything above an MTU of 1500 is called a jumbo frame. Jumbo frames must be configured on the ingress and egress interface of each device along the end-to-end transmission path, and all devices in the topology must agree on the maximum jumbo frame size. If devices along the path use varying frame sizes, you can end up with fragmentation problems, and if a device along the path does not support jumbo frames and receives one, it will drop it.
Jumbo frames can improve your network's performance, but it is important to confirm whether and how your network devices support them before turning the feature on. Some of the biggest gains are realized within and between data centers, but be mindful of the fragmentation that can occur when large frames cross a link with a smaller MTU.
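If you do enable jumbo frames on the vSphere side, the MTU must be raised on both the virtual switch and the VMkernel interface that carries replication traffic. A sketch using esxcli, with placeholder switch and interface names; validate the path afterwards with the vmkping test above:

```sh
# Raise the MTU on a standard vSwitch (vSwitch0 is a placeholder).
esxcli network vswitch standard set --vswitch-name vSwitch0 --mtu 9000

# Raise the MTU on the VMkernel interface used for replication (placeholder vmk1).
esxcli network ip interface set --interface-name vmk1 --mtu 9000

# Verify the new values.
esxcli network vswitch standard list
esxcli network ip interface list
```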