vSphere Replication (Live Recovery) 9.0
vSphere Replication 8.0
The Security Profile is associated with network ports 31031 and 32032. Earlier editions of vSphere Replication used port 44046, which is obsolete in vSphere Replication 8.x. Port 902 on the ESXi host is used by the NFC service.
Two components need to be distinguished in vSphere Replication: the vSphere Replication Appliance and the add-on vSphere Replication Servers. The configuration limits allow up to 9 add-on vSphere Replication Servers.
The vSphere Replication Appliance is the primary server, and the vSphere Replication Servers are secondary servers that communicate with the primary server over port 8123.
The hbr-agent service on each host is enabled when vSphere Replication is configured with vCenter. The hbr service communicates with the vCenter inventory and populates the vSphere Replication hbrsrv database with all ESXi hosts managed by vCenter.
Get the Vmid of the virtual machine
[root@esxi:~] vim-cmd vmsvc/getallvms
Vmid   Name     File                            Guest OS     Version
11     vmname   [datastore] vmname/vmname.vmx   OS_64Guest   vmx-21
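When there are many virtual machines on the host, the Vmid lookup above can be scripted. The following is a minimal sketch, not part of the original procedure: the function name vmid_by_name is hypothetical, and it assumes the VM name contains no spaces so that the name is the second whitespace-separated column.

```shell
# Hypothetical helper (not a VMware tool): print the Vmid of the VM whose
# Name column equals $1. Reads `vim-cmd vmsvc/getallvms` output on stdin.
# Assumption: the VM name contains no spaces.
vmid_by_name() {
    awk -v name="$1" '$1 ~ /^[0-9]+$/ && $2 == name { print $1 }'
}
```

On the host it could be used as: vim-cmd vmsvc/getallvms | vmid_by_name vmname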
These are the tools related to vSphere Replication:
[root@esxi:~] vim-cmd hbrsvc/
Commands available under hbrsvc/:
vmreplica.abort           vmreplica.pause
vmreplica.create          vmreplica.queryReplicationState
vmreplica.disable         vmreplica.reconfig
vmreplica.diskDisable     vmreplica.resume
vmreplica.diskEnable      vmreplica.startOfflineInstance
vmreplica.enable          vmreplica.stopOfflineInstance
vmreplica.getConfig       vmreplica.sync
vmreplica.getState
Get the destination vSphere Replication Appliance or vSphere Replication Server (add-on) that the virtual machine is replicating to
[root@esxi:~] vim-cmd hbrsvc/vmreplica.getConfig 11
Retrieve VM replication configuration:
The VM is configured for replication with the following options:
    VM Replication ID = GID-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
    Destination IP Address = x.x.x.x
    Destination Port = 31031
    Recovery Point Objective = 1440
    Quiesce Guest OS = false
    Enable Opportunistic Updates = false
    Network Compression = false
    Network Encryption = false
    Paused for Replication = false
    Disk scsi0:0 is configured for replication:
        Device key = 2000
        Replication ID = RDID-d27a4619-157e-414a-9803-427f822a4de5
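The destination address and port are the two values from this output that the later network checks depend on. As a rough sketch (the function name vr_destination is hypothetical, and the "Key = value" layout is assumed to match the sample output above), they can be pulled out like this:

```shell
# Hypothetical helper: print "<destination ip>:<destination port>" from
# `vim-cmd hbrsvc/vmreplica.getConfig <vmid>` output read on stdin.
# Assumes the "Key = value" layout shown in the sample output.
vr_destination() {
    awk -F' = ' '
        /Destination IP Address/ { ip = $2;   sub(/[ \t]+$/, "", ip) }
        /Destination Port/       { port = $2; sub(/[ \t]+$/, "", port) }
        END { if (ip != "" && port != "") print ip ":" port }
    '
}
```

On the host it could be used as: vim-cmd hbrsvc/vmreplica.getConfig 11 | vr_destination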
To see whether the virtual machine is replicating, run a sync:
[root@esxi:~] vim-cmd hbrsvc/vmreplica.sync 11
Force a replica synchronzation for the VM:
Get the state of the virtual machine's replication, reported as bytes of data checksummed and transferred. Run the command a few times to see progress.
[root@esxi:~] vim-cmd hbrsvc/vmreplica.getState 11
Retrieve VM running replication state:
    The VM is configured for replication. Current replication state: Group: GID-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx (generation=9627309314105)
    Group State: full sync (0% done: checksummed 0 bytes of 16 GB, transferred 0 bytes of 0 bytes)
        DiskID RDID-d27a4619-157e-414a-9803-427f822a4de5 State: full sync (checksummed 0 bytes of 16 GB, transferred 0 bytes of 0 bytes)

[root@esxi:~] vim-cmd hbrsvc/vmreplica.getState 11
Retrieve VM running replication state:
    The VM is configured for replication. Current replication state: Group: GID-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx (generation=9627309314105)
    Group State: full sync (60% done: checksummed 9.6 GB of 16 GB, transferred 80 KB of 392 KB)
        DiskID RDID-d27a4619-157e-414a-9803-427f822a4de5 State: full sync (checksummed 9.6 GB of 16 GB, transferred 80 KB of 392 KB)
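Instead of comparing the output by eye on each run, the percentage can be extracted from the "Group State" line. This is only a sketch: vr_sync_percent is a hypothetical name, and it assumes the "(NN% done: ...)" wording shown in the sample output, which is only present while a sync is in progress.

```shell
# Hypothetical helper: print the "NN" from "Group State: ... (NN% done: ...)"
# in `vim-cmd hbrsvc/vmreplica.getState <vmid>` output read on stdin.
# Prints nothing when no percentage is reported.
vr_sync_percent() {
    sed -n 's/.*Group State:.*(\([0-9][0-9]*\)% done.*/\1/p'
}
```

On the host it could be used as: vim-cmd hbrsvc/vmreplica.getState 11 | vr_sync_percent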
[root@esxi:/var/run/log] cat hostd.log |less
yyyy-mm-ddThh:mm:ss.msZ info hostd[265382] [Originator@6876 sub=Vimsvc.ha-eventmgr opID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx-HMS-4293-61-1-f969 user=vpxuser:VSPHERE.LOCAL\Administrator] Event 575 : Sync started by System for virtual machine vmname on host esxi.domain.tld in cluster Cluster_name in ha-datacenter.
...
yyyy-mm-ddThh:mm:ss.msZ info hostd[264342] [Originator@6876 sub=Hbrsvc] Replication group (groupID=GID-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx): last delta duration 1136 ms, size 826908 (file transfers duration: 126 ms, prepare delta duration: 0 ms)
yyyy-mm-ddThh:mm:ss.msZ info hostd[264342] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 576 : Sync completed for virtual machine vmname on host esxi.domain.tld in cluster Cluster_name in ha-datacenter (826908 bytes transferred).
yyyy-mm-ddThh:mm:ss.msZ info hostd[265393] [Originator@6876 sub=Hbrsvc] ReplicationScheduler: stats updated for (groupID=GID-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx): last duration was 1s, bandwidth was 0.79 MB/s; estimated duration is now 1s, estimated bandwidth is 0.79 MB/s.
[root@esxi:/var/run/log] cat hostd.log |grep "last delta";date
yyyy-mm-ddThh:mm:ss.msZ info hostd[265381] [Originator@6876 sub=Hbrsvc] Replication group (groupID=GID-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx): last delta duration 775 ms, size 0 (file transfers duration: 360 ms, prepare delta duration: 17 ms)
yyyy-mm-ddThh:mm:ss.msZ info hostd[264342] [Originator@6876 sub=Hbrsvc] Replication group (groupID=GID-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx): last delta duration 1136 ms, size 826908 (file transfers duration: 126 ms, prepare delta duration: 0 ms)
yyyy-mm-ddThh:mm:ss.msZ info hostd[265393] [Originator@6876 sub=Hbrsvc] Replication group (groupID=GID-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx): last delta duration 1136 ms, size 826908 (file transfers duration: 126 ms, prepare delta duration: 0 ms)
yyyy-mm-ddThh:mm:ss.msZ info hostd[265381] [Originator@6876 sub=Hbrsvc] Replication group (groupID=GID-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx): last delta duration 618 ms, size 0 (file transfers duration: 118 ms, prepare delta duration: 14 ms)
yyyy-mm-ddThh:mm:ss.msZ info hostd[265393] [Originator@6876 sub=Hbrsvc] Replication group (groupID=GID-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx): last delta duration 618 ms, size 0 (file transfers duration: 118 ms, prepare delta duration: 14 ms)
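To get a quick feel for how much data the recent deltas carried, the "size" values on these lines can be totalled. A minimal sketch, assuming the "last delta duration ... size N" wording shown above; delta_summary is a hypothetical name. In the sample, the same delta appears to be logged by more than one hostd thread, so treat the total as a rough figure rather than an exact byte count.

```shell
# Hypothetical helper: count "last delta" lines from hostd.log (stdin)
# and add up the number that follows the word "size" on each line.
delta_summary() {
    awk '/last delta duration/ {
        for (i = 1; i < NF; i++) {
            if ($i == "size") {
                s = $(i + 1)
                sub(/[^0-9].*$/, "", s)   # drop anything after the digits
                total += s
                n++
            }
        }
    }
    END { printf "%d deltas, %.0f bytes\n", n, total }'
}
```

On the host it could be used as: grep "last delta" /var/run/log/hostd.log | delta_summary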
[root@esxi:/vmfs/volumes/611a2ae5-aa26b7de-6322-00505601e86b/log] date
Day Month Day HH:MM:SS UTC YYYY
[root@esxi:/var/run/log] cat hostd.log |grep "last duration"
yyyy-mm-ddThh:mm:ss.msZ info hostd[265381] [Originator@6876 sub=Hbrsvc] ReplicationScheduler: stats updated for (groupID=GID-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx): last duration was 1s, bandwidth was 0.00 MB/s; estimated duration is now 1s, estimated bandwidth is 1.00 MB/s.
yyyy-mm-ddThh:mm:ss.msZ info hostd[265393] [Originator@6876 sub=Hbrsvc] ReplicationScheduler: stats updated for (groupID=GID-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx): last duration was 1s, bandwidth was 0.79 MB/s; estimated duration is now 1s, estimated bandwidth is 0.79 MB/s.
yyyy-mm-ddThh:mm:ss.msZ info hostd[265393] [Originator@6876 sub=Hbrsvc] ReplicationScheduler: stats updated for (groupID=GID-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx): last duration was 1s, bandwidth was 0.00 MB/s; estimated duration is now 1s, estimated bandwidth is 0.79 MB/s.
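The measured bandwidth on the ReplicationScheduler lines is easier to follow as a two-column list of timestamp and MB/s. The following is a sketch under the assumption that the log wording matches the sample above; vr_bandwidth is a hypothetical name.

```shell
# Hypothetical helper: from hostd.log lines on stdin, print
# "<timestamp> <measured bandwidth in MB/s>" for each
# "ReplicationScheduler: stats updated" entry.
vr_bandwidth() {
    sed -n 's|^\([^ ]*\) .*ReplicationScheduler: stats updated.*bandwidth was \([0-9.]*\) MB/s; estimated.*|\1 \2|p'
}
```

On the host it could be used as: grep "last duration" /var/run/log/hostd.log | vr_bandwidth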
To track the vSphere Replication network traffic, replace x.x.x.x with the actual IP addresses in your environment. In this example only the last octet is shown, to make the direction of the packets clear when capturing network data.
From the vCenter command line
Name: vc_fqdn.domain.tld
Address: x.x.x.4
From vCenter (IP x.x.x.4) to local vSphere Replication on port 8043
root@vc [ ~ ]# curl -v telnet://x.x.x.5:8043
* Rebuilt URL to: telnet://x.x.x.5:8043/
* Trying x.x.x.5...
* TCP_NODELAY set
* Connected to x.x.x.5 (x.x.x.5) port 8043 (#0)

From vCenter to remote vCenter over port 443
root@vc [ ~ ]# curl -v telnet://x.x.x.2:443
* Rebuilt URL to: telnet://x.x.x.2:443/
* Trying x.x.x.2...
* TCP_NODELAY set
* Connected to x.x.x.2 (x.x.x.2) port 443 (#0)

From Host to local vSphere Replication on port 80
[root@Host:~] nc -z x.x.x.5 80
Connection to x.x.x.5 80 port [tcp/http] succeeded!

From Host to remote vSphere Replication on port 31031
[root@Host:~] nc -z x.x.x.3 31031
Connection to x.x.x.3 31031 port [tcp/*] succeeded!

From vSphere Replication to local Host on port 902
root@vr [ ~ ]# curl -v telnet://x.x.x.6:902
* Trying x.x.x.6:902...
* Connected to x.x.x.6 (x.x.x.6) port 902 (#0)
---
From vSphere Replication to local vCenter on port 80 and 443
root@vr [ ~ ]# curl -v telnet://x.x.x.4:80
* Rebuilt URL to: telnet://x.x.x.4:80/
* Trying x.x.x.4...
* TCP_NODELAY set
* Connected to x.x.x.4 (x.x.x.4) port 80 (#0)
root@vr [ ~ ]# curl -v telnet://x.x.x.4:443
* Rebuilt URL to: telnet://x.x.x.4:443/
* Trying x.x.x.4...
* TCP_NODELAY set
* Connected to x.x.x.4 (x.x.x.4) port 443 (#0)
---
From vSphere Replication to remote vCenter on port 80 and 443
root@wvr [ ~ ]# curl -v telnet://x.x.x.2:80
* Trying x.x.x.2:80...
* Connected to x.x.x.2 (x.x.x.2) port 80 (#0)
root@vr [ ~ ]# curl -v telnet://x.x.x.2:443
* Trying x.x.x.2:443...
* Connected to x.x.x.2 (x.x.x.2) port 443 (#0)
To check for an IP conflict from the vSphere Replication command line:
root@vr [ ~ ]# ifconfig -a
root@wvr [ ~ ]# nslookup <IP address>
For the purposes of this article, the vSphere Replication Appliances are vr1 and vr2.
Isolating the Network Traffic of vSphere Replication KB 78613
Log in to the vSphere Replication Appliance command line (PuTTY session).
If you see only 10-eth0.network, the ESXi host is using the default vmk0 to replicate data.
root@vr1 [ ~ ]# cd /etc/systemd/network
root@vr1 [ /etc/systemd/network ]# ls -l
-rw-r--r-- 1 root root 197 Jan 13 19:20 10-eth0.network
If you see 10-eth1.network and 10-eth2.network, there is a dedicated replication configuration on the ESXi hosts.
root@vr1 [ /etc/systemd/network ]# ls -l
-rw-r--r-- 1 root root 119 Jan 13 19:20 10-eth0.network -> Management
-rw-r--r-- 1 root root 117 Jan 13 19:20 10-eth1.network -> VR Traffic
-rw-r--r-- 1 root root 117 Jan 13 19:20 10-eth2.network -> VR NFC Traffic
Check the current ARP/network tables. The connections on port 8043 are the vSphere Replication point-to-point connections of the paired configuration between vr1 and vr2.
root@vr1 [ /etc/systemd/network ]# netstat -r |egrep -i "State|8043"
Proto Recv-Q Send-Q Local Address           Foreign Address          State
tcp6       0      0 vr1.domain.tld:33040    vr2.domain.tld:8043      ESTABLISHED
tcp6       0      0 vr1.domain.tld:36796    vr1.domain.tld:8043      ESTABLISHED
tcp6       0      0 vr1.domain.tld:8043     srm1.domain.tld:50316    ESTABLISHED
tcp6       0      0 vr1.domain.tld:8043     vr1.domain.tld:36796     ESTABLISHED
tcp6       0      0 vr1.domain.tld:8043     vr2.domain.tld:46902     ESTABLISHED
On the ESXi host, identify and make a note of these parameters: PortNum, ClientName, and the tmp directory related to the replication information you discovered in the vCenter UI for the ESXi host.
Log in to the ESXi host where the replicated VM resides to get the destination vSphere Replication IP and port number.
Run the command: vim-cmd vmsvc/getallvms |grep vm_name
Vmid   Name     File                            Guest OS     Version
11     vmname   [datastore] vmname/vmname.vmx   OS_64Guest   vmx-21
Use the Vmid to run the command, syntax: vim-cmd hbrsvc/vmreplica.getConfig <vmid>
Example: vim-cmd hbrsvc/vmreplica.getConfig 11
Retrieve VM replication configuration:
The VM is configured for replication with the following options:
    VM Replication ID = GID-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
    Destination IP Address = x.x.x.x   <--- vSphere Replication target IP, or paired vSphere Replication
    Destination Port = 31031   <--- vSphere Replication target IP port
    Recovery Point Objective = 1440
    Quiesce Guest OS = true
    Enable Opportunistic Updates = false
    Network Compression = false
    Network Encryption = false
    Paused for Replication = false
    Disk scsi0:0 is configured for replication:
        Device key = 2000
        Replication ID = RDID-d27a4619-157e-414a-9803-427f822a4de5
Log in to the ESXi host where the vSphere Replication Appliance is running.
[root@esxi:/tmp] which net-stats
/bin/net-stats
[root@esxi:/tmp]# net-stats -l
PortNum      Type  SubType  SwitchName  MACAddress          ClientName
2214592523   4     0        vSwitch0    xx:xx:xx:xx:xx:xx   vmnic0   <--- for default vmk0 the replication is on this uplink
How to find the uplink the replication VM is using on the ESXi host:
[root@esxi:/tmp]# esxcli network vm list
World ID  Name  Num Ports  Networks
--------  ----  ---------  --------
265948    vr1   1          VM Network - Management
[root@esxi:/tmp]# esxcli network vm port list -w 265948
   Port ID: 67108899
   vSwitch: vSwitch0
   Portgroup: VM Network - Management
   DVPort ID:
   MAC Address: xx:xx:xx:xx:xx:xx
   IP Address: x.x.x.x
   Team Uplink: vmnic0   <---------- vSphere Replication VM vr1 is using vmnic0
   Uplink Port ID: 2214592523
   Active Filters:
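The uplink name and uplink port ID from this output are the values needed later for the packet capture. As a sketch only (vm_uplink is a hypothetical name, and the "Key: value" layout is assumed to match the sample above), they can be extracted together:

```shell
# Hypothetical helper: print "<team uplink> <uplink port id>" from
# `esxcli network vm port list -w <world id>` output read on stdin.
# Assumes one "Key: value" pair per line, as in the sample output.
vm_uplink() {
    awk -F': *' '
        /Team Uplink:/    { uplink = $2; sub(/[ \t]+$/, "", uplink) }
        /Uplink Port ID:/ { port = $2;   sub(/[ \t]+$/, "", port) }
        END { if (uplink != "") print uplink, port }
    '
}
```

On the host it could be used as: esxcli network vm port list -w 265948 | vm_uplink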
Using the pktcap-uw tool in ESXi 5.5 and later KB 2051814
The vmnic is the uplink and the vmk is the kernel port. The PortNum is the virtual switch port id for the uplink.
To capture packets, run the pktcap-uw command at both sites simultaneously. Edit the switch port ID for the uplink and the vmnic (2214592523 and vmnic0 in this example) based on the customer's configuration found for replication.
[root@esxi:/tmp]# pktcap-uw --switchport 2214592523 -o /tmp/2214592523.pcap & pktcap-uw --uplink vmnic0 -o /tmp/vmnic0.pcap &
or
[root@esxi:/tmp]# pktcap-uw --trace --ip destination_ip > ip.pcap &
or, replacing X with the vmnic number,
[root@esxi:/tmp]# pktcap-uw --dir 2 --uplink vmnicX -o - | tcpdump-uw icmp -enr -
You can stop pktcap-uw tracing with the kill command:
kill $(lsof |grep pktcap-uw |awk '{print $1}'| sort -u)
Run this command to check that all pktcap-uw traces are stopped:
lsof |grep pktcap-uw |awk '{print $1}'| sort -u
Read the packet capture on the host with tcpdump-uw, or upload the pcap files and open them in Wireshark.
[root@esxi:/tmp]# tcpdump-uw -ttttnnr 2214592523.pcap |grep 31031