vSphere Replication takes 24 hours to start after removing a large number of hosts from the environment
Article ID: 301325
Products
VMware Live Recovery
VMware vSphere ESXi
Issue/Introduction
Symptoms: The hbr service does not start on port 8123 until 24 hours after the vSphere Replication appliance is started.
Environment
VMware vSphere Replication 6.0.x
VMware vSphere Replication 6.1.x
VMware vSphere Replication 5.6.x
VMware vSphere Replication 5.1.x
VMware vSphere Replication 6.x
VMware vSphere Replication 5.8.x
VMware vSphere Replication 5.5.x
VMware vSphere Replication 6.0 Beta
VMware vSphere Replication 6.5.x
VMware vSphere Replication 5.x
Cause
vSphere Replication stores a list of host IP addresses and tries to connect to each one very persistently. In some cases, when more than 100 hosts have changed and are no longer visible, the hbr service takes 24 hours to start.
Resolution
To resolve this issue:
Stop the hbr service by running this command: service hbrsrv stop
Take a backup of the database by running this command: cp /etc/vmware/hbrsrv.54.db /etc/vmware/hbrsrv.54.db.bak
Export the stored host IP addresses by running this query: sqlite3 /etc/vmware/hbrsrv.54.db 'SELECT addresses FROM HostInfo' > /tmp/addresses.txt
Open the /tmp/addresses.txt file in a text editor such as vi.
In vi, run this command to replace each comma with a line break so that every address is on its own line, then save the file: :%s;,;\r;g
Run this command to ping each address once and record which addresses respond: nohup cat /tmp/addresses.txt | xargs -n1 ping -c 1 > /tmp/pings.txt
Run this command to list the bad IPs (100% packet loss): cat /tmp/pings.txt | grep -B 3 "100%" | grep -o '[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}' | uniq > /tmp/badips.txt
Run this command to list the good IPs (0% packet loss): cat /tmp/pings.txt | grep -B 3 " 0%" | grep -o '[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}' | uniq > /tmp/goodips.txt
Run this command to recheck the good IPs: cat /tmp/goodips.txt | xargs -n1 ping -c 1
Run this command to recheck the bad IPs: cat /tmp/badips.txt | xargs -n1 ping -c 1
Run this command to count the bad IPs: cat /tmp/badips.txt | wc -l
Run this command to prepare the file that will hold your SQL statements: cp /tmp/badips.txt /tmp/sqlstatement.txt
Edit the file: vi /tmp/sqlstatement.txt
In vi, run this command to wrap each IP in double quotes and percent signs and to append the closing single quote: :%s/^\(.*\)$/"%\1%"'/
Add the sqlite3 SELECT command at the beginning of each line. For example: :%s!^!sqlite3 /etc/vmware/hbrsrv.54.db 'SELECT addresses FROM HostInfo WHERE addresses LIKE !
Save the file: :wq
Make the file executable: chmod +x /tmp/sqlstatement.txt
Run the queries: /tmp/sqlstatement.txt
The output should list every HostInfo entry that contains a bad IP, in the same format as the output of the first query.
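Copy the statements to a new file by running this command: cp /tmp/sqlstatement.txt /tmp/deletesql.txt
Edit the new file using this command: vi /tmp/deletesql.txt
Replace SELECT addresses with DELETE so that each statement reads DELETE FROM HostInfo WHERE addresses LIKE ...: :%s;SELECT addresses;DELETE;g
Save the file: :wq
Make the deletesql.txt file executable: chmod +x /tmp/deletesql.txt
Run the DELETE statements: /tmp/deletesql.txt
Verify the result by running the SELECT statements again: /tmp/sqlstatement.txt
Note: The command should return no results.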
Start the hbr service by running this command: service hbrsrv start
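The vi editing steps in this procedure can also be done without an editor. The following is a minimal sketch, not part of the original procedure. It assumes that /tmp/addresses.txt and /tmp/badips.txt were produced as described above and that the addresses column is comma-separated. The function names split_addresses and make_statements are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Sketch only: non-interactive equivalents of the vi substitutions.
# DB path, table name, and column name are taken from the article.
DB=/etc/vmware/hbrsrv.54.db

# Put each comma-separated address on its own line and drop empty lines
# (equivalent to the vi command :%s;,;\r;g).
split_addresses() {
    tr ',' '\n' | sed '/^[[:space:]]*$/d'
}

# Read one IP per line from stdin and print one sqlite3 command per IP.
# $1 is the leading SQL clause: "SELECT addresses" or "DELETE".
make_statements() {
    verb=$1
    while IFS= read -r ip; do
        [ -n "$ip" ] || continue   # skip blank lines
        printf "sqlite3 %s '%s FROM HostInfo WHERE addresses LIKE \"%%%s%%\"'\n" \
            "$DB" "$verb" "$ip"
    done
}

# Example usage (hbr service stopped and database backed up first):
#   split_addresses < /tmp/addresses.txt > /tmp/addresses.split
#   make_statements "SELECT addresses" < /tmp/badips.txt > /tmp/sqlstatement.txt
#   make_statements "DELETE" < /tmp/badips.txt > /tmp/deletesql.txt
```

Because the LIKE pattern matches substrings, a pattern such as "%10.0.0.5%" also matches an entry containing 10.0.0.50. Review the output of the SELECT statements before running the DELETE statements.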