This configuration guide describes how to increase the number of concurrent Bulk/RAV migrations per HCX Manager beyond the default value, for HCX Manager systems running HCX software versions 4.7 to 4.9. For HCX Manager systems running HCX version 4.10 or higher, refer to the Knowledge article "HCX - Bulk Migration & Replication Assisted vMotion (RAV) scalability guide - for HCX version 4.10 or higher".
Baseline Migration Concurrency: Supported 300 Migrations (Bulk & RAV) per HCX Manager
vCPU | RAM (GB) | Disk Size (GB) | Tuning |
4 | 12 | 64 | N/A |
Extended Migration Concurrency: Supported 600 Migrations (Bulk & RAV) per HCX Manager
vCPU | RAM (GB) | Disk Size (GB) | Tuning |
32 | 48 | 300 | Y |
Increase resources on the HCX Connector/Cloud Manager
The following procedure must be used to increase resource allocation on both the HCX Connector and the HCX Cloud Manager VMs.
Requirements and Considerations before increasing resources on the HCX Connector & Cloud Manager
- Do NOT exceed recommended allocations as that may cause the HCX Connector/Cloud Manager to malfunction.
- Both HCX Cloud Manager and Connector must be running version HCX 4.7.0 or later.
- There should be NO active migration or configuration workflows when making these resource changes.
- Changes must be made during a scheduled Maintenance Window.
- There is NO impact to Network Extension services.
- There is NO change of concurrency for HCX vMotion/Cold Migration workflow.
- The concurrent migration limit specified for HCX Replication Assisted vMotion (RAV) applies ONLY to the Initial and Delta sync stages. During the RAV switchover stage, relocations are serviced serially, one at a time.
- Additional Service Meshes/IX appliances should be deployed for unique workload clusters to aggregate the replication capacity of multiple IX appliances. A different Service Mesh can be deployed for each workload cluster at the source and/or target.
- If there are multiple Service Meshes/IX appliances, RAV switchovers can run in parallel across them. Within each Service Mesh/IX pair, switchover is always sequential.
Procedure
IMPORTANT: It is recommended to take snapshots of the HCX Connector and Cloud Manager VMs before executing these steps.
Step 1: Increase the vCPU and memory of HCX Manager to 32 and 48GB respectively.
- Log in to the vCenter that hosts the HCX Manager.
- Shut down the guest OS of the HCX Manager VM using the vCenter UI.
- Edit HCX Manager's VM to increase the vCPU and MEM reservations. Refer to:
- Power ON the HCX Manager VM.
Step 2: Add a 300GB disk to HCX Connector & Cloud Manager.
IMPORTANT: The following steps can be used to add a 300GB disk to both HCX Managers.
Refer to "Creating a new virtual disk for an existing Linux virtual machine".
- Mount the newly created disk on each HCX Manager.
mount /dev/sdc1 /common_ext
df -hT
# Check if /common_ext has been mounted and has the correct type
- Add an entry to "/etc/fstab" to ensure the mount persists across a reboot and an HCX Manager upgrade.
vi /etc/fstab
/dev/sdc1 /common_ext ext3 rw,nosuid,nodev,exec,auto,nouser,async 1 2
Note: Use the Linux vi editor to modify the file.
1. Press the ESC key to enter normal mode.
2. Press the "i" key to enter insert mode.
3. In normal mode, type ":q!" and press Enter to exit the editor without saving the file.
4. In normal mode, type ":wq!" and press Enter to save the updated file and exit the editor.
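As a quick sanity check after editing, the fstab entry can be confirmed with a small helper. This is a minimal sketch, not part of the HCX tooling: the function name `fstab_has_mount` is hypothetical, and it only checks that some non-comment line names the mount point in its second field.

```shell
# Hypothetical helper (not an HCX command): succeed if an fstab-style
# file contains a non-comment entry for the given mount point.
fstab_has_mount() {
    local fstab_file="$1" mount_point="$2"
    # Field 2 of an fstab entry is the mount point; skip comment lines.
    awk -v mp="$mount_point" '
        $1 !~ /^#/ && $2 == mp { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$fstab_file"
}
```

For example, `fstab_has_mount /etc/fstab /common_ext && echo "entry present"` prints a confirmation only when the entry exists.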
Step 3: Stop HCX services as below:
# systemctl stop postgresdb
# systemctl stop zookeeper
# systemctl stop kafka
# systemctl stop app-engine
# systemctl stop web-engine
# systemctl stop appliance-management
Step 4: Redirect existing contents under "kafka-db" and "postgres-db" to the newly created disk.
- Move directory "/common/kafka-db" to "/common/kafka-db.bak".
cd /common
mv kafka-db kafka-db.bak
- Create a new directory "/common_ext/kafka-db".
cd /common_ext
mkdir kafka-db
Note: The contents of the Kafka directory do not need to be copied; they are regenerated after the kafka/app-engine services restart.
- Set the ownership and permissions of this directory to match those of "/common/kafka-db.bak".
chmod 755 kafka-db
chown kafka:kafka kafka-db
- Make a soft link from "/common/kafka-db" to "/common_ext/kafka-db".
cd /common
ln -s /common_ext/kafka-db kafka-db
- Move directory "/common/postgres-db" to "/common/postgres-db.bak" as a backup
cd /common
mv postgres-db postgres-db.bak
- Copy the contents of directory "/common/postgres-db.bak" to "/common_ext/postgres-db" and change the ownership to postgres.
Note: Use the "-R" option to change the ownership of "/common_ext/postgres-db" recursively, as below:
cp -r /common/postgres-db.bak /common_ext/postgres-db
chown -R postgres:postgres /common_ext/postgres-db
- Make a soft link from "/common/postgres-db" to "/common_ext/postgres-db".
cd /common
ln -s /common_ext/postgres-db postgres-db
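The backup, copy, and symlink pattern used for "postgres-db" above can be sketched as a single guarded function. This is an illustrative sketch only: `relocate_dir` is a hypothetical name, the guards (refusing to run when the source is already a symlink or the destination exists) are assumptions added for safety, and the optional third argument is the owner passed to `chown -R`.

```shell
# Hypothetical helper: back up a directory, copy it to a new location,
# and leave a symlink at the original path (same commands as Step 4).
relocate_dir() {
    local src="$1" dest="$2" owner="${3:-}"
    # Refuse to run twice: the source must be a real directory, not a link.
    [ -d "$src" ] && [ ! -L "$src" ] || { echo "skip: $src is not a real directory" >&2; return 1; }
    [ ! -e "$dest" ] || { echo "skip: $dest already exists" >&2; return 1; }
    mv "$src" "$src.bak" || return 1        # keep the original as a backup
    cp -r "$src.bak" "$dest" || return 1    # copy contents to the new disk
    if [ -n "$owner" ]; then
        chown -R "$owner" "$dest" || return 1
    fi
    ln -s "$dest" "$src"                    # old path now points at the new disk
}
```

Under these assumptions, the postgres-db step would correspond to `relocate_dir /common/postgres-db /common_ext/postgres-db postgres:postgres`. The kafka-db step differs because its contents are not copied, so it is not covered by this sketch.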
Step 5: Start HCX services as below:
# systemctl start postgresdb
# systemctl start zookeeper
# systemctl start kafka
# systemctl start app-engine
# systemctl start web-engine
# systemctl start appliance-management
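Because the same six services are stopped and started several times in this guide, the sequence can be wrapped in a loop. The following is a sketch under stated assumptions: `hcx_services` and `HCX_SERVICES` are hypothetical names, the service order is simply the order listed in this article, and the optional second argument replaces `systemctl` with another command (for example `echo`) for a dry run.

```shell
# Service names in the order this article lists them.
HCX_SERVICES="postgresdb zookeeper kafka app-engine web-engine appliance-management"

# Hypothetical helper: apply one action (stop/start) to each service in order.
# The second argument is the command to run; it defaults to systemctl.
hcx_services() {
    local action="$1" runner="${2:-systemctl}" svc
    for svc in $HCX_SERVICES; do
        "$runner" "$action" "$svc" || { echo "failed: $action $svc" >&2; return 1; }
    done
}
```

Running `hcx_services stop echo` prints the six `stop <service>` lines without touching any service, which is a cheap way to review the order before running `hcx_services stop` as root.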
Performance Tuning on the HCX Manager
In addition to increasing HCX resources, you must perform the following tuning steps to scale concurrent migrations.
IMPORTANT: The steps performed in this procedure are not persisted after an HCX Manager upgrade.
Procedure
Step 6: Stop HCX services again.
Log in to the HCX Connector/Cloud Manager root console.
# systemctl stop postgresdb
# systemctl stop zookeeper
# systemctl stop kafka
# systemctl stop app-engine
# systemctl stop web-engine
# systemctl stop appliance-management
Step 7: Increase the memory allocation for the app-engine framework.
- Edit the "app-engine-start" file to increase the Java memory allocation and max perm size.
vi /etc/systemd/app-engine-start
JAVA_OPTS="-Xmx4096m -Xms4096m -XX:MaxPermSize=1024m ...
Step 8: Increase thread pooling for Mobility Migration services.
- Edit "MobilityMigrationService.zql" and "MobilityTransferService.zql" to increase thread numbers.
vi /opt/vmware/deploy/zookeeper/MobilityMigrationService.zql
"numberOfThreads": "50",
vi /opt/vmware/deploy/zookeeper/MobilityTransferService.zql
"numberOfThreads":50,
Step 9: Increase the message size limit for the Kafka framework.
- Edit "vchsApplication.zql" and update "kafkaMaxMessageSizeBytes" from "2097152" to "4194304".
vi /opt/vmware/deploy/zookeeper/vchsApplication.zql
"kafkaMaxMessageSizeBytes":4194304
- Edit the Kafka "server.properties" file and update "message.max.bytes" from "2097152" to "4194304".
vi /etc/kafka/server.properties
message.max.bytes=4194304
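For the properties-file edit, a scripted alternative to editing in vi might look like the sketch below. `set_property` is a hypothetical helper, not an HCX or Kafka tool. It assumes the key already exists on its own `key=value` line (as this step implies) and that the new value contains no characters special to sed. It keeps the previous file as `<file>.bak`.

```shell
# Hypothetical helper: replace the value of an existing key=value line.
# Fails (returns 1) if the key is not present, rather than appending it.
set_property() {
    local file="$1" key="$2" value="$3" escaped_key
    # Escape dots so the key is matched literally in the regular expression.
    escaped_key=$(printf '%s' "$key" | sed 's/\./\\./g')
    grep -q "^${escaped_key}=" "$file" || { echo "key not found: $key" >&2; return 1; }
    # -i.bak edits in place and leaves a backup copy next to the file.
    sed -i.bak "s/^${escaped_key}=.*/${key}=${value}/" "$file"
}
```

Under those assumptions, the edit in this step would be `set_property /etc/kafka/server.properties message.max.bytes 4194304`. The ".zql" files use a different syntax and are not handled by this sketch.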
Step 10: Start HCX services.
# systemctl start postgresdb
# systemctl start zookeeper
# systemctl start kafka
# systemctl start app-engine
# systemctl start web-engine
# systemctl start appliance-management
Step 11: Verify that the following services are running on the HCX Connector/Cloud Manager:
admin@hcx [ ~ ]$ systemctl --type=service | grep "zoo\|kaf\|web\|app\|postgres"
app-engine.service loaded active running App-Engine
appliance-management.service loaded active running Appliance Management
kafka.service loaded active running Kafka
postgresdb.service loaded active running PostgresDB
web-engine.service loaded active running WebEngine
zookeeper.service loaded active running Zookeeper
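The listing above can also be checked mechanically. The sketch below reads `systemctl --type=service` style output on standard input and prints the name of each expected service that is not shown as "loaded active running". `missing_services` is a hypothetical helper name, and the list of six services is taken from the sample output in this step.

```shell
# Hypothetical helper: read a systemctl service listing on stdin and print
# each expected HCX service that is not "loaded active running".
missing_services() {
    local listing svc
    listing=$(cat)
    for svc in app-engine appliance-management kafka postgresdb web-engine zookeeper; do
        printf '%s\n' "$listing" | awk -v unit="$svc.service" '
            $1 == unit && $2 == "loaded" && $3 == "active" && $4 == "running" { ok = 1 }
            END { exit(ok ? 0 : 1) }
        ' || echo "$svc"
    done
}
```

Typical usage would be `systemctl --type=service | missing_services`. Empty output means all six services were found in the running state.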
IMPORTANT: If the HCX Manager fails to reboot OR any of the services listed above fail to start, revert the configuration changes immediately and ensure the system comes back online. Snapshots can also be used to revert the above configuration in case of any failure while applying the steps.
Note: Reverting to a snapshot does not restore the compute resources (vCPU/memory) of the HCX Connector/Cloud Manager. If needed, follow "Step 1" to restore the vCPU and memory of the HCX Manager to the baseline values of "4" and "12GB" respectively.