Scale up MON
To improve MON scalability, the resources on the HCX Cloud Manager must be increased to the following values.
Extended HCX Cloud Manager Resource Allocation:
8 vCPUs & 24 GB RAM
The maximum supported numbers for MON in an HCX Cloud Manager are:
HCX version 4.5:
- 1000 VMs with MON enabled at any given time
- 100 Network Extensions with MON enabled
- 200 concurrent MON operations:
  - Enable or disable MON on a VM
  - Bulk migrations to MON enabled segments
  - Combination of both
HCX version 4.4 and earlier:
- 900 VMs with MON enabled at any given time
- 100 Network Extensions with MON enabled
- 100 concurrent MON operations:
  - Enable or disable MON on a VM
  - Bulk migrations to MON enabled segments
  - Combination of both
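As a rough illustration, the limits above can be encoded in a small shell function that compares planned numbers against them. This is a minimal sketch, not an HCX tool: the function name `mon_limits_check` and the profile labels `4.5` and `4.4-or-earlier` are assumptions made for this example, and the counts are supplied by the operator rather than read from HCX.

```shell
# Hypothetical helper: compare planned MON usage with the limits listed above.
# Usage: mon_limits_check PROFILE VMS EXTENSIONS CONCURRENT_OPS
#   PROFILE is "4.5" or "4.4-or-earlier" (labels invented for this sketch).
mon_limits_check() {
    profile=$1; vms=$2; extensions=$3; ops=$4
    case $profile in
        4.5)            max_vms=1000; max_ext=100; max_ops=200 ;;
        4.4-or-earlier) max_vms=900;  max_ext=100; max_ops=100 ;;
        *) echo "unknown profile: $profile"; return 2 ;;
    esac
    status=0
    if [ "$vms" -gt "$max_vms" ]; then
        echo "VMs with MON enabled: $vms exceeds $max_vms"
        status=1
    fi
    if [ "$extensions" -gt "$max_ext" ]; then
        echo "Network Extensions with MON enabled: $extensions exceeds $max_ext"
        status=1
    fi
    if [ "$ops" -gt "$max_ops" ]; then
        echo "Concurrent MON operations: $ops exceeds $max_ops"
        status=1
    fi
    if [ "$status" -eq 0 ]; then
        echo "within limits"
    fi
    return "$status"
}
```

For example, `mon_limits_check 4.5 850 60 120` prints `within limits`, while a value above any maximum prints one line per exceeded limit and returns a non-zero status.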
Increase resources on the HCX Cloud Manager
The following procedure must be used to increase resource allocation on the HCX Cloud Manager VM.
Requirements and Considerations before increasing resources on the HCX Cloud Manager
- Do NOT exceed recommended allocations as that may cause the HCX Cloud Manager to malfunction.
- Both HCX Cloud Manager and Connector must be running version HCX 4.3.2 or later.
- No specific vCenter or NSX version is required, but all compatibility restrictions still apply. Refer to the HCX User Guide.
- There should be NO active migration or configuration workflows when making these resource changes.
- Changes must be made during a scheduled Maintenance Window.
- There is NO impact to Network Extension services.
- If MON is already enabled, events will NOT be detected while the HCX Cloud Manager remains out of service.
- There is NO need to increase the disk size.
- Increasing the resource allocation on the HCX NE appliances will not have any effect on scalability.
- In a VMC environment, vCPU resources for the HCX Cloud Manager are not reserved in advance in the MgmtResourcePool associated with a given SDDC. The user must therefore ensure that the extended vCPU resources are always available in the MgmtResourcePool, both before and after the scale up, to sustain the MON scale-up requirement for the HCX Cloud Manager.
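The minimum version requirement above (HCX 4.3.2 or later on both the Cloud Manager and the Connector) can be checked with a dotted-version comparison. The sketch below assumes the version strings have already been obtained by the operator, since the article does not say how to read them; `version_ge` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed (exit 0) when version $1 is >= version $2.
# Versions are compared field by field as integers; missing fields count as 0.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, ".")
        m = split(b, y, ".")
        len = (n > m) ? n : m
        for (i = 1; i <= len; i++) {
            xi = (i <= n) ? x[i] + 0 : 0
            yi = (i <= m) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}
```

A possible use is `version_ge "$manager_version" 4.3.2 && version_ge "$connector_version" 4.3.2`, where both variables are placeholders for values the operator supplies. Comparing numerically per field avoids the string-comparison mistake of treating 4.10 as older than 4.3.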
Procedure
- Validate that the HCX Cloud Manager is running version 4.3.2 or higher.
Pre-Checks:
- Log in to the HCX Cloud Manager admin CLI >> ccli >> list
admin@hcx [ ~ ]$ ccli
Welcome to HCX Central CLI
[admin@hcx] list
|-------------------------------------------------------------------|
| Node | Id | Address | State | Selected |
|-------------------------------------------------------------------|
| TEST-IX-R1 | 0 | 10.X.X.X:9443 | Connected | |
|-------------------------------------------------------------------|
| TEST-NE-R1 | 1 | 10.X.X.X:9443 | Connected | |
|-------------------------------------------------------------------|
- Check that the following services are running on the HCX Cloud Manager:
admin@hcx [ ~ ]$ systemctl --type=service | grep "zoo\|kaf\|web\|app\|postgres"
app-engine.service loaded active running App-Engine
appliance-management.service loaded active running Appliance Management
kafka.service loaded active running Kafka
postgresdb.service loaded active running PostgresDB
web-engine.service loaded active running WebEngine
zookeeper.service loaded active running Zookeeper
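The service check above can also be automated by piping the `systemctl` output into a small filter. The following is a sketch under assumptions: `check_hcx_services` is a hypothetical function name, the six unit names are taken from the sample output in this article, and a unit counts as healthy only when its columns read `loaded active running`.

```shell
# Hypothetical helper: read `systemctl --type=service` output on stdin and
# report any of the six expected units that is not "loaded active running".
check_hcx_services() {
    awk '
        BEGIN {
            n = split("app-engine appliance-management kafka postgresdb web-engine zookeeper", want, " ")
        }
        NF == 0 { next }
        {
            # systemctl may prefix a failed unit with a marker; skip it if present.
            i = ($1 ~ /\.service$/) ? 1 : 2
            unit = $i
            sub(/\.service$/, "", unit)
            state[unit] = $(i + 1) " " $(i + 2) " " $(i + 3)
        }
        END {
            bad = 0
            for (k = 1; k <= n; k++) {
                s = (want[k] in state) ? state[want[k]] : "not listed"
                if (s != "loaded active running") {
                    print want[k] ".service: " s
                    bad = 1
                }
            }
            exit bad
        }
    '
}
```

Run as `systemctl --type=service | check_hcx_services`; it prints nothing and returns 0 when all six services are running, and otherwise lists each problem unit and returns 1.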
- Log in to the vCenter that hosts the HCX Cloud Manager.
Note: If the HCX Cloud Manager is already running with the extended compute resources, do not repeat the resource change. This avoids an unnecessary reboot of the HCX Cloud Manager.
- Shut down the VM's guest OS using the vCenter UI.
- Edit HCX Cloud Manager's VM to increase the vCPU and MEM reservations. Refer to:
- Power ON the HCX Cloud Manager VM.
- Access HCX Appliance Management Interface AUI (9443) to ensure all services are running
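The decision in the steps above (skip the change if the VM already has the extended resources, and never go beyond the recommended allocation) can be written down as a simple comparison. This is only a sketch: `check_allocation` is a hypothetical helper, and the vCPU and memory values are assumed to be read manually from the VM settings in vCenter and passed as arguments.

```shell
# Hypothetical helper: compare a VM's vCPU count and memory (in GB) with the
# extended allocation from this article (8 vCPUs, 24 GB RAM).
check_allocation() {
    vcpus=$1; mem_gb=$2
    if [ "$vcpus" -gt 8 ] || [ "$mem_gb" -gt 24 ]; then
        echo "exceeds extended allocation"
        return 1
    elif [ "$vcpus" -lt 8 ] || [ "$mem_gb" -lt 24 ]; then
        echo "below extended allocation"
        return 1
    fi
    echo "matches extended allocation"
}
```

A result of `matches extended allocation` corresponds to the case in the note where no further change or reboot is needed.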
Post-Checks:
- Log in to the HCX Cloud Manager admin CLI >> ccli >> list
admin@hcx [ ~ ]$ ccli
Welcome to HCX Central CLI
[admin@hcx] list
|-------------------------------------------------------------------|
| Node | Id | Address | State | Selected |
|-------------------------------------------------------------------|
| TEST-IX-R1 | 0 | 10.X.X.X:9443 | Connected | |
|-------------------------------------------------------------------|
| TEST-NE-R1 | 1 | 10.X.X.X:9443 | Connected | |
|-------------------------------------------------------------------|
- Check that the following services are running on the HCX Cloud Manager:
admin@hcx [ ~ ]$ systemctl --type=service | grep "zoo\|kaf\|web\|app\|postgres"
app-engine.service loaded active running App-Engine
appliance-management.service loaded active running Appliance Management
kafka.service loaded active running Kafka
postgresdb.service loaded active running PostgresDB
web-engine.service loaded active running WebEngine
zookeeper.service loaded active running Zookeeper
Important: If the HCX Cloud Manager fails to reboot, any of the services listed above fail to start, or the fleet appliances (IX/NE/SDR) become disconnected, revert the configuration changes immediately and ensure the system comes back online.
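One of the revert conditions, a disconnected fleet appliance, can be detected from a saved copy of the `list` table shown in the pre-checks and post-checks. The sketch below assumes the table layout matches the sample output in this article (pipe-separated columns with the state in the fourth column) and treats any state other than `Connected` as a problem; `check_fleet_connected` is a hypothetical helper name, and how the table is captured to a file is left to the operator.

```shell
# Hypothetical helper: read the ccli `list` table on stdin and report every
# appliance whose State column is not "Connected".
check_fleet_connected() {
    awk -F'|' '
        {
            node = $2; id = $3; state = $5
            gsub(/^[ \t]+|[ \t]+$/, "", node)
            gsub(/^[ \t]+|[ \t]+$/, "", id)
            gsub(/^[ \t]+|[ \t]+$/, "", state)
            # Data rows have a numeric Id; header, separator and banner lines do not.
            if (id !~ /^[0-9]+$/) next
            total++
            if (state != "Connected") {
                print node ": " state
                bad = 1
            }
        }
        END {
            if (total == 0) {
                print "no appliances found"
                exit 1
            }
            exit bad + 0
        }
    '
}
```

For example, `check_fleet_connected < post-check-list.txt` (the file name is a placeholder) returns 0 only when at least one appliance row is present and every row is `Connected`. Comparing its result before and after the change gives a quick signal for the revert decision.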
Recommendations for operating MON at scale
- As a best practice, use vSphere Monitoring and Performance to monitor HCX Cloud Manager CPU utilization and MEM usage.
- Do NOT exceed the recommended limits as that could cause system instability and failed workflows.
- In a scaled-up environment, expect CPU utilization to increase significantly for short periods while MON operations are being processed. There may also be a temporary delay in the UI response.
- Limit the concurrency of MON operations when making configuration changes while having active Bulk migrations into MON enabled segments.
- The regular limit per HCX Manager applies for any other Network Extensions that do NOT have MON enabled.
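One way to act on the recommendation to limit concurrency is to split a planned list of VMs into batches and process one batch at a time. The sketch below only assigns batch numbers; it does not call HCX, and `batch_vms` and the VM names are hypothetical. Batching approximates a concurrency limit only if each batch finishes before the next one starts, and active bulk migrations into MON enabled segments count toward the same limit.

```shell
# Hypothetical helper: read one VM name per line on stdin and print
# "<batch number><TAB><vm name>", with at most BATCH_SIZE VMs per batch.
# Usage: batch_vms BATCH_SIZE < vm-list.txt
batch_vms() {
    if ! [ "$1" -gt 0 ] 2>/dev/null; then
        echo "batch size must be a positive integer" >&2
        return 2
    fi
    awk -v size="$1" 'NF { printf "%d\t%s\n", int(n / size) + 1, $0; n++ }'
}
```

For instance, `batch_vms 50 < vm-list.txt` keeps each batch well under the documented concurrent-operation limits, leaving headroom for other MON operations running at the same time.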