HCX: Kafka service fails to start after increasing HCX Manager disk space on version 4.10 or higher

Article ID: 430264

Products

VMware HCX

Issue/Introduction

VMware HCX Manager services, specifically the Kafka and app-engine services, fail to initialize following a manual disk expansion of the HCX Manager appliance.
Symptoms:

The Kafka service remains in a 'failed' or 'stopped' state.

In an SSH session to the HCX Manager, the app-engine service remains stuck in the activating state because Kafka is not running:
admin@hcx-manager-hostname [ ~ ]$ systemctl status app-engine
 app-engine.service - App-Engine
     Loaded: loaded (/etc/systemd/system/app-engine.service; enabled; vendor preset: enabled)
     Active: activating (start-pre) since Thu <YYYY-MM-DD hh:mm:ss> UTC; 9min ago
Cntrl PID: 25616 (service-depende)
      Tasks: 2
     Memory: 420.0K
     CGroup: /system.slice/app-engine.service
             ├─ 9572 sleep 30
             └─25616 /bin/bash /etc/systemd/service-dependency-check.sh postgresdb database-upgrade zookeeper kafka

<MMM DD hh:mm:ss> hcx-manager-hostname service-dependency-check.sh[25616]: kafka is not running.
<MMM DD hh:mm:ss> hcx-manager-hostname service-dependency-check.sh[25616]: kafka is not running.
<MMM DD hh:mm:ss> hcx-manager-hostname service-dependency-check.sh[25616]: kafka is not running.
<MMM DD hh:mm:ss> hcx-manager-hostname service-dependency-check.sh[25616]: kafka is not running.
<MMM DD hh:mm:ss> hcx-manager-hostname service-dependency-check.sh[25616]: kafka is not running.
<MMM DD hh:mm:ss> hcx-manager-hostname service-dependency-check.sh[25616]: kafka is not running.

Logs may show I/O errors or missing directory paths.

Filesystem inspection shows that the /common/kafka-db path does not match the expected post-resize layout, for example because it was not migrated or symlinked to the /common_ext partition.
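
The service states listed in these symptoms can be checked in one pass. The following is a minimal sketch, not an official procedure. It assumes the systemd unit names match the service names passed to service-dependency-check.sh in the output above (zookeeper, kafka) plus app-engine; the summarize_state helper is hypothetical and exists only for this illustration.

```shell
#!/bin/bash
# Summarize one service state as reported by `systemctl is-active`.
# summarize_state is a hypothetical helper written for this example.
summarize_state() {
    local svc="$1" state="$2"
    case "$state" in
        active)          echo "$svc: running" ;;
        activating)      echo "$svc: still starting (possibly waiting on a dependency)" ;;
        failed|inactive) echo "$svc: not running ($state)" ;;
        *)               echo "$svc: state unknown" ;;
    esac
}

# Unit names are assumptions taken from the dependency check output above.
for svc in zookeeper kafka app-engine; do
    # is-active exits non-zero for anything but "active", so ignore the exit code.
    state=$(systemctl is-active "$svc" 2>/dev/null || true)
    summarize_state "$svc" "$state"
done
```

If kafka is reported as not running, `journalctl -u kafka` (again assuming that unit name) is one place to look for the I/O or missing-path errors mentioned above.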

Environment

VMware HCX 4.10.x, 4.11.x

Cause

The disk expansion workflow for HCX 4.10+ requires manual redirection of database directories from the /common partition to the newly expanded /common_ext partition. The failure occurs because the kafka-db and postgres-db directories were not correctly migrated or symlinked, preventing the services from accessing their data stores.
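
To see whether the data directories were redirected, it can help to check whether each one is a symlink and where it resolves. The sketch below is read-only and makes no changes to the appliance. /common/kafka-db is the path named in this article; /common/postgres-db is an assumption inferred from the directory name above, and describe_path is a hypothetical helper. The expected target under /common_ext is not stated here, so the script only reports what it finds.

```shell
#!/bin/bash
# Report whether a data directory is a symlink and where it resolves.
# describe_path is a hypothetical helper written for this example.
describe_path() {
    local path="$1"
    if [ -L "$path" ]; then
        echo "$path: symlink -> $(readlink -f "$path")"
    elif [ -d "$path" ]; then
        echo "$path: regular directory (not a symlink)"
    else
        echo "$path: missing"
    fi
}

# /common/postgres-db is an assumed path; adjust it to match your appliance.
for p in /common/kafka-db /common/postgres-db; do
    describe_path "$p"
done

# Show whether both partitions are mounted and how full they are.
df -h /common /common_ext 2>/dev/null || echo "one or both partitions not mounted"
```

A directory reported as missing, or as a regular directory when a symlink is expected after the resize, is consistent with the cause described above. Use the linked KB article for the actual corrective steps.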

Resolution

On HCX Managers running version 4.10 or later, follow the KB article "Increasing HCX Manager Disk Space for HCX Software Version 4.10 or Higher" to complete the disk expansion.

Additional Information

Increasing HCX Manager Disk Space for HCX Software Version 4.10 or Higher