Fetching precheck fails in SDDC Manager with error "failed due to missing VMware Cloud Foundation interoperability data" or "500 Internal server error"

Article ID: 385929

Products

VMware SDDC Manager

Issue/Introduction

Prechecks cannot be run from SDDC Manager. The attempt fails with the error: Fetching precheck report failed due to missing VMware Cloud Foundation interoperability data

Refreshing the SDDC Manager UI lets you proceed, but running the precheck again fails with the error: 500 Internal server error

The following error appears in the SDDC Manager lcm-debug.log:

2025-01-13T14:08:23.202+0000 ERROR [vcf_lcm,1d414c2260364ef2,002d] [c.v.e.s.e.h.LocalizableRuntimeExceptionHandler,http-nio-127.0.0.1-7400-exec-8] [M5P3A3]  
ASSESSMENT_INTERNAL_ERROR Operation failed, check operations manager service health, check lcm and operations manager log files com.vmware.evo.sddc.lcm.model.error.LcmException: Operation failed, check operations manager service health, check lcm and operations manager log files
        at com.vmware.evo.sddc.lcm.services.impl.AssessmentPrechecksServiceImpl.toLcmException(AssessmentPrecheksServiceImpl.java:266)

Environment

VMware Cloud Foundation 5.2.x

Cause

The SDDC Manager root partition is full. If you reboot SDDC Manager, postgres or another service fails to start.

The output of journalctl -xe contains entries similar to the following (in this example, the failing service is postgres):

Jan 13 15:19:35  sh[4478]: + exit 0
Jan 13 15:19:35  pg_ctl[4483]: 2025-01-13 15:19:35.276 UTC 67852f07.1183 0    4483 1FATAL:  could not open log file "/var/log/postgres/postgresql-xxx.log": No space left on device.
Jan 13 15:19:35 pg_ctl[4483]: 2025-01-13 15:19:35.277 UTC 67852f07.1183 0    4483 2LOG:  database system is shut down
Jan 13 15:19:35 pg_ctl[4481]: pg_ctl: could not start server
Jan 13 15:19:35 pg_ctl[4481]: Examine the log output.
Jan 13 15:19:35 systemd[1]: postgres.service: Control process exited, code=exited, status=1/FAILURE
Jan 13 15:19:35 systemd[1]: postgres.service: Failed with result 'exit-code'.

The unit postgres.service has entered the 'failed' state with result 'exit-code'.
Jan 13 15:19:35 systemd[1]: Failed to start Postgres.
Subject: A start job for unit postgres.service has failed

Note: The partition that is at 100% utilization and the service that fails to start may vary between environments.
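
To confirm that a full partition is what stopped a service, the journal can be searched for the "No space left on device" message seen in the excerpt above. The following is a minimal sketch; the function name enospc_lines is hypothetical and the filter simply matches that literal message.

```shell
#!/bin/sh
# Hypothetical helper: keep only journal lines that report an exhausted filesystem.
# Reads journal text on stdin, so it works on live output or on a saved file.
enospc_lines() {
    grep -F 'No space left on device'
}
```

On a live system this could be used as journalctl -b --no-pager | enospc_lines, and systemctl --failed lists the units that are currently in the failed state.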

Resolution

Free up space on the full partition on SDDC Manager, then run the prechecks again. They should complete successfully.

Note: Take a snapshot before you proceed with making any changes or removing files.

1. SSH to SDDC Manager as the vcf user, then elevate to the root user.

2. Run the following command to check partition usage. If a partition is at 100%, space must be freed on it.

df -h

3. Inspect each directory to identify the folders and logs that use the most space, using the following commands:

To estimate file space usage from the root: du -h / --max-depth 1

To estimate usage for a specific partition: du -h <partition> --max-depth 1 (for example, du -h /data --max-depth 1 for the /data partition, or du -h /nfs/vmware/vcf/nfs-mount/ --max-depth 1 for the /nfs/vmware/vcf/nfs-mount partition)

To review the contents of a directory sorted by size, largest first, run: du -sh * | sort -hr

4. After freeing space, reboot SDDC Manager and retry the precheck.
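
Steps 2 and 3 can be sketched as two small shell helpers. This is an illustration, not part of the documented procedure: the function names full_partitions and largest_dirs and the 90% default threshold are assumptions, and the du options assume GNU coreutils.

```shell
#!/bin/sh
# Illustrative helpers only; the function names are hypothetical.

# Print "mountpoint usage" for every filesystem at or above a usage threshold.
# Reads `df -P` output on stdin; assumes mount points contain no spaces.
full_partitions() {
    threshold="${1:-90}"
    awk -v t="$threshold" 'NR > 1 {
        use = $5
        sub(/%$/, "", use)          # strip the trailing percent sign
        if (use + 0 >= t) print $6, $5
    }'
}

# List the largest entries directly under a directory, biggest first.
# -x keeps du on one filesystem; -k reports KiB so sort -n orders correctly.
# The first line of output is the total for the directory itself.
largest_dirs() {
    dir="${1:-/}"
    count="${2:-10}"
    du -xk --max-depth=1 "$dir" 2>/dev/null | sort -nr | head -n "$count"
}
```

For example, df -P | full_partitions 90 prints every partition at 90% usage or more, and largest_dirs /var/log 10 prints the ten largest entries under /var/log. Reporting sizes in KiB avoids the ordering problems that arise when human-readable sizes are sorted numerically.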
