Update Manager consumes all memory on vCenter when utilizing vLCM desired state images with many additional components

Article ID: 312056

Products

VMware vCenter Server

Issue/Introduction

Symptoms:

During operations that trigger recommendation updates, Update Manager consumes all available memory on vCenter Server, often crashing and creating coreDump.##### files.
The cluster is configured with a desired state image that includes several additional components, especially components that are not up to date.
The /var/log/vmware/vmware-updatemgr/vum-server/vmware-vum-server.log file ends with a memory allocation failure, preceded a few minutes earlier by the recommendation engine generating image combinations:

2022-12-08T11:00:12.208 info vmware-vum-server[50005] [Originator@6876 sub=RecommendationEngine::ReUtil] [reUtil 108] GenerateImageUnitsCombination: Generating ImageUnit combinations
2022-12-08T11:05:34.821 error vmware-vum-server[50005] [Originator@6876 sub=Default] Unable to allocate memory
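
To check a log for this signature, the following is a minimal sketch that pairs each GenerateImageUnitsCombination entry with the next "Unable to allocate memory" entry. The function names are hypothetical, and the script assumes the timestamp layout shown in the excerpt above (the first 23 characters of each line); it is not a VMware-provided tool.

```python
from datetime import datetime

# Marker strings taken from the log excerpt in this article.
COMBO_MARKER = "GenerateImageUnitsCombination"
OOM_MARKER = "Unable to allocate memory"


def parse_ts(line):
    # Assumption: each entry starts with a 23-character timestamp
    # such as 2022-12-08T11:00:12.208 (any suffix after it is ignored).
    return datetime.strptime(line[:23], "%Y-%m-%dT%H:%M:%S.%f")


def find_oom_after_combination(lines):
    """Return (combination_time, oom_time) pairs found in the log lines."""
    hits = []
    last_combo = None
    for line in lines:
        if COMBO_MARKER in line:
            last_combo = parse_ts(line)
        elif OOM_MARKER in line and last_combo is not None:
            hits.append((last_combo, parse_ts(line)))
            last_combo = None
    return hits


SAMPLE = [
    "2022-12-08T11:00:12.208 info vmware-vum-server[50005] "
    "[Originator@6876 sub=RecommendationEngine::ReUtil] [reUtil 108] "
    "GenerateImageUnitsCombination: Generating ImageUnit combinations",
    "2022-12-08T11:05:34.821 error vmware-vum-server[50005] "
    "[Originator@6876 sub=Default] Unable to allocate memory",
]

if __name__ == "__main__":
    for started, failed in find_oom_after_combination(SAMPLE):
        print(f"combination started {started}, allocation failed {failed}")
```

To scan a real log, pass the lines of a copy of vmware-vum-server.log instead of SAMPLE.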

Environment

VMware vCenter Server 7.0.x
VMware vCenter Server 8.0.x

Cause

This issue occurs when the vLCM recommendation engine has too many possible component combinations to evaluate and must consume a very large amount of memory to calculate all of them. The effect is most pronounced when the additional components each have several newer versions available, because the total number of combinations grows exponentially as versions are added.
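
As a rough illustration of that growth, the sketch below counts combinations under a simplified model in which each additional component can either stay at its current version or move to any one of its newer versions. This model and the function name are assumptions made for illustration; the actual recommendation engine algorithm is not described in this article.

```python
from math import prod


def candidate_combination_count(newer_versions_per_component):
    """Count version combinations under the simplified model.

    Each component contributes (newer versions + 1) choices:
    its current version plus every newer version available.
    """
    return prod(n + 1 for n in newer_versions_per_component)


if __name__ == "__main__":
    # 5 components with 3 newer versions each
    print(candidate_combination_count([3] * 5))
    # 20 components with 3 newer versions each
    print(candidate_combination_count([3] * 20))
```

Under this model, 5 components with 3 newer versions each yield 1,024 combinations, while 20 such components yield more than a trillion, which is why both reducing the component count and keeping components current shrink the search space.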

Resolution

This issue is resolved in vSphere 8.0 Update 3.

Workaround:
Reduce the number of additional components that are manually managed in the vLCM image.
If your environment requires many additional components, keep each of them updated regularly. The more versions of a given component that become available, the more likely this issue is to occur.

Trimming down the additional components in the spec can take some time. In the meantime, the Recommendation engine (RE) may automatically trigger a new Generate RE image workflow. This happens in the following scenarios:

1. When a new desired image is set on the cluster.
2. When there is a content change in the VC Depot.

Note: When this automatic RE generation workflow starts, the same out-of-memory (OOM) issue occurs.

To temporarily stop the RE from automatically generating recommended images, follow these steps:

1. Log in to VCDB as vumuser.
Fetch the password using the following command:
cat /usr/lib/vmware-updatemgr/bin/configvalues.txt | grep db_password
Then use the returned password value with the next command:
/opt/vmware/vpostgres/current/bin/psql -d VCDB -U vumuser

Note: If accessing the database fails with FATAL: Peer authentication failed for user "vumuser", refer to the article "Accessing the vSphere Lifecycle Manager Database using vumuser fails with FATAL: Peer authentication failed for user "vumuser"".

2. Ensure that the recommendation engine tables pm_recommendation_spec and pm_recommendation_info contain no records for any cluster:
select * from pm_recommendation_spec;
select * from pm_recommendation_info;

If either table contains records, run the following commands to delete them:
delete from pm_recommendation_spec;
delete from pm_recommendation_info;

3. Restart the VUM service:
service-control --restart vmware-updatemgr

Once you have reduced the number of additional components in the spec, you can manually trigger a Generate image recommendation workflow. After that, the automatic RE image generation workflow also resumes.
