Missing solution users from vCenter for the respective machine ID after running fixpsc

Article ID: 417625

Updated On:

Products

VMware vCenter Server

Issue/Introduction

  • After running the fixpsc script, the machine ID in vCenter changed, but the solution users for the new machine ID were not present in the respective vCenter. This was confirmed by running the following command:

/usr/lib/vmware-vmafd/bin/dir-cli service list

  • This leads to inconsistencies between the new machine ID and the configuration files, service registrations, and certificates still associated with the old machine ID.
  • Permission issues occur, such as hosts that cannot be put into maintenance mode, vCenter running sluggishly, and failures when editing virtual machines.
  • The following error is seen while running vCert to update the vCenter Server solution user certificates with the new machine ID:
    Operation failed: Unable to update machine-<VCenter UUID> solution user certificate in VMDir.
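
The check for missing solution users can be scripted. The following is a minimal sketch and not part of the fixpsc or vCert tooling: missing_solution_users is a hypothetical helper that reads a saved "dir-cli service list" listing on standard input and prints every expected solution user name that does not appear for the given machine ID. The name prefixes passed as arguments (for example machine, vsphere-webclient, vpxd, vpxd-extension) are an assumption and differ between vCenter versions.

```shell
# Hypothetical helper: print the solution user names that are expected for a
# machine ID but absent from a `dir-cli service list` listing read from stdin.
# Usage: missing_solution_users MACHINE_ID PREFIX [PREFIX...]
missing_solution_users() {
    local machine_id="$1" listing prefix
    shift
    listing="$(cat)"                      # keep the listing so it can be searched once per prefix
    for prefix in "$@"; do
        # Fixed-string, case-insensitive match on "<prefix>-<machine ID>"
        if ! printf '%s\n' "$listing" | grep -qiF -- "${prefix}-${machine_id}"; then
            printf '%s-%s\n' "$prefix" "$machine_id"
        fi
    done
}

# Example (run on the vCenter appliance; the prefix list is an assumption):
#   /usr/lib/vmware-vmafd/bin/dir-cli service list > /storage/core/solution-users.txt
#   missing_solution_users "<new machine ID>" machine vsphere-webclient vpxd vpxd-extension \
#       < /storage/core/solution-users.txt
```

An empty result means every listed prefix has a solution user for that machine ID. Any printed name is a candidate for recreation in the resolution steps.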

Cause

  • The fixpsc script updated the vCenter’s machine ID in vmafd and vmdir, but related configuration files, solution users and service registrations were not automatically updated.
  • Since service and certificate records in vCenter are tightly coupled with the machine ID, any mismatch prevents proper service registration and certificate renewal.
  • Evidence from /etc/vmware-vpx/vpxd.cfg, /etc/vmware/install-defaults/sca.hostid, and Lookup Service (lstool.py) confirmed lingering references to the old machine ID, resulting in the failure of solution user registration and certificate renewal processes.
     
    • Command used to export the service registrations to a file with lstool.py:
      • /usr/lib/vmware-lookupsvc/tools/lstool.py list --url https://localhost/lookupservice/sdk --no-check-cert > /storage/core/psc.txt
    • Count the service registrations per service type and owner ID to compare the old and new machine IDs:
      • awk 'BEGIN{IGNORECASE=1} /Service Type:/ {st=$0; for(i=1;i<=6;i++){ if(getline>0 && $0 ~ /Owner ID:/){ sub(/.*Service Type:[ \t]*/, "", st); oid=$0; sub(/.*Owner ID:[ \t]*/, "", oid); key=st "|" oid; counts[key]++; if(length(st)>max) max=length(st); break } }} END {PROCINFO["sorted_in"]="@ind_str_asc"; for(k in counts){ n=counts[k]; split(k, parts, "|"); svc=parts[1]; oid=parts[2]; gsub(/^[ \t]+|[ \t]+$/, "", svc); gsub(/^[ \t]+|[ \t]+$/, "", oid); printf "%-4s Service Type: %-*s | Owner ID: %s\n", n, max, svc, oid } }' /storage/core/psc.txt
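
For a quicker comparison than the full per-service breakdown, the exported file can be searched for owner IDs that reference one machine ID. This is a rough sketch only: count_owner_refs is a hypothetical helper, and it assumes that each "Owner ID:" line in the lstool.py export embeds the machine ID of the owning solution user.

```shell
# Hypothetical helper: count the "Owner ID:" lines in an lstool.py export
# that reference the given machine ID.
# Usage: count_owner_refs EXPORT_FILE MACHINE_ID
count_owner_refs() {
    local file="$1" machine_id="$2"
    # grep -c prints 0 and exits non-zero when nothing matches; "|| true"
    # keeps the function's exit status clean in that case.
    grep -i 'Owner ID:' "$file" | grep -ciF -- "$machine_id" || true
}

# Example (machine IDs are placeholders):
#   count_owner_refs /storage/core/psc.txt "<old machine ID>"
#   count_owner_refs /storage/core/psc.txt "<new machine ID>"
```

A non-zero count for the old machine ID together with a zero count for the new one is consistent with the lingering references described above.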

Resolution

Important: read and understand the following recommendations before performing the resolution steps:

  • When multiple vCenter Server Appliances (VCSA) in the same Single Sign-On domain replicate in Enhanced Linked Mode (ELM), there is a high potential for corruption of the domain if snapshots of the appliances are taken while they are running. Offline snapshots are very strongly recommended in ELM deployments as a safe rollback point. This means all appliances should be gracefully shut down, and the snapshots must be taken while all of the VCSAs are powered off at the same time.
  • If any change must be reverted, restore all of the nodes in the ELM deployment to this offline/consistent snapshot state. Only start powering the restored nodes back on after all of them have been restored from the snapshots.
  • Doing otherwise can and will introduce inconsistencies between the local VM Directory instances of the embedded platform service controllers, which will prevent the nodes from successfully replicating with each other.

To resolve the issue, follow the steps below:

  • Update the vCenter configuration file (/etc/vmware-vpx/vpxd.cfg) with the new machine ID:
    • Check the current machine ID, which is the new machine ID:
      • /opt/likewise/bin/lwregshell ls "[HKEY_THIS_MACHINE\Services\vmdir]" | grep MachineGuid | awk '{print $2,$NF}'

"MachineGuid" "38#######-####-####-####-##########5#7"

      • Get the old machine ID from the vpxd service account name:
        • cat /etc/vmware-vpx/vpxd.cfg | grep -i "vpxd-"

vpxd-32#######-####-####-####-##########d#4

      • The UUID in the vpxd service account name should match the new machine ID. Since there is a mismatch, update the vpxd service account name in vpxd.cfg so that it contains the new machine ID:
        • Back up the vpxd.cfg file:
          • cp /etc/vmware-vpx/vpxd.cfg /storage/core/vpxd.cfg
        • Edit the vpxd.cfg file:
          • vi /etc/vmware-vpx/vpxd.cfg

         <name>vpxd-38#######-####-####-####-##########5#7@<SSO domain></name>

  • Back up the /etc/vmware/install-defaults/sca.hostid file, then update it with the new machine ID:
    • cp /etc/vmware/install-defaults/sca.hostid /storage/core/sca.hostid && echo "38#######-####-####-####-##########5#7" > /etc/vmware/install-defaults/sca.hostid
  • Renew the vCenter service registrations using the lsdoctor script:

    • python lsdoctor.py -r

  • Reboot the vCenter appliance.

  • Recreate the solution users on vCenter using:

    • python lsdoctor.py -u

  • Reset the vCenter solution user certificates using the vCert script (options 3. Manage certificates > 2. Solution User certificates).
  • Restart the vCenter services:
    • service-control --stop --all && service-control --start --all
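
The two file edits in the steps above can be verified with a short script, both before making changes and after the final restart. This is a minimal sketch under stated assumptions: vpxd_cfg_matches and hostid_matches are hypothetical helpers, the vpxd service account is assumed to appear in vpxd.cfg as "vpxd-" followed by a standard 8-4-4-4-12 hexadecimal UUID, and sca.hostid is assumed to hold the machine ID on its first line.

```shell
# Hypothetical helper: succeed when the first vpxd-<UUID> account name found
# in the given vpxd.cfg carries the given machine ID (case-insensitive).
vpxd_cfg_matches() {
    local cfg="$1" machine_id="$2" account wanted
    account="$(grep -oiE 'vpxd-[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}' "$cfg" \
        | head -n 1 | tr '[:upper:]' '[:lower:]')"
    wanted="vpxd-$(printf '%s' "$machine_id" | tr '[:upper:]' '[:lower:]')"
    [ "$account" = "$wanted" ]
}

# Hypothetical helper: succeed when the first line of the host ID file equals
# the given machine ID (whitespace and case are ignored).
hostid_matches() {
    local file="$1" machine_id="$2" current wanted
    current="$(head -n 1 "$file" | tr -d '[:space:]' | tr '[:upper:]' '[:lower:]')"
    wanted="$(printf '%s' "$machine_id" | tr '[:upper:]' '[:lower:]')"
    [ "$current" = "$wanted" ]
}

# Example (the machine ID is a placeholder):
#   id="<new machine ID>"
#   vpxd_cfg_matches /etc/vmware-vpx/vpxd.cfg "$id" || echo "vpxd.cfg still references another machine ID"
#   hostid_matches /etc/vmware/install-defaults/sca.hostid "$id" || echo "sca.hostid still references another machine ID"
```

Both helpers only read the files, so they are safe to run at any point in the procedure.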

Note: If assistance is needed with the above steps, reach out to Broadcom support.