Fixing VMDIR inconsistencies with SSO domain repoints

Article ID: 376443

Products

VMware vCenter Server

Issue/Introduction

  • Fix VMware Directory Service (vmdir) database and Enhanced Linked Mode (ELM) replication inconsistencies by repointing vCenter Server instances to a new Single Sign-On (SSO) domain.
  • This procedure recreates the entire vmdir structure, resolving major database errors, machine ID mismatches, and replication partner failures that cannot be fixed via standard synchronization.
  • This KB uses an example of three vCenter Servers that are repointed from their current SSO domain into a single, newly created SSO domain.
  • Global permissions, custom local SSO users and groups, and any external identity sources must be reconfigured after the repoint.
  • UUIDs, certificates, roles, inventory permissions, VM parameters, and performance data are not lost. Permissions at the vCenter object level are preserved because they are stored in the vCenter Server database (VCDB).
  • To identify or validate inconsistencies in vmdir, refer to the KB: Using vmdir_tool.py to identify vmdir/ELM replication inconsistencies.

Environment

  • vCenter Server (VCSA) 6.7
  • vCenter Server (VCSA) 7.x
  • vCenter Server (VCSA) 8.x

Cause

VMDIR database inconsistencies often result from taking unsupported VM snapshots while vCenter High Availability (VCHA) is enabled, or from restoring or reverting a snapshot of a single node within an ELM topology.

Resolution

Warning: Cross-domain repointing is not supported in VMware Cloud Foundation (VCF) environments and breaks SDDC Manager functionality.

  • Sample vCenters:
    • node_a.example.com
    • node_b.example.com
    • node_c.example.com

  • Sample SSO:
    • Domain Name: vsphere.local
    • DN:  dc=vsphere,dc=local
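
The DN follows mechanically from the domain name: each dot-separated label becomes a dc= component. A minimal shell sketch of that conversion is shown here; the function name domain_to_dn is hypothetical and is not part of any VMware tooling.

```shell
#!/bin/sh
# Convert an SSO domain name (e.g. vsphere.local) into its LDAP base DN.
# domain_to_dn is a hypothetical helper name, not a VMware utility.
domain_to_dn() {
  # Prefix the first label with "dc=" and turn every dot into ",dc=".
  printf '%s\n' "$1" | sed -e 's/^/dc=/' -e 's/\./,dc=/g'
}

domain_to_dn vsphere.local   # prints: dc=vsphere,dc=local
```
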
  1. Preparation and Data Collection

    Repointing vCenter recreates the SSO domain from scratch. Document the following items as they must be manually recreated post-repoint:
    • Global Permissions: Record all entries under Administration > Global Permissions.
    • SSO Metadata: Record custom local SSO users and groups.
    • Identity Sources: Document Active Directory/LDAP configuration.
    • LDU-GUID: On each vCenter node, run the following command and record the output:

      /usr/lib/vmware-vmafd/bin/vmafd-cli get-ldu --server-name localhost

    • Simultaneous Offline Snapshots: 

      1. Set the cluster DRS mode to Manual.
      2. Perform a Guest OS shutdown on all vCenter nodes in the ELM group.
      3. Snapshot all vCenter VMs simultaneously while powered off.
      4. Power on all nodes.

        NOTE: If a revert becomes necessary, all nodes *must* be reverted at the same time. Reverting only some of the nodes results in further replication problems. See VMware vCenter in Enhanced Linked Mode pre-changes snapshot (online or offline) best practice for more details.
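
One way to keep the recorded LDU-GUID values together is a small notes file with one line per node. The sketch below assumes a hypothetical helper named save_ldu and a hypothetical file path; only the vmafd-cli call comes from the procedure, and it appears in a comment rather than being executed.

```shell
#!/bin/sh
# Append "<node-fqdn> <ldu-guid>" to a notes file for later reference.
# save_ldu and the file name are hypothetical; only the vmafd-cli call
# shown in the comment below is taken from the procedure.
save_ldu() {
  node="$1"; guid="$2"; file="$3"
  # Refuse empty values so a failed lookup is not recorded silently.
  if [ -z "$node" ] || [ -z "$guid" ]; then
    echo "save_ldu: node and GUID are both required" >&2
    return 1
  fi
  printf '%s %s\n' "$node" "$guid" >> "$file"
}

# On each vCenter node (not executed here):
#   save_ldu "$(hostname -f)" \
#     "$(/usr/lib/vmware-vmafd/bin/vmafd-cli get-ldu --server-name localhost)" \
#     /root/ldu_guids.txt
```

Keep a copy of the resulting file outside the appliances, because a snapshot revert would discard anything written on the node after the snapshot was taken.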

  2. Repointing Procedure

    1. Repoint the First Node (e.g., node_a.example.com):

      cmsso-util domain-repoint -m execute --src-emb-admin Administrator --dest-domain-name vsphere.local

    2. Repoint Subsequent Nodes (e.g., node_b.example.com): Use the previously repointed node as the replication partner. Run the below from node_b:

      # Run Pre-check:

      cmsso-util domain-repoint -m pre-check --src-emb-admin Administrator --replication-partner-fqdn node_a.example.com --replication-partner-admin administrator --dest-domain-name vsphere.local

      # Execute Repoint:

      cmsso-util domain-repoint -m execute --src-emb-admin Administrator --replication-partner-fqdn node_a.example.com --replication-partner-admin administrator --dest-domain-name vsphere.local

    3. Continue to repoint the remaining nodes as required (e.g., node_c.example.com): Use the previously repointed node as the replication partner. Run the below from node_c:

      # Run Pre-check:

      cmsso-util domain-repoint -m pre-check --src-emb-admin Administrator --replication-partner-fqdn node_b.example.com --replication-partner-admin administrator --dest-domain-name vsphere.local

      # Execute Repoint:

      cmsso-util domain-repoint -m execute --src-emb-admin Administrator --replication-partner-fqdn node_b.example.com --replication-partner-admin administrator --dest-domain-name vsphere.local

    4. This creates a bus (linear) topology across the three vCenters: node_a, node_b, node_c.
    5. (If required) Create a replication agreement between the first and last repointed nodes (e.g., node_a and node_c) to complete a ring topology. Run the following command from node_c:

      /usr/lib/vmware-vmdir/bin/vdcrepadmin -f createagreement -2 -h node_c.example.com -H node_a.example.com -u administrator
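
The repoint order follows a simple pattern: the first node creates the new domain, each later node uses the node repointed immediately before it as its replication partner, and one optional agreement from the last node back to the first closes the ring. The sketch below only prints that sequence for an ordered list of nodes so it can be reviewed beforehand. The helper name print_repoint_plan is hypothetical, and the flags are limited to those used in the commands of this procedure.

```shell
#!/bin/sh
# Print (do not run) the repoint commands for an ordered list of nodes.
# print_repoint_plan is a hypothetical helper; the flags are the ones
# used in the procedure.
print_repoint_plan() {
  domain="$1"; shift
  first="$1"; prev=""
  for node in "$@"; do
    echo "# On $node:"
    if [ -z "$prev" ]; then
      # First node: creates the new domain, so no replication partner.
      echo "cmsso-util domain-repoint -m execute --src-emb-admin Administrator --dest-domain-name $domain"
    else
      # Later nodes: the partner is the node repointed immediately before.
      for mode in pre-check execute; do
        echo "cmsso-util domain-repoint -m $mode --src-emb-admin Administrator --replication-partner-fqdn $prev --replication-partner-admin administrator --dest-domain-name $domain"
      done
    fi
    prev="$node"
  done
  last="$prev"
  # Three or more nodes form a chain; one extra agreement closes the ring.
  if [ "$#" -ge 3 ]; then
    echo "# On $last (optional, closes the ring):"
    echo "/usr/lib/vmware-vmdir/bin/vdcrepadmin -f createagreement -2 -h $last -H $first -u administrator"
  fi
}

print_repoint_plan vsphere.local node_a.example.com node_b.example.com node_c.example.com
```

Printing rather than executing keeps the sketch safe to run anywhere; each printed command still has to be run on the node named in the preceding comment line.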

  3. Post-Repoint Cleanup

    1. Run the following commands on any one node to fix the internal Administrators group membership and the LDU-GUID recorded for each node. Replace the domain DN, node FQDNs, and LDU-GUID values with those of your environment:

      1. To get the correct LDU ID for the vCenter, run the following command:

        /opt/likewise/bin/lwregshell list_values '[HKEY_THIS_MACHINE\Services\vmdir]' | grep LduGuid

      2. To get the SSO domain that the vCenter is currently part of, run the following command:

        /usr/lib/vmware-vmafd/bin/vmafd-cli get-domain-name --server-name localhost

        Note: In this example, the SSO domain is vsphere.local, so the DN is dc=vsphere,dc=local. Change it to match your domain if required.

      3. Run the following command. If there are more than three vCenters, add an additional "dn" entry for each extra VCSA node:

        /opt/likewise/bin/ldapmodify -x -D cn=Administrator,cn=Users,dc=vsphere,dc=local -W <<EOF
        dn: CN=SystemConfiguration.Administrators,dc=vsphere,dc=local
        changetype: modify
        add: member
        member: cn=Administrators,cn=Builtin,dc=vsphere,dc=local

        dn: CN=ComponentManager.Administrators,dc=vsphere,dc=local
        changetype: modify
        add: member
        member: cn=Administrator,cn=Users,dc=vsphere,dc=local

        dn: cn=node_a.example.com,ou=Domain Controllers,dc=vsphere,dc=local
        changetype: modify
        replace: vmwLDUGuid
        vmwLDUGuid: LDU-GUID

        dn: cn=node_b.example.com,ou=Domain Controllers,dc=vsphere,dc=local
        changetype: modify
        replace: vmwLDUGuid
        vmwLDUGuid: LDU-GUID

        dn: cn=node_c.example.com,ou=Domain Controllers,dc=vsphere,dc=local
        changetype: modify
        replace: vmwLDUGuid
        vmwLDUGuid: LDU-GUID
        EOF

    2. Re-join the vCenter Server to the Active Directory domain if applicable.
    3. Recreate local SSO users/groups and Global Permissions.
    4. Remove stale global permissions from the previous domain name via the Permissions tab in the vSphere UI if applicable. 
    5. Re-register external solutions (e.g., NSX, Site Recovery Manager, Aria Operations).
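
For environments with more or fewer than three nodes, the LDIF entries used in the first cleanup step can be generated instead of edited by hand. The sketch below is one possible approach, not part of the official procedure: build_cleanup_ldif is a hypothetical helper that takes the base DN followed by fqdn=ldu-guid pairs, and the GUID values in the example call are placeholders.

```shell
#!/bin/sh
# Emit the cleanup LDIF for any number of nodes.
# build_cleanup_ldif is a hypothetical helper; arguments after the base DN
# are "fqdn=ldu-guid" pairs.
build_cleanup_ldif() {
  base="$1"; shift
  cat <<EOF
dn: CN=SystemConfiguration.Administrators,$base
changetype: modify
add: member
member: cn=Administrators,cn=Builtin,$base

dn: CN=ComponentManager.Administrators,$base
changetype: modify
add: member
member: cn=Administrator,cn=Users,$base
EOF
  for pair in "$@"; do
    node="${pair%%=*}"   # text before the first "="
    guid="${pair#*=}"    # text after the first "="
    # A truly empty line (no spaces) separates LDIF records.
    printf '\ndn: cn=%s,ou=Domain Controllers,%s\n' "$node" "$base"
    printf 'changetype: modify\nreplace: vmwLDUGuid\nvmwLDUGuid: %s\n' "$guid"
  done
}

# Example with placeholder GUIDs; substitute the values recorded for each node.
build_cleanup_ldif "dc=vsphere,dc=local" \
  "node_a.example.com=LDU-GUID-A" \
  "node_b.example.com=LDU-GUID-B"
```

The output can be reviewed and then supplied on standard input to the same ldapmodify command used in the cleanup step, in place of the inline entries. Separator lines must be completely empty, because LDIF treats a line that begins with a space as a continuation of the previous line.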

Additional Information