After Upgrading or Installing vCenter Server 6.7 U3b or vCenter Server 7.0, lsassd Frequently Core Dumps and Users Fail to Login with Invalid Credentials

Article ID: 318866

Products

VMware vCenter Server

Issue/Introduction

  • Users cannot log in; the login fails with an invalid credentials error.

    /var/log/messages shows the following errors for offline domains:
 
[YYYY-MM-DDTHH:MM:SS] vCenterFQDN lsassd[48897]: 0x7f3dd0fcb700:Domain 'DomainFQDN' is now offline
[YYYY-MM-DDTHH:MM:SS] vCenterFQDN lsassd[48897]: 0x7f3dd0fcb700:Detected domain 'DomainFQDN' offline. Some group information from this domain might be missing.

    /var/log/messages shows the following errors indicating lsassd has crashed:
[YYYY-MM-DDTHH:MM:SS] vCenterFQDN lwsmd: Restarting dead service: lsass (attempt 1/2)
[YYYY-MM-DDTHH:MM:SS] vCenterFQDN lwsmd: Starting service: lsass

    /var/core directory has multiple lsassd core files. e.g. core.lsassd.1541
 
  • Check whether the messages log contains one or more "is now offline" entries, for example with the following command:
# grep " is now offline" /var/log/vmware/messages | less
 
  • Alternatively, 'cat' the messages log and search for entries of the form "<domain name> is now offline":
 
[YYYY-MM-DDTHH:MM:SS] vCenterFQDN lsassd[64669]: 0x7effcffff700:Domain 'domain.com' is now offline
2020-04-21T15:16:10.207009+00:00 vCenterFQDN lsassd[64880]: 0x7f7619ee6700:Domain 'domain2.com' is now offline
[YYYY-MM-DDTHH:MM:SS] vCenterFQDN lsassd[64880]: 0x7f7619ee6700:Domain 'domain3.com' is now offline
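
The log search above can be reduced to a list of distinct offline domains. The following is a minimal sketch, not a VMware-provided tool: the function name list_offline_domains is hypothetical, and the default log path is assumed from the grep command in this article (pass /var/log/messages or another path as the first argument if your logs are elsewhere).

```shell
#!/bin/bash
# list_offline_domains: hypothetical helper, not part of vCenter or Likewise.
# Prints each domain that lsassd reported as "is now offline", once, sorted.
# $1: log file to scan (assumed default: /var/log/vmware/messages).
list_offline_domains() {
    local log_file="${1:-/var/log/vmware/messages}"
    # grep -o prints only the matching part, so several entries that ended up
    # on one physical line are still reported separately.
    grep -o "Domain '[^']*' is now offline" "$log_file" \
        | cut -d"'" -f2 \
        | sort -u
}
```

Running list_offline_domains with no argument scans the assumed default log; the resulting names are the candidates for the exclusion list described in the Workaround.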



Environment

VMware vCenter Server 7.0.x
VMware vCenter Server 6.7.x
VMware vCenter Server Appliance 6.7.x

Cause

This issue was introduced in vCenter Server 6.7 U3b (15129973) by a change to how Likewise handles offline domains. When a trusted domain is offline, Likewise can return a partial set of group memberships, or none at all, for any user associated with that domain through group membership. This issue also affects vCenter Server 7.0 GA.

Resolution

This issue is resolved in vCenter Server 6.7 U3g and vCenter Server 7.0b. To download, see Download Broadcom products and software.

Workaround:
Before executing the workaround, take offline (powered-off) snapshots of all Platform Services Controllers (PSCs) and vCenter Servers. This is standard best practice before making any manual changes to the PSC VMDIR database.

  1. Log in using SSH to an impacted external PSC or embedded VCSA.
  2. Exclude the offline domains by adding them to DomainManagerExcludeTrustsList.
/opt/likewise/bin/lwregshell set_value '[HKEY_THIS_MACHINE\Services\lsass\Parameters\Providers\ActiveDirectory]' "DomainManagerExcludeTrustsList" "Offline domain FQDN" "Offline domain FQDN"

For example,

/opt/likewise/bin/lwregshell set_value '[HKEY_THIS_MACHINE\Services\lsass\Parameters\Providers\ActiveDirectory]' "DomainManagerExcludeTrustsList" "NASA1.domain.cloud" "APJ1.domain.cloud"

Note: To gather domains that are offline, refer to messages in the symptoms of this KB (/var/log/messages) or run /opt/likewise/bin/lw-lsa get-status.
 
  3. Restart Likewise.
/opt/likewise/bin/lwsm restart lwreg
 
  4. Check that DomainManagerExcludeTrustsList in the registry now contains the excluded domains.
/opt/likewise/bin/lwregshell list_values '[HKEY_THIS_MACHINE\Services\lsass\Parameters\Providers\ActiveDirectory]'
 
  5. Clear the cache.
/opt/likewise/bin/lw-lsa ad-cache --delete-all
 
  6. Try to log in as the user that was failing.
  7. Confirm that the group memberships are correct.
/opt/likewise/bin/lw-lsa list-groups-for-user <username>
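
Because the exclusion command takes one quoted argument per offline domain, it is easy to mistype when several domains are involved. The following is a minimal sketch that only prints the commands from the steps above for review, without running any of them. The function name print_workaround_commands is hypothetical; the paths, registry key, and value name are copied from this article, and <username> is left as a placeholder to fill in by hand.

```shell
#!/bin/bash
# print_workaround_commands: hypothetical helper, not a VMware-provided tool.
# It PRINTS the workaround commands for review and executes nothing.
# Arguments: one or more offline domain FQDNs.
print_workaround_commands() {
    if [ "$#" -eq 0 ]; then
        echo "usage: print_workaround_commands <offline-domain-fqdn>..." >&2
        return 1
    fi
    local bin='/opt/likewise/bin'
    local key='[HKEY_THIS_MACHINE\Services\lsass\Parameters\Providers\ActiveDirectory]'
    local quoted='' domain
    # Wrap every domain in double quotes, as in the example command above.
    for domain in "$@"; do
        quoted="$quoted \"$domain\""
    done
    # Exclude the offline domains.
    printf '%s\n' "$bin/lwregshell set_value '$key' \"DomainManagerExcludeTrustsList\"$quoted"
    # Restart, as in the restart step.
    printf '%s\n' "$bin/lwsm restart lwreg"
    # Verify the registry value.
    printf '%s\n' "$bin/lwregshell list_values '$key'"
    # Clear the cache.
    printf '%s\n' "$bin/lw-lsa ad-cache --delete-all"
    # Confirm group memberships (replace <username> before running).
    printf '%s\n' "$bin/lw-lsa list-groups-for-user <username>"
}
```

Calling print_workaround_commands "NASA1.domain.cloud" "APJ1.domain.cloud" prints, as its first line, the same set_value command shown in the example above. Take the snapshots described earlier before running any of the printed commands.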