vCenter services not starting and /storage/core partition fills with core.java heap dumps

Article ID: 435024

Updated On:

Products

VMware vCenter Server

Issue/Introduction

After a storage array failure or unexpected power loss, the following symptoms may be observed on the vCenter Server Appliance (VCSA):

 

  • Critical services such as vc-ws1a-broker and vmware-updatemgr fail to start.
  • Log files (e.g., vpxd.log, ws1a-broker log directory) contain errors referencing AuditContext with null values for ID, trace, or tenant: 
    • 'Internal error during AuditContext initialization: ID is null, trace is null, tenant is null.'
  • Because a service is failing to start, search the vmon logs for that service (in this case, vc-ws1a-broker):
    • grep -i vc-ws1a-broker /var/log/vmware/vmon/vmon*.log
    • Results that indicate an issue:
      • <vc-ws1a-broker> Skip service health check. State STOPPED, Curr request 0
      • <vc-ws1a-broker> Service api healthcheck command returned unknown exit code 1
  • Example message from /var/log/vmware/vpxd/vpxd.log (current log):
    • Date/Time stamp info vpxd[406171] [Originator@6876 sub=vmomi.soapStub[29] opID=LicenseClientProcessInventoryLoadedAsync-4eccd72d] SOAP request returned HTTP failure; <<cs p:00007f74940021d0, TCP:localhost:1080>, /ls/sdk>, method: searchNotifications; code: 500(Internal Server Error); fault: (cis.license.fault.NotAuthenticatedFault) {
      -->    faultCause = (vmodl.MethodFault) null,
      -->    faultMessage = <unset>
      -->    msg = "Received SOAP response fault from [<<cs p:00007f74940021d0, TCP:localhost:1080>, /ls/sdk>]: searchNotifications
      --> Authentication result: Missing session auth data"
      --> }
      Date/Time stamp info vpxd[406171] [Originator@6876 sub=vmomi.soapStub[29] opID=LicenseClientProcessInventoryLoadedAsync-4eccd72d] SOAP request returned HTTP failure; <<cs p:00007f74940021d0, TCP:localhost:1080>, /ls/sdk>, method: querySystemTime; code: 500(Internal Server Error); fault: (cis.license.fault.NotAuthenticatedFault) {
      -->    faultCause = (vmodl.MethodFault) null,
      -->    faultMessage = <unset>
      -->    msg = "Received SOAP response fault from [<<cs p:00007f74940021d0, TCP:localhost:1080>, /ls/sdk>]: querySystemTime
      --> Authentication result: Missing session auth data"
      --> }
  • Another example from vpxd.log:
    • Date/Time stamp error vpxd[292205] [Originator@6876 sub=Default opID=wcp-vCLS-1] [VpxLRO] -- ERROR lro-7445 -- 52ad7bb0-969c-718a-6ad0-e2b69b2de488(52dcf195-217e-9c87-9b90-1b02219b57fb) -- VpxSettings -- vim.option.OptionManager.queryView: :vim.fault.InvalidName
      --> Result:
      --> (vim.fault.InvalidName) {
      -->    faultCause = (vmodl.MethodFault) null,
      -->    faultMessage = <unset>,
      -->    name = "config.vcls.clusters.",
      -->    entity = <unset>
      -->    msg = ""
      --> }
      --> Args:
      -->
      --> Arg name:
      --> "config.vcls.clusters."
  • Example message from /var/log/vmware/ws1a-broker/accesscontrol-service.log
    • [Date/time stamp] WARN  vCenterServerName:accesscontrol (vert.x-eventloop-thread-27) [-;-;-;-;-;-;-] com.vmware.vidm.audit.context.AuditContext - AuditContext has not been cleared for PT[numerical value indicating uptime of service] (Enable debug to investigate) - Audit[Id] [Unique ID], Timestamp: (Epoch Time), TenantId:HWS] / Request[Id:null] Trace:null, Tenant:null, Thread: vert.x-eventloop-thread-27] StackTop - {Enable debug}
  • Example message from other service logs under /var/log/vmware/ws1a-broker/:
    • Time/Date stamp,550 WARN  vCenterName:crypto (vert.x-eventloop-thread-1) [;;;;] com.vmware.vidm.common.async.RetryCompletableFuture - Failed after max retries: 0 java.util.concurrent.CompletionException: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:10114
  • Running the vSphere Diagnostic Tool (VDT) or vCert may indicate certificate parity issues or solution user inconsistencies that cannot be resolved through standard scripts.
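
The log checks above can be gathered into a few helper functions. This is a minimal sketch, not a supported tool. The function names are hypothetical, the match patterns are taken from the sample messages in this article, and the `core.java*` file name pattern is an assumption based on the article title.

```shell
# Sketch: helpers for spotting the symptoms described in this article.
# Function names are illustrative; patterns come from the sample log lines above.

# Print vmon log lines showing a skipped or failed health check for one service.
# $1 = service name (e.g. vc-ws1a-broker); remaining args = log files to search.
scan_vmon() {
    local service="$1"
    shift
    grep -hiF -- "<${service}>" "$@" 2>/dev/null |
        grep -E 'Skip service health check|healthcheck command returned unknown exit code'
}

# Count AuditContext messages that report null ID, trace, or tenant values.
# Args = log files to search.
count_audit_nulls() {
    cat -- "$@" 2>/dev/null |
        grep -cE 'AuditContext.*(Trace:null|Tenant:null|is null)'
}

# List core.java* files in a directory, largest first (default: /storage/core).
# ASSUMPTION: the heap dumps are named core.java*, as the article title suggests.
list_java_cores() {
    local dir="${1:-/storage/core}"
    find "$dir" -maxdepth 1 -type f -name 'core.java*' -exec ls -lS {} + 2>/dev/null
}
```

For example, `scan_vmon vc-ws1a-broker /var/log/vmware/vmon/vmon*.log` repeats the grep shown earlier but keeps only the two result patterns listed, and `count_audit_nulls /var/log/vmware/ws1a-broker/accesscontrol-service.log` reports how many AuditContext null messages that log contains.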

 

Environment

VMware vCenter Server 8.x

 

Cause

This issue can be caused by filesystem or database corruption resulting from a storage-layer failure. The `AuditContext` null errors indicate that the underlying identity or configuration metadata has become inconsistent, preventing the Workspace ONE Access broker and associated services from initializing correctly.

 

Resolution

Because these errors indicate corruption, manual repair of individual services is often unsuccessful.

  • Restore from File-Based Backup
    • The recommended resolution is to restore the vCenter Server Appliance from a known-good file-based backup. 
    • Deploy a new VCSA of the same version and build, then restore the backup to it. See the documentation for file-based backup and restore.
    • Post-Restore Verification
      • After the restore is complete:
      • Log in to the VAMI (`https://<vCenter_FQDN>:5480`) to confirm all services are running.
      • If services fail to start because certificates expired during the downtime, use the **vCert** tool to replace the expired certificates. See [vCenter Restore Certificate Issues].
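
As a command-line complement to the VAMI check, the service state can also be inspected from the appliance shell. The sketch below is illustrative only. It assumes that `service-control --status --all` prints service names under headings such as `Running:` and `Stopped:`, and the helper names `stopped_services` and `none_stopped` are hypothetical.

```shell
# Sketch: parse service status text read from stdin.
# ASSUMPTION: input has headings like "Running:" and "Stopped:", each followed
# by lines of whitespace-separated service names.

# Print one stopped service name per line.
stopped_services() {
    awk '
        /^[A-Za-z]+:[[:space:]]*$/ { section = $1; next }   # a heading line
        section == "Stopped:" { for (i = 1; i <= NF; i++) print $i }
    '
}

# Return 0 if none of the named services are in the stopped list; otherwise
# print each one that is still stopped and return 1.
none_stopped() {
    local stopped svc rc
    stopped="$(stopped_services)"
    rc=0
    for svc in "$@"; do
        if printf '%s\n' "$stopped" | grep -qxF -- "$svc"; then
            echo "still stopped: $svc"
            rc=1
        fi
    done
    return "$rc"
}
```

A possible invocation is `service-control --status --all | none_stopped vc-ws1a-broker`. The names passed in must match the way the status output spells them, which may differ from the names used elsewhere in this article.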