domsrvr crashes after Windows patches and updates

Article ID: 387432

Products

CA Service Desk Manager, ServiceDesk, CA Service Management - Service Desk Manager

Issue/Introduction

After Windows patches and updates were applied over the weekend, the Service Point server became inaccessible.

The stdlog shows many repetitions of the following messages, ending with a SIGSEGV in domsrvr:

<date> <time> <servername> domsrvr                 4464 SIGNIFICANT  factory.c             1599 Factory acc_lvls has no last mod date attribute
<date> <time> <servername> pdm_rfbroker_nxd        2172 SIGNIFICANT  rfbroker.c            1599 Sent singleton broadcast message to node :<servername> with Slump ID : ##### Singleton msg type : 1 Singleton processName : pdm_tomcat-<servername> Singleton slumpid : ##### Singleton slumpHost: <servername>
<date> <time> <servername> domsrvr                 4464 SIGNIFICANT  factory.c             1599 Factory true_false has no last mod date attribute
<date> <time> <servername> pdm_rfbroker_nxd        2172 SIGNIFICANT  rfbroker.c            1599 Sent singleton broadcast message to node :<servername> with Slump ID : ##### Singleton msg type : 1 Singleton processName : pdm_tomcat-<servername> Singleton slumpid : ##### Singleton slumpHost: <servername>
<date> <time> <servername> domsrvr                 4464 SIGNIFICANT  factory.c             1599 Factory evtdlytp has no last mod date attribute
<date> <time> <servername> pdm_rfbroker_nxd        2172 SIGNIFICANT  rfbroker.c            1599 Sent singleton broadcast message to node :<servername> with Slump ID : ##### Singleton msg type : 1 Singleton processName : pdm_tomcat-<servername> Singleton slumpid : ##### Singleton slumpHost: <servername>
<date> <time> <servername> domsrvr                 4464 SIGNIFICANT  factory.c             1599 Factory crt has no last mod date attribute
<date> <time> <servername> pdm_rfbroker_nxd        2172 SIGNIFICANT  rfbroker.c            1599 Sent singleton broadcast message to node :<servername> with Slump ID : ##### Singleton msg type : 1 Singleton processName : pdm_tomcat-<servername> Singleton slumpid : ##### Singleton slumpHost: <servername>
<date> <time> <servername> domsrvr                 4464 SIGNIFICANT  factory.c             1599 Factory quick_tpl_types has no last mod date attribute
<date> <time> <servername> domsrvr                 4464 SEVERE_ERROR miscos.c               222 Signal SIGSEGV received - Exiting!
<date> <time> <servername> pdm_tomcat              4296 SIGNIFICANT  pdm_tomcat.c          1041 SERVICEDESK Tomcat was started on Wed Jan 29 10:23:16 PST 2025
<date> <time> <servername> slump_nxd               7028 ERROR        server.c              3785 Unable to handle fast-channel request from #####|pdmweb:<servername>#-#########################|PDMWEB to #####|web:local|slump_cbf; reason: Process web:local has not enabled fastchannel yet
<date> <time> <servername> pdm_text_nxd            7552 ERROR        pdm_text_nxd.c         770 Can't connect to server 'domsrvr' -  will retry in 30 seconds
<date> <time> <servername> bpnotify_nxd            7408 ERROR        bpnotify.c            1454 Unable to connect to domsrvr.  Retry in 30 seconds
<date> <time> <servername> rep_daemon             11560 ERROR        main.c                 216 Can't connect to domsrvr.  Will retry in 30 seconds.
<date> <time> <servername> kt_daemon               4756 ERROR        main.c                 187 Can't connect to domsrvr.  Will retry in 30 seconds.
<date> <time> <servername> pdm_rest_nxd            5224 ERROR        create_rest_proc.c     215 Can't connect to domsrvr.  Will retry in 30 seconds.
<date> <time> <servername> kpi_sys_daemon          4868 SIGNIFICANT  main.c                 256 Can't connect to domsrvr. Will retry in 20 seconds.

The stdlog shows that domsrvr crashes in this way and restarts every 60 to 90 seconds.
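To confirm how often domsrvr is crashing, the stdlog can be scanned for the SIGSEGV message shown above. The following is a minimal sketch, assuming each stdlog line has the whitespace-separated layout seen in the excerpt (date, time, server name, process, PID, level, source file, source line, message). The function names are illustrative and are not part of SDM.

```python
from typing import Iterable, NamedTuple, Optional


class StdlogEntry(NamedTuple):
    process: str
    pid: int
    level: str
    message: str


def parse_stdlog_line(line: str) -> Optional[StdlogEntry]:
    """Split one stdlog line into fields; return None if it does not fit the layout."""
    # Assumed layout: date time server process pid level file line message...
    parts = line.split(None, 8)
    if len(parts) < 9 or not parts[4].isdigit():
        return None
    return StdlogEntry(parts[3], int(parts[4]), parts[5], parts[8].rstrip())


def count_domsrvr_crashes(lines: Iterable[str]) -> int:
    """Count lines where the domsrvr process itself reports a SIGSEGV exit."""
    count = 0
    for line in lines:
        entry = parse_stdlog_line(line)
        if (
            entry is not None
            and entry.process == "domsrvr"
            and "Signal SIGSEGV received" in entry.message
        ):
            count += 1
    return count
```

Passing the lines of a stdlog file to `count_domsrvr_crashes` gives the number of crashes recorded in that file. Messages from other processes that merely mention domsrvr, such as the "Can't connect to domsrvr" retries, are not counted.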

Environment

SDM 17.4 RU3

Windows Server 2019 Datacenter Edition

Cause

Following the instructions in the article Working with Broadcom Support to troubleshoot a "Crashing" or "Hanging" CA Service Desk Manager Process, procdump.exe was used to collect a crash dump of the domsrvr process.
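The exact ProcDump command used in this case is not stated. As a rough sketch, a command line for capturing a full dump on an unhandled exception could be assembled as below. It assumes the Sysinternals ProcDump switches -accepteula, -e (dump on unhandled exception), -ma (full memory dump) and -w (wait for the process to launch). The helper name, the dump folder and the image name domsrvr.exe are assumptions for illustration; follow the referenced article for the command Broadcom Support actually requests.

```python
from typing import List, Union


def build_procdump_command(
    target: Union[int, str],
    dump_dir: str,
    wait_for_launch: bool = False,
) -> List[str]:
    """Build (but do not run) a ProcDump argument list for a crash dump.

    target is a PID or a process image name; dump_dir is the output folder.
    """
    cmd = ["procdump.exe", "-accepteula", "-e", "-ma"]
    if wait_for_launch:
        # Useful when the process keeps restarting and its PID changes.
        cmd.append("-w")
    cmd += [str(target), dump_dir]
    return cmd
```

Because domsrvr restarts every 60 to 90 seconds here, targeting the process by name with `wait_for_launch=True` may be more practical than chasing a PID that keeps changing.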

The exception found in the crash dump is:

Unhandled exception thrown: read access violation.
**p_domattr** was 0xFFFFFFFFFFFFFEEF. occurred

Exception code: 0xC0000005

Exception code 0xC0000005 on Windows is an access violation: the process tried to read from or write to a memory address that it is not permitted to access. Possible causes include a defect in the program itself, corrupted system files, and faulty RAM. In this dump, the failing read went through the pointer p_domattr, whose value 0xFFFFFFFFFFFFFEEF is not a valid user-mode address.
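The two values from the crash dump can be unpacked with a little arithmetic. This sketch assumes the standard NTSTATUS bit layout (severity in bits 31-30, customer flag in bit 29, facility in bits 27-16, code in bits 15-0). The helper names are illustrative only.

```python
from typing import Dict


def decode_ntstatus(status: int) -> Dict[str, int]:
    """Split a 32-bit NTSTATUS value into its documented bit fields."""
    return {
        "severity": (status >> 30) & 0x3,   # 3 means "error"
        "customer": (status >> 29) & 0x1,   # 0 means Microsoft-defined
        "facility": (status >> 16) & 0xFFF,
        "code": status & 0xFFFF,
    }


def as_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit value as a signed 64-bit integer."""
    return value - (1 << 64) if value & (1 << 63) else value


fields = decode_ntstatus(0xC0000005)
pointer_as_signed = as_signed64(0xFFFFFFFFFFFFFEEF)
```

Here `fields` has severity 3 (error), facility 0 and code 5, and `pointer_as_signed` is -273. A pointer value that is a small negative number is consistent with an invalid or corrupted pointer rather than a real object address, although the dump alone does not identify what corrupted it.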

Resolution

The following steps were recommended:

  • Delete the SDM shared memory content from NX_ROOT\site\shm\
  • Run "sfc /scannow", then restart the service or restart the server.
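The steps above could be scripted roughly as follows. This is a sketch under stated assumptions: NX_ROOT is taken from an environment variable of the same name, only regular files directly under site\shm are removed, and sfc is run from an elevated session. The function names are hypothetical, and nothing is deleted unless `dry_run=False` is passed.

```python
import os
import subprocess
from pathlib import Path
from typing import List


def shm_dir_from_env() -> Path:
    """Return NX_ROOT\\site\\shm, assuming the NX_ROOT environment variable is set."""
    return Path(os.environ["NX_ROOT"]) / "site" / "shm"


def clear_shared_memory(shm_dir: Path, dry_run: bool = True) -> List[Path]:
    """List the regular files in shm_dir and, unless dry_run, delete them.

    Subdirectories are left untouched. Returns the files that were
    (or would be) removed, sorted by path.
    """
    targets = sorted(p for p in shm_dir.iterdir() if p.is_file())
    if not dry_run:
        for path in targets:
            path.unlink()
    return targets


def run_system_file_checker() -> int:
    """Run "sfc /scannow" and return its exit code (needs an elevated prompt)."""
    return subprocess.run(["sfc", "/scannow"], check=False).returncode
```

A cautious sequence is to call `clear_shared_memory(shm_dir_from_env())` first to review what would be deleted, then repeat with `dry_run=False`, run `run_system_file_checker()`, and finally restart the service or the server as described above.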

After these steps were completed, no further problems were reported.

Additional Information

A Windows update pushed to the server appears to have caused this problem. Not every Windows update causes this kind of failure, but it is good practice to back up your servers before applying patches and to test them afterward to ensure that no conflicts or problems were introduced.

See these articles for more information:

Working with Broadcom Support to troubleshoot a "Crashing" or "Hanging" CA Service Desk Manager Process

Using ProcDump Quick Guide