DE53884 - System hang after many messages such as "Unable to resend broadcast message", "Slump has disconnected", and "out-of-sequence node broadcast"


Article ID: 192731


Updated On:


CA Service Management - Service Desk Manager


In an Advanced Availability environment with multiple Application servers, processes randomly become unable to communicate between servers, and the stdlog file shows the error "UNABLE TO RESEND BROADCAST MESSAGE". When the number of these messages grows too high, the system hangs and becomes unresponsive.

Steps to Reproduce:

1.  Start a script that creates 100 tickets on the BG server.

2.  Connect to the BG server. In this test the BG server is a VM hosted on an ESX server, so connect to the ESX server's vSphere Web Client, uncheck the "Connected" checkbox to disconnect slump, and click OK.


   The following message appears in the stdlog on the BG server:

      05/07 23:43:04.76 sdmAA-BGSB1  slump_nxd            8496 SIGNIFICANT  list.c                 553 Node( Slump ID(581) has disconnected


3. Wait 1 or 2 seconds, then check the "Connected" checkbox again so that slump reconnects.


   The BG server stdlog may then show:


      05/07 23:43:04.95 sdmAA-ews24  slump_nxd            8496 SIGNIFICANT  list.c                 506 Node( Slump ID(581) has connected


4. Let the script continue creating tickets on the BG server.


   Messages like the following may appear in the stdlog on the BG server:


      05/07 23:43:07.73 sdmAA-BGSB1  slump_nxd            8496 TRACE        server.c              5412 Resent broadcast message 443 to node sdmAA-APP1
      05/07 23:43:07.73 sdmAA-BGSB1  slump_nxd            8496 TRACE        server.c              5412 Resent broadcast message 444 to node sdmAA-APP1
      05/07 23:43:07.75 sdmAA-BGSB1  slump_nxd            8496 TRACE        server.c              5412 Resent broadcast message 445 to node sdmAA-APP1


   Wireshark capture at the time of the resend:


      15815    2020-05-07 23:43:07.732326    TCP    1628    54132 → 2101 [PSH, ACK] Seq=2533 Ack=865 Win=2101504 Len=1574
      15818    2020-05-07 23:43:07.735310    TCP    60    2101 → 54132 [ACK] Seq=865 Ack=4107 Win=2102272 Len=0


   The stdlog on the App server shows the following messages:


      05/07 23:43:08.12 sdmAA-APP1  slump_nxd            5928 TRACE        server.c              3323 Received node broadcast message from 579|prov#6516_bpvirtdb_srvr to *|*|cr_status_trans_history::DB_CHANGE
      05/07 23:43:08.12 sdmAA-APP1  slump_nxd            5928 WARNING      list.c                 586 Received out-of-sequence node broadcast 508 from - previous sequence was 442


After this point, the App server rejects every broadcast message that the BG server sends, logging a "Received out-of-sequence" error for each one.


The Wireshark traces show that the BG server is sending the DB_CHANGE messages and that the same messages are arriving at the App server.
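
When checking whether an environment shows this pattern, the stdlog messages quoted in the steps above can be tallied with a small script. The following Python sketch is only an illustration: the regular expressions are derived solely from the sample lines in this article, and the names summarize_stdlog and SAMPLE_LINES are hypothetical, not part of any CA SDM tooling.

```python
import re
from collections import Counter

# Patterns inferred only from the sample stdlog lines in this article;
# real logs may contain variants these do not cover.
PATTERNS = {
    "disconnected": re.compile(r"Slump ID\((\d+)\) has disconnected"),
    "connected": re.compile(r"Slump ID\((\d+)\) has connected"),
    "resent": re.compile(r"Resent broadcast message (\d+) to node (\S+)"),
    "out_of_sequence": re.compile(
        r"Received out-of-sequence node broadcast (\d+) from "
        r".*?previous sequence was (\d+)"
    ),
}


def summarize_stdlog(lines):
    """Count the defect-related messages and collect sequence gaps.

    Returns (counts, gaps), where gaps is a list of
    (previous_sequence, received_sequence) pairs.
    """
    counts = Counter()
    gaps = []
    for line in lines:
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match is None:
                continue
            counts[name] += 1
            if name == "out_of_sequence":
                received = int(match.group(1))
                previous = int(match.group(2))
                gaps.append((previous, received))
            break  # at most one category per log line
    return counts, gaps


# Sample lines copied from the log excerpts in this article.
SAMPLE_LINES = [
    "05/07 23:43:04.76 sdmAA-BGSB1  slump_nxd            8496 SIGNIFICANT  list.c                 553 Node( Slump ID(581) has disconnected",
    "05/07 23:43:04.95 sdmAA-ews24  slump_nxd            8496 SIGNIFICANT  list.c                 506 Node( Slump ID(581) has connected",
    "05/07 23:43:07.73 sdmAA-BGSB1  slump_nxd            8496 TRACE        server.c              5412 Resent broadcast message 443 to node sdmAA-APP1",
    "05/07 23:43:07.73 sdmAA-BGSB1  slump_nxd            8496 TRACE        server.c              5412 Resent broadcast message 444 to node sdmAA-APP1",
    "05/07 23:43:07.75 sdmAA-BGSB1  slump_nxd            8496 TRACE        server.c              5412 Resent broadcast message 445 to node sdmAA-APP1",
    "05/07 23:43:08.12 sdmAA-APP1  slump_nxd            5928 WARNING      list.c                 586 Received out-of-sequence node broadcast 508 from - previous sequence was 442",
]

if __name__ == "__main__":
    counts, gaps = summarize_stdlog(SAMPLE_LINES)
    for name, count in counts.items():
        print(f"{name}: {count}")
    for previous, received in gaps:
        print(f"gap: previous={previous} received={received}")
```

To scan a real log, pass the open file object instead of SAMPLE_LINES. A steadily growing out_of_sequence count after a disconnect and reconnect pair would match the behavior described here.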


Expected result:


   When "Resent broadcast message" is logged on the BG server, the App server should log a message similar to the following:


      Received resent broadcast message "<< *(dg.pSequenceNumber) << " from node " << pNode->get_node_host() << "; resetting broadcast sequence".
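
The difference between the observed and the expected behavior can be illustrated with a toy model. The Python sketch below is not the slump_nxd implementation. It assumes, purely for illustration, that a receiver accepts only the next sequence number from a sender and that a resent message carries a flag the receiver can see. The class name BroadcastReceiver and the reset_on_resend switch are hypothetical.

```python
class BroadcastReceiver:
    """Toy model of a node tracking one sender's broadcast sequence.

    Assumption (not confirmed by the product source): a message is
    accepted only when its number is exactly last_sequence + 1.
    """

    def __init__(self, last_sequence, reset_on_resend):
        self.last_sequence = last_sequence
        self.reset_on_resend = reset_on_resend
        self.log = []

    def receive(self, sequence, resent=False):
        """Return True if the message is accepted, False if rejected."""
        if resent and self.reset_on_resend:
            # Expected behavior: adopt the sender's sequence number.
            self.log.append(
                f"Received resent broadcast message {sequence}; "
                "resetting broadcast sequence"
            )
            self.last_sequence = sequence
            return True
        if sequence == self.last_sequence + 1:
            self.last_sequence = sequence
            return True
        # Observed behavior: the gap never closes, so every later
        # message is rejected as well.
        self.log.append(
            f"Received out-of-sequence node broadcast {sequence} "
            f"- previous sequence was {self.last_sequence}"
        )
        return False


def replay(reset_on_resend):
    """Replay a gap like the one in the logs: last seen 442, next seen 508."""
    receiver = BroadcastReceiver(last_sequence=442,
                                 reset_on_resend=reset_on_resend)
    results = [receiver.receive(508, resent=True)]
    results += [receiver.receive(n) for n in (509, 510)]
    return results, receiver.log


if __name__ == "__main__":
    for mode in (False, True):
        results, log = replay(mode)
        print(f"reset_on_resend={mode}: accepted={results}")
        for entry in log:
            print("  " + entry)
```

In this model, without the reset the receiver stays at 442 and rejects 508, 509, and 510. With the reset it accepts all three, which corresponds to the expected result described above.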

Test Environment:

CA Service Desk Manager 17.1 - Rollup patches up to

Advanced Availability Configuration: Background Server (sdmAA-BGSB1) and more than 2 Application Servers, one of which is sdmAA-APP1

O.S.: Windows Server 2016

Database: Microsoft SQL Server 2014




Product defect DE53884.


Release : 17.1 up through at least, 17.2 up through at least, 17.3 GA at least.



Debug test patch T56H126 (Linux) exists for resolving defect DE53884 on CA SDM 17.1 RU4.

The fix is to be integrated into future versions as soon as possible. Search the product documentation for DE53884 to determine which rollup patch includes the fix.