Description:
Depending on the workload, the process on the Domain Manager that receives feedback from the Scalability Servers may not be able to process incoming messages as fast as they arrive.
For example, when a full inventory of SM installed packages is requested, the Installation Manager (sdmgr_im) receives the installed-package inventory from the Sdserver process on every Scalability Server. These messages are enqueued, and the Installation Manager processes them sequentially.
The Job Status is not updated because the job status messages are enqueued in the same Installation Manager message queue and must wait behind the messages already pending there.
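The effect of a single sequential queue can be illustrated with a minimal sketch. It is not the Installation Manager's actual code; the message names and the backlog size are hypothetical and only show why a job status message waits behind earlier inventory messages.

```python
from collections import deque


def messages_before_status(queue_contents):
    """Count how many messages a single sequential consumer handles
    before it reaches the first job status message.

    Returns None if the queue holds no job status message.
    """
    queue = deque(queue_contents)
    handled = 0
    while queue:
        kind = queue.popleft()  # strict first-in, first-out processing
        if kind == "job_status":
            return handled
        handled += 1
    return None


# Hypothetical backlog: 3000 inventory messages arrived first,
# then one job status message.
backlog = ["inventory"] * 3000 + ["job_status"]
print(messages_before_status(backlog))  # 3000
```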
Solution:
To allow faster Job Status updates, a new Priority queue was introduced in DSM 11.2 C2. It must be activated on every Scalability Server by running the following command:
ccnfcmda -cmd SetParameterValue -ps /itrm/usd/server -pn AllowUseNew_SD_IM_Queue -v 1
This command instructs the Sdserver process to send all Job Status messages to the Priority queue.
On the Domain Manager, a new parameter was introduced to balance the Installation Manager's activity by setting how many Priority queue messages it processes in every cycle. It can be customized with the following command:
ccnfcmda -cmd SetParameterValue -ps /itrm/usd/Manager -pn SDIM_PriorityMsgPerCycle -v 50
By default, the Installation Manager processes one message from the Normal queue and two from the Priority queue in every cycle. With the setting in the example above, it processes one message from the Normal queue and then 50 from the Priority queue in every cycle, so pending Job Status messages are updated faster.
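The cycle described above can be sketched as a small simulation. This is an illustration under stated assumptions, not the product's implementation: the function name and the queue sizes are hypothetical, and only the "one Normal message, then N Priority messages per cycle" rule comes from the text.

```python
from collections import deque


def cycles_to_drain_priority(normal_count, priority_count, priority_per_cycle):
    """Return how many cycles pass until the Priority queue is empty.

    Each cycle takes one message from the Normal queue (if any), then up
    to `priority_per_cycle` messages from the Priority queue.
    """
    normal = deque(["inventory"] * normal_count)
    priority = deque(["job_status"] * priority_count)
    cycles = 0
    while priority:
        cycles += 1
        if normal:
            normal.popleft()  # one Normal message per cycle
        for _ in range(priority_per_cycle):
            if not priority:
                break
            priority.popleft()  # up to N Priority messages per cycle
    return cycles


# Hypothetical load: 1000 inventory messages and 100 job status messages.
print(cycles_to_drain_priority(1000, 100, 2))   # 50
print(cycles_to_drain_priority(1000, 100, 50))  # 2
```

In this sketch, raising the per-cycle value from 2 to 50 means the 100 pending job status messages are handled after 2 Normal messages instead of 50. The trade-off is that Normal queue messages progress more slowly while the Priority queue is busy.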
Another important aspect to check is the 'Stagecheck' process. In USD 4.0, Stagecheck was the main communication process between Staging Servers and Local Servers, running every 30 minutes by default. In CMS 11.2, communication is done via internal messaging. Stagecheck still runs every 30 minutes, but only as a fallback for messages that could not be sent to the Domain Manager because of communication errors. When Sdserver on the Scalability Server cannot send a message to the Domain Manager, it saves the message on disk until the next Stagecheck runs.
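The fallback behaviour described above follows a general store-and-forward pattern, which can be sketched as follows. This is only an illustration: the class name, the JSON spool files, and the `send` callable are assumptions made for the example and do not reflect how Sdserver actually stores or resends messages.

```python
import json
import tempfile
from pathlib import Path


class FallbackSender:
    """Send messages directly; spool them to disk when sending fails."""

    def __init__(self, send, spool_dir):
        self.send = send  # callable that raises ConnectionError on failure
        self.spool_dir = Path(spool_dir)
        self.spool_dir.mkdir(parents=True, exist_ok=True)
        self._counter = 0

    def deliver(self, message):
        """Try the direct path first; on failure keep the message on disk."""
        try:
            self.send(message)
            return True
        except ConnectionError:
            self._counter += 1
            path = self.spool_dir / f"{self._counter:08d}.json"
            path.write_text(json.dumps(message))
            return False

    def stagecheck(self):
        """Periodic fallback run: resend spooled messages in order."""
        resent = 0
        for path in sorted(self.spool_dir.glob("*.json")):
            try:
                self.send(json.loads(path.read_text()))
            except ConnectionError:
                break  # still unreachable; retry on the next run
            path.unlink()
            resent += 1
        return resent


if __name__ == "__main__":
    link_up = {"value": False}
    received = []

    def send(message):
        if not link_up["value"]:
            raise ConnectionError("Domain Manager unreachable")
        received.append(message)

    with tempfile.TemporaryDirectory() as spool:
        sender = FallbackSender(send, spool)
        sender.deliver({"job": 1})   # fails, message is spooled
        link_up["value"] = True
        print(sender.stagecheck())   # 1
```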
Because Stagecheck is now a fallback procedure, it is recommended to keep the default configuration parameters in place and not to change the Stagecheck Mode to 'True'. Setting it to 'True' stops the internal messaging communication, and information is then sent to the Domain Manager only by Stagecheck, every 30 minutes. The two configuration policies involved can be found under DSM > software delivery > Scalability Server and are:
Note: This document is applicable only to versions 11.2 C2, 11.2 C3 and 11.2 SP4. These settings are enabled by default in version 12.0.