OPS/MVS: Two rules issuing OPSOSF commands at the same time cause a false alert

Article ID: 197943

Products

CA OPS/MVS Event Management & Automation

Issue/Introduction

We have two separate TOD rules running at the same time, each calling a different OPS/REXX routine. One routine issues the command "D C"; the other issues "D M=CPU". Somehow the queued output from the "D C" command is being pulled in by the "D M=CPU" routine, which then parses invalid data and issues a false alert, even though the two routines are separate and run under separate ASIDs. We saw this on four LPARs in a plex. Refreshing the TOD.HCHKCPU rule allowed everything to work normally again on three of the LPARs. We left one LPAR alone to help determine the root cause. Below is a short excerpt from the log.

20AUG 19:06:00 007D OPS3724O TSO TOD.HCHKCPU Sent CMD=OI HCHKCPU
20AUG 19:06:00 007A OPS3724O TSO TOD.CHKCONS Sent CMD=OI CHKCONS

20AUG 19:06:00 007D OPS3092J OI HCHKCPU
20AUG 19:06:00 007A OPS3092J OI CHKCONS

20AUG 19:06:00 007A OPS1181T OPSOSF   OPSS (*Local*) MVS N/A CHKCONS D C
20AUG 19:06:00 007A D C                                                 
20AUG 19:06:00 007A OPS3092J READY
20AUG 19:06:00 007D OPS1181T OPSOSF   OPSS (*Local*) MVS N/A MVSCMD DISPLAY M=CPU
20AUG 19:06:00 007D DISPLAY M=CPU       
20AUG 19:06:00 007D OPS1370T OPSOSF   X'0000' X'8000' X'0400' NONE  300 xxxxxxxx   CPU online count is below the threshold of 3,
20AUG 19:06:00 007D VSA1501W CPU online count is below the threshold of 3,                                                      
20AUG 19:06:00 0001 IEE174I 19.06.00 DISPLAY M 081                                                                      
20AUG 19:06:00 0001 CORE STATUS: HD=Y   MT=2  MT_MODE: CP=1  zIIP=2                                                     
20AUG 19:06:00 0001 ID    ST   ID RANGE   VP  ISCM  CPU THREAD STATUS                                                   
20AUG 19:06:00 0001 0000   +   0000-0001  M   FC00  +N                                                                  
20AUG 19:06:00 0001 0001   +   0002-0003  L   0000  +N                                                                  
20AUG 19:06:00 0001 0002   +I  0004-0005  M   0200  ++                                                                  
20AUG 19:06:00 0001 0003   -   0006-0007                                                                                
20AUG 19:06:00 0001                                                                                                     
20AUG 19:06:00 0001 CPC ND = 008561.T01.IBM.02.0000000xxxxx                                                             
20AUG 19:06:00 0001 CPC SI = 8561.512.IBM.02.00000000000xxxxx                                                          
20AUG 19:06:00 007D GLVJOBID.ALLTEXT.xxxxxxxx EC.ECWTOFWD GLVJOBID.ALLTEXT.xxxxxxxx                                     
20AUG 19:06:00 007D GLVJOBID.ALLTEXT.xxxxxxxx EC.ECWTOFWD  xxxxxxxx CPU online count is below the threshold of 3,       
20AUG 19:06:00 0001          Model: T01                                                                                 
20AUG 19:06:00 0001 CPC ID = 00                                                                                         
20AUG 19:06:00 0001 CPC NAME = CCPUBZ15                                                                                 
20AUG 19:06:00 0001 LP NAME = LBN1       LP ID =  5                                                                     
20AUG 19:06:00 0001 CSS ID  = 0                                                                                         
20AUG 19:06:00 007D please check CPU(s)  on LBN1.                                                                       
20AUG 19:06:00 007D GLVJOBID.ALLTEXT.2E256A70 EC.ECWTOFWD xxxxxxxx CPU online count is below the threshold of 3,  xxxxxx
20AUG 19:06:00 0001 MIF ID  = 5                                                                                         
20AUG 19:06:00 0001                                                                                                     

Environment

Release : 13.5

Component : OPS/MVS

Resolution

The problem with the CPU and console checking REXX execs is that both use the same console and both commands were issued at the same time, so the output could be mixed together. This has always been the case when issuing host commands to the same console. In addition, the two MVS commands involved here do not place the message ID in each response message, so the response text cannot be reliably attributed to the command that produced it.
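To make the failure mode concrete, here is a minimal sketch in Python (not OPS/REXX) of a parser that refuses to evaluate a response unless it starts with the IEE174I header seen in the log above. The function name count_online_cores, the regular expressions, and the placeholder line standing in for another command's output are all hypothetical. The sketch also assumes that a core status beginning with "+" means the core is online. It is only an illustration of why parsing mixed output leads to a false alert, not the logic of the actual HCHKCPU exec.

```python
import re

# Header of the D M=CPU response as it appears in the log excerpt.
HEADER = re.compile(r"^IEE174I\s+\d{2}\.\d{2}\.\d{2}\s+DISPLAY M\b")
# A core row: 4-digit hex ID, a status field, then an ID range such as 0000-0001.
CORE_ROW = re.compile(r"^[0-9A-F]{4}\s+(\S+)\s+[0-9A-F]{4}-[0-9A-F]{4}")


def count_online_cores(lines):
    """Return the number of online cores, or None if the response is not usable.

    Assumption: a status field starting with '+' marks an online core.
    """
    # If the first line is not the expected header, the captured output
    # probably belongs (at least partly) to another command: do not parse it.
    if not lines or not HEADER.match(lines[0]):
        return None
    online = 0
    for line in lines[1:]:
        row = CORE_ROW.match(line)
        if row and row.group(1).startswith("+"):
            online += 1
    return online


if __name__ == "__main__":
    clean = [
        "IEE174I 19.06.00 DISPLAY M 081",
        "CORE STATUS: HD=Y   MT=2  MT_MODE: CP=1  zIIP=2",
        "ID    ST   ID RANGE   VP  ISCM  CPU THREAD STATUS",
        "0000   +   0000-0001  M   FC00  +N",
        "0001   +   0002-0003  L   0000  +N",
        "0002   +I  0004-0005  M   0200  ++",
        "0003   -   0006-0007",
    ]
    # Hypothetical placeholder for a line that came from a different command.
    mixed = ["(line from another command's response)"] + clean

    print(count_online_cores(clean))
    print(count_online_cores(mixed))
```

A caller would raise the alert only when the function returns a number below the threshold, and would retry or skip the check when it returns None. A header check like this only reduces the chance of acting on foreign output; it does not prevent the two responses from being mixed.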

The most efficient approach is to use an OPS/REXX function to get the needed information, because this method does not use a console to collect it. OPS/MVS provides the OPSCPU() and OPSINFO() functions, which can query the items that the 'D M=CPU' command reports. There is no equivalent OPS/REXX function for detailed console information.

Another option is to issue one of the commands to a different OPS/MVS allocated console. In this case, it is recommended to stagger one of the two TOD rules by, for instance, 00:02:25. Because both rules call REXX execs, execution can also be delayed depending on the OSF queue of recently called but not yet executed execs/commands.