A customer asked if it is possible to use VM:Operator to monitor service virtual machines for errors like program interrupt loops that cause the server to drop into CP.
CP indicates that it has detected a program interrupt loop by displaying a message on the console of the affected virtual machine. If the virtual machine is disconnected, a human will likely not see the message, and CP will eventually log off the virtual machine without producing a dump.
There are two ways to direct console messages to another virtual machine: use either a secondary console (commonly called secuser) or an observed console. Observer messages look like secuser messages, but being an observer does not automatically enable you to issue commands to the observed virtual machine, as being a secuser does. Also, secuser messages are delivered to secondary consoles only when the primary console is disconnected, whereas observer messages are always delivered. Secuser and observer mode are mutually exclusive: an observed virtual machine cannot also have a secondary console.
While observer messages look very much like secuser messages when displayed on a console, they are delivered to VM:Operator in a different *MSG IUCV message class. Secuser messages are class 8 but observer messages are class 3, which is also the message class used for asynchronous CP messages that are directed to the system operator. This has the potential to confuse existing automation, so we do not allow VM:Operator Observer Mode to be used on the system operator user ID. Fortunately, that is not a serious restriction because you can run VM:Operator on other user IDs in addition to the system operator.
VM:Operator Observer Mode was introduced in Release 3.1 in order to support recording of all Linux virtual machine console messages in a syslogd application. It is delivered by means of solution RO64429 (along with informational solution RI64791). Observer Mode can also be used for other things in addition to its original purpose. An example of another use case is described later in this paper.
To use Observer Mode with VM:Operator r3.1, you must first install RO64429 and then deploy a new VM:Operator user ID. Finally, in the VMSERVER entry for that user ID, change the VMYSYS startup command by adding the OBSERVER parameter. For example, you would typically change VMYSYS to VMYSYS OBSERVER.
When running in Observer Mode, VM:Operator will detect observer messages and process them just as it has always processed secuser messages. When VM:Operator receives observed messages through the *MSG system service, they are edited to remove the observed user ID from the message text. That user ID is then used as the origin user ID for logging and routing purposes. Also, if the message happens to be a segmented message (that is, it contains X’15’ new line characters), VM:Operator will split the message into individual message lines, which are logged and routed separately. Segmented messages are not normally seen from CMS user IDs, but other guest operating systems like Linux or TPF routinely write multiple, unrelated messages to the console in a single I/O operation.
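To make the editing and splitting described above concrete, here is a minimal REXX sketch. It is not VM:Operator's actual implementation. It assumes that the message text begins with the observed user ID followed by a colon, which may not match the real *MSG buffer layout, and the sample user ID LINUX01 and message text are made up for illustration.

```rexx
/* Sketch only: split a segmented observer message into lines.      */
/* Assumption: the text starts with the observed user ID and a      */
/* colon. The real *MSG buffer layout may differ.                   */
nl  = '15'x                          /* X'15' new line character    */
raw = 'LINUX01 : first message' || nl || 'second message' /* sample */

parse var raw origin ':' text        /* user ID, then the remainder */
origin = strip(origin)               /* origin ID for log and route */

do while text \== ''
  parse var text line (nl) text      /* peel off one segment        */
  if line \== '' then                /* ignore empty segments       */
    say origin':' strip(line)        /* one logged line per segment */
end
```

Run as written, the sketch writes one line per segment, each tagged with the origin user ID taken from the front of the message.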
Once you have a VM:Operator server set up to run in Observer Mode, you will want to select some user IDs to be observed. As noted above, these user IDs cannot also have secondary consoles. While you can use the SET OBSERVER command on the selected user IDs to specify the observer user ID, we have found that it is best to specify the observer user ID in the directory entry of the observed user ID because that way, the messages normally displayed on the console at logon time before CMS is IPLed will be observed. That information can sometimes be very valuable when diagnosing problems. For example, if a particular disk is required then you’ll know if CP cannot attach it at logon time for some reason. Specifying an observer user ID in the directory is done with the CONSOLE statement. Check the IBM z/VM CP Planning and Administration manual for the exact syntax.
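As a rough illustration of the directory approach described above, the CONSOLE statement in the observed user ID's directory entry might look like the line below. Treat this as a sketch rather than authoritative syntax: VMOPOBS is a hypothetical name for the VM:Operator Observer Mode user ID, and the virtual device number, device type and spool class shown are only common choices. Verify the operands against the IBM z/VM CP Planning and Administration manual for your release.

```text
CONSOLE 0009 3215 T VMOPOBS OBSERVER
```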
Now that you have a user ID being observed by a VM:Operator virtual machine running in Observer Mode, you can make use of it in several ways. For example, all observed messages are now recorded in the system log, so the VM:Operator REVIEW window can be configured to display messages from a specific virtual machine or from all virtual machines. You can use these SYSLOG files to record service virtual machine messages for posterity. Because the continuity of this log is not affected by closing the server’s spooled console, the record is easier to find and less likely to be misplaced.
You can also set up LOGTABLE entries to detect and react to specific messages. For example, if you wanted to catch and react to the program interrupt loop message, HCPGIR453W, all you have to do is add an entry to the LOGTABLE file as follows:
SPAWN RECOVER MSG * *3 1 HCPGIR453W
Then, whenever any observed virtual machine experiences a program interrupt loop, your action routine, RECOVER VMOPER, will be executed. The macro could potentially notify the operator, issue the VMDUMP command and then restart the user ID by issuing the FORCE and XAUTOLOG commands.
Here’s an example of a possible RECOVER VMOPER macro:
/* VMDUMP and then restart the failed server */
parse arg server . message
'TEST CP MSG OP' server || ': received:' message
say 'Issuing VMDUMP to:' server
'TEST CP FOR' server 'CMD VMDUMP'
'TEST PROCESS WAIT 3'
say 'Forcing server:' server
'TEST CP FORCE' server
'TEST PROCESS WAIT 3'
say 'Restarting server:' server
'TEST CP XAUTOLOG' server
An easy way to test the macro is to issue it as a command on VM:Operator, specifying the server ID as the command parameter. Note that this macro uses the FOR command to issue the dump on the observed user ID. As specified, the dump will be found in that user ID’s reader. You should make sure that the VM:Operator user ID has the appropriate command classes to issue the FOR, FORCE and XAUTOLOG commands. If you don’t want to use the FORCE command, you could substitute 'TEST CP FOR' server 'CMD LOGOFF' instead; however, in z/VM 6.3 and earlier, this will have the effect of displaying the LOGOFF message on the observer’s logon console if it is connected.
As stated above, secuser and observer are mutually exclusive. If you have a server that you want to monitor and it is already set up to use the system operator as its secondary console, you can still make use of the RECOVER VMOPER macro listed above. Simply change the *MSG IUCV message class in the LOGTABLE entry from 3 to 8.
In conclusion, VM:Operator Observer Mode adds another tool to the already rich set of features that you can use to automate and troubleshoot the service virtual machines in your system. You may also find it provides a number of other ways to satisfy your individual needs.
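Assuming the rest of the LOGTABLE entry shown earlier stays the same, the secuser variant would presumably differ only in the message class field:

```text
SPAWN RECOVER MSG * *8 1 HCPGIR453W
```

This is derived from the class 3 entry by changing *3 to *8. Confirm the result against your own LOGTABLE conventions before relying on it.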