Troubleshooting: Spooler Errors and "unable to reset out-queue" Alarms

Article ID: 92998

Products

  • DX Unified Infrastructure Management (Nimsoft / UIM)
  • CA Unified Infrastructure Management On-Premise (Nimsoft / UIM)
  • CA Unified Infrastructure Management SaaS (Nimsoft / UIM)

Issue/Introduction

These spooler errors started appearing at around the same time on six different servers where the robot is installed:

Apr 25 06:00:20:402 [8880] spooler: QueueAdmin - expire old messages
Apr 25 06:41:08:288 [8880] spooler: rdbReset - open (q2.rdb) failed (Invalid argument)
Apr 25 06:41:08:288 [8880] spooler: FlushMessages - unable to reset out-queue
Apr 25 06:41:08:293 [8880] spooler: FlushMessages - aborting

I have checked the firewalls, file permissions, and disk space, and they all seem to be fine. I have also deleted both .rdb files. Everything has been reset or restarted at least once in an attempt to fix the problem.
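
When several robots report the problem at once, it can help to pull only the failure entries out of each spooler log. The following is a minimal sketch, assuming the log lines follow the format of the excerpt above. The names `find_queue_failures`, `SpoolerEvent`, and `FAILURE_MARKERS` are illustrative and not part of UIM.

```python
import re
from typing import Iterable, List, NamedTuple

# Pattern modeled on the log excerpt above, for example:
# "Apr 25 06:41:08:288 [8880] spooler: rdbReset - open (q2.rdb) failed (Invalid argument)"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}:\d{3}) "
    r"\[(?P<pid>\d+)\] spooler: (?P<func>\w+) - (?P<msg>.*)$"
)

# Assumption: these message fragments, taken from the excerpt, mark a queue failure.
FAILURE_MARKERS = ("failed", "unable to reset out-queue", "aborting")


class SpoolerEvent(NamedTuple):
    stamp: str
    pid: int
    func: str
    msg: str


def find_queue_failures(lines: Iterable[str]) -> List[SpoolerEvent]:
    """Return the spooler log entries whose message contains a failure marker."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a spooler line in the expected format
        msg = match["msg"]
        if any(marker in msg for marker in FAILURE_MARKERS):
            events.append(
                SpoolerEvent(match["stamp"], int(match["pid"]), match["func"], msg)
            )
    return events
```

To use it, pass the lines of a spooler log file (the file location depends on the installation) to `find_queue_failures` and compare the timestamps across the affected servers.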


Environment

  • UIM 23.4.x
  • Robot/spooler 23.4.x

Cause

  • This alarm is published when the spooler probe fails to create a new queue file (q1.rdb or q2.rdb).

Resolution

These alarms are usually caused by an environmental issue that prevents the spooler probe from accessing one of these queue files. The out-queue corresponds to the q2.rdb file. The spooler does not clear this alarm, even after the condition that originally triggered it has been resolved on the robot.

If you continue to receive alarms and QoS from this robot and the \Nimsoft\robot\q1.rdb file is not continuously growing in size, then messages are still being delivered and the intermittent problem is most likely environmental.
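
One way to judge whether q1.rdb is "continuously growing" is to sample its size at fixed intervals. The sketch below is only an illustration: the function names, the sample count, and the interval are assumptions, and the path to q1.rdb must be adjusted to the actual installation.

```python
import os
import time
from typing import Callable, List


def sample_sizes(path: str, samples: int = 5, interval_s: float = 60.0,
                 sleep: Callable[[float], None] = time.sleep) -> List[int]:
    """Record the size of the file at `path` in bytes, `samples` times.

    A missing file is recorded as size 0.
    """
    sizes = []
    for i in range(samples):
        sizes.append(os.path.getsize(path) if os.path.exists(path) else 0)
        if i < samples - 1:
            sleep(interval_s)  # wait before taking the next sample
    return sizes


def is_continuously_growing(sizes: List[int]) -> bool:
    """True only if every sample is strictly larger than the one before it."""
    return len(sizes) >= 2 and all(b > a for a, b in zip(sizes, sizes[1:]))
```

For example, `is_continuously_growing(sample_sizes(r"C:\Nimsoft\robot\q1.rdb"))` takes five samples one minute apart. The `C:\` prefix is an assumed install location. A file whose size rises and then falls back is draining normally and is not reported as growing.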

Have the system administrator of the robot host check for these common causes of the alarm:

  • The robot disk volume has no free space, or the drive is read-only
  • The robot does not have the proper permissions
  • Two spooler services are running
  • Antivirus (AV) software is scanning or locking files in the Nimsoft directory, which prevents the spooler service from functioning correctly (you must create a full exception for all UIM/Nimsoft programs on the robot)
  • There is a file system problem on the system where the robot is installed
  • Backup software is locking the files
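
Some of these causes (free space, a read-only volume, missing write permission, a locked queue file) can be probed with a short script. The following is a rough sketch using only the Python standard library. The function name, the free-space threshold, and the probe file name are assumptions. The checks run under the account that executes the script, which may differ from the account the robot service uses, and they do not detect duplicate spooler processes.

```python
import os
import shutil
from typing import Dict


def check_robot_dir(robot_dir: str,
                    min_free_bytes: int = 100 * 1024 * 1024) -> Dict[str, bool]:
    """Run basic checks on the robot directory; True means the check passed."""
    results = {"exists": os.path.isdir(robot_dir)}
    if not results["exists"]:
        return results

    # Free space on the volume that holds the robot directory.
    results["enough_free_space"] = shutil.disk_usage(robot_dir).free >= min_free_bytes

    # Write test: create and remove a small probe file. This fails on a
    # read-only volume or when the current account lacks write permission.
    probe = os.path.join(robot_dir, "write_probe.tmp")  # hypothetical file name
    try:
        with open(probe, "wb") as handle:
            handle.write(b"probe")
        os.remove(probe)
        results["writable"] = True
    except OSError:
        results["writable"] = False

    # Open each existing queue file for read/write without truncating it.
    # An OSError here may point to a lock held by AV or backup software.
    for name in ("q1.rdb", "q2.rdb"):
        path = os.path.join(robot_dir, name)
        if os.path.exists(path):
            try:
                with open(path, "r+b"):
                    results[name + "_openable"] = True
            except OSError:
                results[name + "_openable"] = False
    return results
```

A result such as `{"exists": True, "enough_free_space": True, "writable": False}` would point toward a permissions or read-only problem rather than a full disk.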

Immediate Fix: Clear Corrupted Queue Files

If the q1.rdb file is growing and no messages are being sent, follow these steps:

  1. Stop the Nimsoft Robot Watcher service.

  2. Navigate to the \Nimsoft\robot directory.

  3. Back up the q2.rdb file by copying it to a temp folder.

  4. Delete the original q2.rdb from the robot directory.

  5. Restart the Nimsoft Robot Watcher service. The spooler will automatically recreate a healthy queue file.
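
The steps above could be scripted roughly as follows. This is a sketch under several assumptions: the robot runs on Windows, the script has administrator rights, and `net stop` / `net start` accept the service name shown in the steps. The function names and the timestamped backup file name are illustrative. Moving the file covers both the backup and the delete step in a single operation.

```python
import os
import shutil
import subprocess
import time
from typing import Callable, List, Optional

SERVICE_NAME = "Nimsoft Robot Watcher"  # name as given in the steps above


def _run_checked(cmd: List[str]) -> None:
    """Run a command and raise CalledProcessError if it fails."""
    subprocess.run(cmd, check=True)


def reset_out_queue(robot_dir: str, backup_dir: str,
                    run: Callable[[List[str]], object] = _run_checked) -> Optional[str]:
    """Stop the service, move q2.rdb into backup_dir, then restart the service.

    Returns the path of the archived file, or None if q2.rdb was not present.
    """
    queue_file = os.path.join(robot_dir, "q2.rdb")
    run(["net", "stop", SERVICE_NAME])          # step 1
    archived = None
    try:
        if os.path.exists(queue_file):          # steps 2 to 4
            os.makedirs(backup_dir, exist_ok=True)
            stamp = time.strftime("%Y%m%d-%H%M%S")
            archived = os.path.join(backup_dir, "q2.rdb." + stamp)
            shutil.move(queue_file, archived)   # moving removes the original
    finally:
        run(["net", "start", SERVICE_NAME])     # step 5, even if the move failed
    return archived
```

The `run` parameter exists so the service commands can be replaced with a stub for a dry run, for example `reset_out_queue(robot_dir, backup_dir, run=print)`.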


Analysis of the archived q2.rdb may reveal corrupt data, such as illegal characters, that could have been disrupting the normal data flow.
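
A first pass over the archived file can be made by listing bytes that fall outside an expected range. The internal layout of the .rdb file is not described in this article, so the set of "legal" bytes below is purely an assumption: a healthy queue file may legitimately contain binary data, and any hits only indicate where to look more closely. The function names are illustrative.

```python
from typing import List, Tuple

# Assumption: printable ASCII plus tab, LF and CR are treated as "legal".
LEGAL = set(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}


def find_suspect_bytes(data: bytes, limit: int = 20) -> List[Tuple[int, int]]:
    """Return up to `limit` (offset, byte value) pairs for bytes outside LEGAL."""
    hits = []
    for offset, value in enumerate(data):
        if value not in LEGAL:
            hits.append((offset, value))
            if len(hits) >= limit:
                break
    return hits


def scan_file(path: str, limit: int = 20) -> List[Tuple[int, int]]:
    """Read the archived file in binary mode and report suspect bytes."""
    with open(path, "rb") as handle:
        return find_suspect_bytes(handle.read(), limit)
```

Comparing the output for the archived file with the output for a queue file from a healthy robot gives a more meaningful picture than either result alone.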

Additional Information

Additionally, robot machines that generated spooler alarms may be:

  • Unreachable via ping
  • Decommissioned (non-existent / unknown)
  • Moved to a test environment
  • Running host intrusion protection (IPS) software (e.g., installed in /opt/Symantec)

If the problem is not seen at a higher loglevel and there is no impact that the client can detect, these messages can be ignored.