commmgr service fails to start after Smarts NCM 25.4.6 upgrade due to package customization

Article ID: 439093

Updated On:

Products

VMware Smart Assurance Network Observability

Issue/Introduction

Following an upgrade of a Smarts NCM Device Server (DS) to version 25.4.6, the commmgr service (CommMgr) fails to initialize or enters a restart loop. The logs indicate a fatal error during the parsing of device operations:

Apr 29 17:22:37 2084678016#1: ===>Parsing Device Operations: /opt/smarts-ncm/XML/BasicOperations.dop
Apr 29 17:22:45 2084678016#1: Script Manager shutting down... 
Apr 29 17:22:45 2084678016#1: Fatal error...shutting down
Apr 29 17:22:45 2084678016#1: CommMgr Shutdown complete...
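To confirm which device-operations file was being parsed when the fatal error occurred, the log can be scanned for the last "Parsing Device Operations" line that precedes each "Fatal error" line. The following is a minimal sketch; the function name is hypothetical, and the log path is passed as an argument because this article does not state where commmgr.log resides.

```shell
# Print the device-operations file that was being parsed when CommMgr
# logged a fatal error.
# $1: path to a commmgr log file (location is installation-specific).
last_parsed_before_fatal() {
    awk '
        # Remember the path named on the most recent parse line.
        /Parsing Device Operations:/ {
            sub(/.*Parsing Device Operations: */, "")
            last = $0
        }
        # On a fatal error, report the file that was being parsed.
        /Fatal error/ && last != "" { print last; last = "" }
    ' "$1"
}

# Usage (path is a placeholder):
#   last_parsed_before_fatal /path/to/commmgr.log
```

For the excerpt above, this would report /opt/smarts-ncm/XML/BasicOperations.dop.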

System logs may also show SELinux blocking the voyence service execution:

  • SELinux is preventing /usr/bin/bash from execute access on the file /opt/smarts-ncm/bin/service/voyence
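The message above is the human-readable form of an SELinux denial. The underlying AVC record is normally written by auditd, so a quick filter over the audit log can show whether the voyence script is affected. This is a sketch only: the function name is hypothetical, and /var/log/audit/audit.log is the usual auditd default rather than a location confirmed for this environment.

```shell
# List SELinux AVC denials that mention the voyence service script.
# $1: audit log to scan (defaults to the usual auditd location).
# Returns nonzero when no matching denial is found.
voyence_denials() {
    grep 'avc:' "${1:-/var/log/audit/audit.log}" \
        | grep 'denied' \
        | grep 'voyence'
}

# Usage (typically requires root to read the audit log):
#   voyence_denials
```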

Cause

This issue occurs when a customized package file, specifically CiscoApicSwitch.pkg, contains a syntax error (such as a stray < character).

Because BasicOperations.dop references these package files, a syntax error in the customized package causes the parsing of the entire device operation set to fail, leading to a fatal service shutdown.
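A stray < usually leaves a line with more opening than closing angle brackets, so a rough per-line count can help locate it. The sketch below is a heuristic, not a parser: the function name is hypothetical, and tags that legitimately span several lines, or embedded script text containing comparison operators, will be reported as false positives.

```shell
# Print "line-number: text" for every line whose count of "<" differs
# from its count of ">".
# $1: path to the package file to inspect.
unbalanced_bracket_lines() {
    awk '{
        line = $0
        # gsub returns the number of matches it replaced.
        opens  = gsub(/</, "<", line)
        closes = gsub(/>/, ">", line)
        if (opens != closes) printf "%d: %s\n", NR, $0
    }' "$1"
}

# Usage (path is a placeholder):
#   unbalanced_bracket_lines /path/to/CiscoApicSwitch.pkg
```

If the package file is well-formed XML in your installation (an assumption this article does not confirm) and xmllint is installed, `xmllint --noout` on the file would give a stricter check.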

SELinux interference may also prevent the service from executing its bash scripts properly after an upgrade.

Resolution

To resolve the commmgr startup failure, follow these steps to correct the package file and verify the environment:

Step 1: Fix the CiscoApicSwitch.pkg File

  1. Log in to the Device Server as an administrator.
  2. Locate and open the customized CiscoApicSwitch.pkg file for editing.
  3. Identify and remove the stray < character or any unclosed tags that were introduced during customization.
  4. Save the file.
  5. Synchronization Check: Ensure the corrected CiscoApicSwitch.pkg file is identical on both the Application Server (AS) and the Device Server (DS).
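One way to carry out the synchronization check is to compare a checksum of the file on each server. The sketch below assumes sha256sum (GNU coreutils) is available on both hosts; the function names are hypothetical, and the file path is a placeholder because the location of the customized package is not given here.

```shell
# Print only the SHA-256 digest of a file, so the values printed on the
# AS and the DS can be compared directly.
pkg_digest() {
    sha256sum "$1" | awk '{ print $1 }'
}

# Return 0 when two locally available copies have the same digest.
same_pkg() {
    [ "$(pkg_digest "$1")" = "$(pkg_digest "$2")" ]
}

# Usage: run on each server and compare the printed values.
#   pkg_digest /path/to/CiscoApicSwitch.pkg
```

If the two digests differ, copy the corrected file to the other server before restarting services.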

Step 2: Address SELinux Policy Restrictions

The setenforce command in step 2 changes the system's SELinux mode. Review it carefully before running.

  1. Check the current SELinux status: getenforce
  2. If the status is Enforcing, temporarily set it to Permissive to confirm whether SELinux is blocking service execution: setenforce 0 (this change does not persist across a reboot).
  3. If the service starts in Permissive mode, update your local SELinux policies to allow /usr/bin/bash to execute the /opt/smarts-ncm/bin/service/voyence script, then return SELinux to Enforcing: setenforce 1
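The SELinux check can be scripted so that it behaves predictably on hosts where SELinux tools are absent. The helper below is a sketch with a hypothetical function name. The commented commands show one common way to build a local policy module from recorded denials; they assume the audit and policycoreutils tools are installed, and "voyence_local" is a placeholder module name, not something the product defines.

```shell
# Return 0 when SELinux reports Enforcing, 1 for any other mode,
# and 2 when getenforce is not available on this host.
selinux_is_enforcing() {
    command -v getenforce >/dev/null 2>&1 || return 2
    [ "$(getenforce)" = "Enforcing" ]
}

# Possible follow-up once a denial is confirmed (run as root, and review
# the generated voyence_local.te before installing the module):
#   ausearch -c bash --raw | audit2allow -M voyence_local
#   semodule -i voyence_local.pp
#   setenforce 1
```

A module generated this way allows every denial matched by the search, so it may be broader than the single execute permission described above; narrowing the search or editing the generated policy is advisable.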

Step 3: Restart Services

  1. Restart the Smarts NCM services on the Device Server.
  2. Monitor commmgr.log to confirm that the service parses the device operations successfully and remains stable.
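After the restart, a simple check is whether any "Fatal error" line follows the most recent "Parsing Device Operations" line in the log. The sketch below uses a hypothetical function name and takes the log path as an argument, since the location of commmgr.log is installation-specific. It only inspects what has been logged so far, so it cannot by itself prove that the service remains stable over time.

```shell
# Return 0 when no "Fatal error" line follows the most recent
# "Parsing Device Operations" line in the given log; return 1 otherwise.
# $1: path to commmgr.log.
last_start_clean() {
    awk '
        BEGIN { fatal = 0 }
        /Parsing Device Operations:/ { fatal = 0 }  # a new parse attempt
        /Fatal error/                { fatal = 1 }  # that attempt failed
        END { exit fatal }
    ' "$1"
}

# Usage (path is a placeholder):
#   last_start_clean /path/to/commmgr.log && echo "no fatal error since last parse"
```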