Testing for Active Nodes in the Cluster - Clarity

Article ID: 51927

Products

Clarity PPM On Premise

Issue/Introduction

The following are symptoms of communication issues between Clarity nodes:

  1. The CSA server dropdown at <server_url>/niku/nu#action:security.logs does not show all nodes in the cluster, or shows the same server:port multiple times
  2. The command "admin tower" does not list all nodes/services
  3. Jobs error out with: ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired
  4. Processes do not start or do not complete
  5. Processes launch multiple times, once on each server running the BG service, even though the option "Do not start a new process if one is already running" is checked under Process > Start Options > Start Event

Cause

The communication between the Clarity nodes/servers is not working.

Environment

Clarity PPM On Premise

Resolution

Resolution 1: JDBC Ping

Prior to Clarity 15.4.1, multicast messaging at the router layer was required for Clarity cluster services to communicate. To reduce network congestion, Clarity used the Internet Group Management Protocol (IGMP) v3, which IPv4 systems use to report IP multicast group membership to neighboring multicast routers. The problem with this method is that some networks cannot support multicast due to provisioning, router, and cost constraints; hybrid cloud providers also run into this limitation because they commonly restrict multicast. JDBC Ping avoids multicast entirely, works in Clarity 15.3+, and is the approach used going forward.

For information on how to enable JDBC Ping, please refer to the following documentation: CSA: Configure JDBC Ping As An Alternative to Multicast (On-Premise Only)
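
JDBC Ping-style discovery works by having each node register itself in a shared database table instead of announcing itself over multicast, so a low-level way to confirm registration is to query that table. The sketch below is a hypothetical example: the table name JGROUPSPING and its columns are the JGroups JDBC_PING defaults, not a confirmed Clarity artifact, so verify the actual table name in your schema (and substitute your own credentials and TNS alias) before running it.

# Hypothetical check from a shell with the Oracle client installed.
# JGROUPSPING and its columns are JGroups defaults and are assumed here;
# Clarity's actual table name may differ.
sqlplus -s clarity_user/clarity_pass@CLARITYDB <<'EOF'
SELECT own_addr, cluster_name FROM jgroupsping ORDER BY cluster_name, own_addr;
EOF

Each active node should appear once per cluster name; missing rows suggest a node that never registered, while stale duplicates can indicate nodes that terminated without cleaning up.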

Steps to test and ensure JDBC Ping is working correctly in a multi-application environment:
1. Run the Tomcat access log import/analyze job and verify that it completes successfully and imports the logs.
2. Ensure all servers appear correctly, and only once, in the NSA as well as in the application (security.logs) page. There should be no duplicates.
3. Check that projects open correctly from the UI.
4. Ensure the Beacon service has started successfully on ALL servers in the cluster (run "service start beacon" on all application servers, followed by "service status beacon" on each application server to confirm the Beacon service has started and is staying up).
On the server running the NSA:
a) Run the following command: admin tower
b) Next, at the tower prompt, run this command:
> trace on
You should then start to see packets of traffic being transferred between ALL servers in the Clarity cluster. If packets of data are not being sent between ALL servers, there is a problem. Clarity has two topics used for messaging: CLRTY for application-specific messaging and CLRTY-SA for system-admin-level messaging.
To see the message flow for a specific topic, the group needs to be configured in the tower console. By default, the tower group is mapped to the CLRTY-SA topic:
> group CLRTY
> group CLRTY-SA
An example tower session is sketched after this list.
5. Review the Background, Application, and Beacon logs for failure details.
6. All cluster members must share a common NSA password. Run "admin password" to reset it if needed.
7. If the server has multiple IP addresses (NICs), configure the Beacon to bind to a single specific IP address. (Note: this is not required to 'enable' multicast, but if the Beacons are not configured correctly, that will prevent the servers from being visible in the NSA.) Stop, Remove, Add, and Deploy the Beacon service after making any changes; a sample command sequence follows this list.
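
For reference, a tower session for step 4 might look like the sketch below. This is illustrative only, not captured output; the parenthetical notes describe what you should observe, and real trace lines will name your own servers.

admin tower
> trace on
(trace output begins scrolling; entries should reference every server in the cluster)
> group CLRTY
(trace now shows the application-specific topic)
> group CLRTY-SA
(trace now shows the system-admin topic, which is the default)

If one or more servers never appear in the trace output, those nodes are not participating in cluster messaging, and their Beacon and JDBC Ping configuration should be rechecked.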
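
For step 7, after changing the Beacon's bind address, the service must be recycled and redeployed on each affected server. Assuming the same "service" command syntax used in step 4 (the stop/remove/add/deploy verbs are taken from the note above; verify them against your Clarity version), the sequence would be:

service stop beacon
service remove beacon
service add beacon
service deploy beacon
service start beacon
service status beacon

The final "service status beacon" confirms the Beacon came back up and is staying up.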

Resolution 2: Configure the load balancer
Directly access at least one of the servers instead of using the load balancer URL. If the issue does not reproduce when the load balancer is bypassed, then the issue is due to the load balancer not directing traffic to the proper node/server.
The network team will need to configure the load balancer.
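
To test this, you can request the Clarity URL on each application node directly, bypassing the load balancer. A minimal sketch using curl is below; the hostnames and port are placeholders for your own node addresses.

curl -I http://app-node-01.example.com:8080/niku/nu
curl -I http://app-node-02.example.com:8080/niku/nu

If every node answers (an HTTP 200, or a redirect to the login page) and the symptoms disappear when the load balancer is bypassed, the load balancer configuration is the culprit.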