Performance Issues - Related to ROC & NES


Article ID: 186122


Updated On:

Products

CA Release Automation - Release Operations Center (Nolio)
CA Release Automation - DataManagement Server (Nolio)

Issue/Introduction

A customer raised several questions about the performance of the ROC console and NES.

1. They currently run an Active-Passive setup for NAC, and at times the console and its functions are slow. They are therefore looking for advice on improving NAC performance. Is it possible to run Active-Active?

2. At the NES level, they see slowness when the number of concurrent jobs exceeds 30. Is there a way to set up queuing or another mechanism to counter this?

Environment

Release : 6.6

Component : CA RELEASE AUTOMATION CORE

Resolution

Answers to the above questions follow.


1: Release Automation currently supports only an Active-Passive configuration, and I am not sure whether there are plans for an Active-Active NAC setup. As for the slow UI, I think this is a product performance matter, which depends on how many concurrent users, releases, and so on the system can handle at a given point in time. The customer can report such slowness case by case so that we can investigate. I do not believe high availability of NAC would resolve it, because the slowness may originate on the execution side, that is, in the work carried out by Agents.

2: High availability/load balancing at NES: By design, I do not think we support any form of load balancing, HA, or delegation of jobs at the execution server level. The slowness observed is more likely related to the complexity of the customer's processes, which may need to be revisited. From the RA product perspective, we have certain benchmarks for how many EPS (events per second) can be handled in parallel, for parallel execution, and so on.

Regarding load balancing among NESes, there is an algorithm that sends agents to seek another NES when the number of agents connected to an NES exceeds a limit (350 by default, warn-capacity/capacity). This aims to spread the agents across the network of NESes, which is in some sense a form of load balancing.
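The agent-spreading behavior described above can be illustrated with a minimal sketch. This is not the product's implementation: the class and function names (ExecutionServer, connect_agent, spread) are hypothetical, and the "try the next NES in the list" policy is an assumption made only for illustration. Only the default limit of 350 comes from this article.

```python
from dataclasses import dataclass, field

# Default limit quoted in this article; everything else below is hypothetical.
DEFAULT_CAPACITY = 350


@dataclass
class ExecutionServer:
    """Toy stand-in for an NES that tracks its connected agents."""
    name: str
    capacity: int = DEFAULT_CAPACITY
    agents: set = field(default_factory=set)

    def has_room(self) -> bool:
        # Assumption: an NES accepts agents until it holds `capacity` of them.
        return len(self.agents) < self.capacity


def connect_agent(agent_id, servers):
    """Attach the agent to the first NES with room.

    Returns the NES name, or None if every NES is at capacity.
    """
    for server in servers:
        if server.has_room():
            server.agents.add(agent_id)
            return server.name
    return None


def spread(agent_ids, servers):
    """Place each agent in turn and return a mapping of agent to NES name."""
    return {agent_id: connect_agent(agent_id, servers) for agent_id in agent_ids}
```

With two toy servers of capacity 2 and five agents, the first two agents land on the first server, the next two on the second, and the fifth finds no room. The point of the sketch is only that a per-NES agent limit spreads agents across servers; it does not queue or balance jobs.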

However, with respect to both points above, we have already addressed some performance issues in 6.7, and we believe the customer should move to this version. If they still see a performance impact after that, it should be logged as a case so that we can investigate further and identify the actual root cause. The following fixes are already part of GA version 6.7:

  1. Automation Studio Permission page performance improvement
  2. Agent Management page performance improvement
  3. Java-level deadlock leading to 200 blocked Tomcat AJP threads
  4. Deserialization vulnerability fixed
  5. Performance improvement for user group retrieval
  6. DB Index Improvements [basic_parameter].[parent_id]
  7. ROC action "Create Step" - improved performance when creating steps during deployment creation and in the pre-deployment stage
  8. General performance improvements


If the slowness still recurs after upgrading to 6.7, the next step toward understanding its cause would be to check the customer's actual (not just configured or assumed) topology.