Operations Manager slows down as Ruby Bundle Process is Spiking Memory and CPU

Article ID: 293520

Updated On:

Products

Operations Manager

Issue/Introduction

Symptoms:

Operations Manager (Ops Manager) loads pages slowly and eventually crashes.

The 'bundle' process on Operations Manager uses 100% CPU, and its memory usage climbs slowly toward 100%.

The `ps` output below shows the process whose memory usage keeps growing:

tempest+ 1463 11.7 76.3 6797856 6244960 ? Sl 16:14 36:19 /home/tempest-web/tempest/web/vendor/bundle/ruby/2.3.0/bin/thin -C config/thin.production.yml start
ubuntu 5744 0.0 0.0 10480 2240 pts/4 S+ 21:23 0:00 grep 1463

Environment


Cause

This issue is caused by a bad IP range reservation specified under the Networks tab in Operations Manager Director. A bad range can cause Operations Manager to iterate over that range indefinitely, which spikes the CPU and memory utilization of the Ruby bundle process. The result is degraded performance and, eventually, a crash of Operations Manager.
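
To get a feel for the scale involved, the following Python sketch counts how many addresses a reservation range or a CIDR block spans. It only illustrates the arithmetic and is not Operations Manager's own code; the helper names `range_size` and `cidr_size` are hypothetical.

```python
import ipaddress


def range_size(ip_range: str) -> int:
    """Count the addresses in 'start-end', inclusive.

    A result of zero or less means the range runs backward.
    """
    start, end = (ipaddress.IPv4Address(p.strip()) for p in ip_range.split("-"))
    return int(end) - int(start) + 1


def cidr_size(cidr: str) -> int:
    """Count the addresses covered by a CIDR block."""
    # strict=False tolerates host bits being set, as in 10.72.0.0/5.
    return ipaddress.ip_network(cidr, strict=False).num_addresses


print(range_size("10.1.1.2-10.1.1.250"))  # a sane reservation
print(range_size("10.1.1.250-10.1.1.2"))  # the same range written backward
print(cidr_size("10.72.0.0/24"))          # a typical subnet
print(cidr_size("10.72.0.0/5"))           # a very large block
```

A negative count or a count in the millions is a strong hint that the setting is one of the bad cases this article describes.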

Resolution

The IP reservations set in Operations Manager network settings need to be reviewed for problems. 

You can do this in the GUI: in Operations Manager Director, go to the Networks tab.

Note: You may need to run `service tempest-web restart` on Operations Manager first to restart the process.

Alternatively, you can verify network settings in installation.yml by decrypting it from this location:

sudo -u tempest-web RAILS_ENV=production /home/tempest-web/tempest/web/scripts/decrypt /var/tempest/workspaces/default/installation.yml /tmp/installation.yml 

vim /tmp/installation.yml

Now check your network IP ranges and reservations for bad settings.
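
In a large decrypted file, it can help to list every IP range and CIDR block together with its line number before reviewing them by eye. The sketch below does this with regular expressions, so it does not depend on any particular key names in installation.yml. The function name `find_candidates` and the key names in the sample text are hypothetical and used for illustration only.

```python
import re

IPV4 = r"\d{1,3}(?:\.\d{1,3}){3}"
RANGE_RE = re.compile(rf"({IPV4})\s*-\s*({IPV4})")
CIDR_RE = re.compile(rf"({IPV4})/(\d{{1,2}})")


def find_candidates(text: str):
    """Return (line_number, kind, value) for each IP range or CIDR in text."""
    found = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for m in RANGE_RE.finditer(line):
            found.append((lineno, "range", f"{m.group(1)}-{m.group(2)}"))
        for m in CIDR_RE.finditer(line):
            found.append((lineno, "cidr", m.group(0)))
    return found


# Illustrative sample only; the real file layout and key names may differ.
sample = """\
cidr: 10.72.0.0/5
reserved_ip_ranges: 10.1.1.250-10.1.1.2, 10.1.1.5
"""

for lineno, kind, value in find_candidates(sample):
    print(f"line {lineno}: {kind} {value}")
```

To scan the decrypted copy from the step above, pass its contents instead of the sample, for example `find_candidates(open("/tmp/installation.yml").read())`.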


Examples of bad network settings

IP reservation range goes backward (from high number to low): 10.1.1.250-10.1.1.2

IP reservation has mismatching subnets: 10.72.94.141-10.72.82.255 or 110.X.X.X-10.X.X.X
Network has a very large CIDR block: 10.72.0.0/5
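
The three kinds of bad setting listed above can be checked mechanically. The following Python sketch is one possible way to do it, under stated assumptions: the function name `check_range` is hypothetical, the size threshold `MAX_ADDRESSES` is an arbitrary value chosen for illustration, and "mismatching subnets" is interpreted as a range endpoint falling outside the subnet's CIDR. It does not reproduce any validation that Operations Manager itself performs.

```python
import ipaddress

# Assumed threshold (the size of a /16); adjust it to suit your environment.
MAX_ADDRESSES = 65536


def check_range(ip_range: str, cidr: str) -> list:
    """Return the problems found for a 'start-end' reservation in a subnet."""
    problems = []
    # strict=False tolerates host bits being set, as in 10.72.0.0/5.
    network = ipaddress.ip_network(cidr, strict=False)
    start, end = (ipaddress.ip_address(p.strip()) for p in ip_range.split("-"))
    if start > end:
        problems.append("range goes backward")
    if start not in network or end not in network:
        problems.append("range endpoint outside subnet")
    if network.num_addresses > MAX_ADDRESSES:
        problems.append("CIDR block is very large")
    return problems


# The subnet CIDRs paired with each range here are assumptions for the demo.
print(check_range("10.1.1.2-10.1.1.250", "10.1.1.0/24"))
print(check_range("10.1.1.250-10.1.1.2", "10.1.1.0/24"))
print(check_range("10.72.94.141-10.72.82.255", "10.72.94.0/24"))
print(check_range("10.72.0.10-10.72.0.20", "10.72.0.0/5"))
```

An empty list means none of the three checks fired. Any other result names the setting to correct in the Networks tab.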

Once the bad network setting is identified and corrected, performance should recover immediately after you save the changes in Ops Manager.