Optimization of vRealize Operations Manager generated capacity planning metrics in 6.3

Article ID: 337502

Updated On:

Products

VMware Aria Suite

Issue/Introduction

This KB is applicable to vRealize Operations Manager systems upgraded from an earlier version to 6.3.
 
Starting in vRealize Operations Manager 6.3, capacity analytics metrics for an object instance are generated based on the policy applied to the object. As a result, some metrics that were available in earlier versions of vRealize Operations Manager may not be computed or available in 6.3. Turning these metrics back on in 6.3 may require changes to the policies.
 
This article documents information required to identify the symptoms and take steps to resolve the situation if required.
 
The metric availability can impact vRealize Operations Manager artifacts like views, reports, dashboards and symptoms.


Resolution

Key definitions

 
Resource container
 
A resource container is a dimension along which capacity analytics can be done on a given object. Examples include CPU|Demand, CPU|Allocation, and Memory|Demand.
 
Capacity Analytics metrics
 
Unlike metrics that are collected from an external source such as vSphere (also called adapter-collected or raw metrics), capacity analytics metrics are computed and published by vRealize Operations Manager. Most of these metrics are created and available per resource container. For instance, Recommended Size is a metric generated by capacity analytics in vRealize Operations Manager. It is generated for each resource container, and the actual metrics in the product take the form CPU|Demand|Recommended Size, CPU|Allocation|Recommended Size, Memory|Demand|Recommended Size, and so on.
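As a small illustration of this naming scheme, the sketch below joins a resource container and a metric identifier into the display form shown above and splits it back again. The function names are hypothetical, and the strings are the display names used in this article, which are not necessarily the internal metric keys used by vRealize Operations Manager.

```python
def capacity_metric_name(resource_container: str, metric_identifier: str) -> str:
    """Join a resource container and a metric identifier with the '|' separator."""
    return f"{resource_container}|{metric_identifier}"


def split_capacity_metric_name(name: str) -> tuple:
    """Split at the last '|': everything before it is the resource container."""
    resource_container, _, metric_identifier = name.rpartition("|")
    return resource_container, metric_identifier


# Display names taken from the examples in this article.
print(capacity_metric_name("CPU|Demand", "Recommended Size"))
print(split_capacity_metric_name("Memory|Demand|Recommended Size"))
```

Splitting at the last separator matters because the resource container name can itself contain a separator, as in CPU|Demand.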
 
Badge policy settings for resource containers
 
The Capacity/Time Remaining and Stress badges allow resource containers to be enabled or disabled for the particular badge. Only resource containers enabled for a badge in the policy will show up in the badge analysis pages.
 

Changes in 6.3

 
Before vRealize Operations Manager 6.3, some capacity analytics metrics may have been available irrespective of the resource container's status in the policies. From 6.3 onward, only metrics belonging to resource containers that are enabled in the policy are available.
 
Example:
 
Consider a host which had a policy where the Network IO|Usage Rate was disabled in all the badge level settings. For illustration, we will pick a capacity analytics metric Recommended Size. Pre vRealize Operations Manager 6.3, the host may have had the metric Network IO|Usage Rate|Recommended Size computed and available even though the associated resource container was disabled in all sections of the policy. Starting in 6.3, this metric will no longer be computed. To enable computation and availability of the metric, the associated resource container Network IO|Usage Rate should be enabled in one or more badge settings in the policy.
 
This change in 6.3 ensures that only the metrics required by the policy are computed and stored, which avoids spending compute and storage on metrics that may not be actively used.
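The rule described in this section can be sketched as a simple membership check. This is an illustrative model only, not product code: the policy is represented as a hypothetical dictionary that maps each badge setting to the set of resource containers enabled in it, and the function name is an assumption made for this example.

```python
BADGE_SETTINGS = ("Capacity/Time Remaining", "Stress")


def is_computed_in_63(policy, resource_container, controlling_settings=BADGE_SETTINGS):
    """Return True if the container is enabled in at least one controlling setting."""
    return any(
        resource_container in policy.get(setting, set())
        for setting in controlling_settings
    )


# Hypothetical host policy: Network IO|Usage Rate is disabled in every badge setting.
host_policy = {
    "Capacity/Time Remaining": {"CPU|Demand", "Memory|Demand"},
    "Stress": {"CPU|Demand", "Memory|Demand"},
}
before = is_computed_in_63(host_policy, "Network IO|Usage Rate")

# Enabling the container in one badge setting is enough to turn the metric back on.
host_policy["Stress"].add("Network IO|Usage Rate")
after = is_computed_in_63(host_policy, "Network IO|Usage Rate")

print(before, after)
```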


List of impacted metrics

 
The following are the resource containers and metric identifiers that may be impacted. Each metric identifier applies to all of the resource containers listed for the same object types.
 
The list also provides which settings in the policy can be used to enable the container and its metrics.
 
Object type: Hosts, Clusters, Datacenters, vCenter Servers, Custom Datacenters

Resource containers:

  • CPU|Demand
  • CPU|Allocation
  • Memory|Demand
  • Memory|Allocation
  • Memory|Consumed
  • Disk Space|Demand
  • Disk Space|Allocation
  • Network IO|Data receive rate
  • Network IO|Data Transmit Rate
  • Network IO|Usage Rate
  • Datastore IO|Outstanding IO requests
  • Datastore IO|Read Rate
  • Datastore IO|Write rate
  • Datastore IO|Reads per second
  • Datastore IO|Writes per second

Metric identifier: policy control to enable

  • Average Demand: Capacity/Time Remaining or Stress
  • Computed Demand: Capacity/Time Remaining or Stress
  • Current Size: Capacity/Time Remaining or Stress
  • Effective Demand: Capacity/Time Remaining or Stress
  • Is Oversized: Capacity/Time Remaining or Stress
  • Provisioned: Reclaimable Capacity
  • Recommended Size: Capacity/Time Remaining or Stress
  • Stress Free Demand: Capacity/Time Remaining or Stress
  • Total Capacity: Capacity/Time Remaining or Stress
  • Under used: Capacity/Time Remaining or Stress
  • Usable Capacity: Capacity/Time Remaining

Object type: Datastores

Resource containers:

  • Datastore IO|Outstanding IO requests
  • Datastore IO|Read Rate
  • Datastore IO|Write rate
  • Datastore IO|Reads per second
  • Datastore IO|Writes per second
  • Disk Space|Total
  • Disk Space|Allocation

Metric identifier: policy control to enable

  • Average Demand: Capacity/Time Remaining or Stress
  • Computed Demand: Capacity/Time Remaining or Stress
  • Current Size: Capacity/Time Remaining or Stress
  • Effective Demand: Capacity/Time Remaining or Stress
  • Is Oversized: Capacity/Time Remaining or Stress
  • Provisioned: Reclaimable Capacity
  • Recommended Size: Capacity/Time Remaining or Stress
  • Stress Free Demand: Capacity/Time Remaining or Stress
  • Total Capacity: Capacity/Time Remaining or Stress
  • Under used: Capacity/Time Remaining or Stress
  • Usable Capacity: Capacity/Time Remaining

Object type: Virtual Machine

Resource containers:

  • CPU
  • Memory
  • Memory (Consumed)
  • Disk Space – Total Usage
  • Network IO|Data receive rate
  • Network IO|Data Transmit Rate
  • Network IO|Usage Rate
  • Network IO (Host)|Data receive rate
  • Network IO (Host)|Data Transmit Rate
  • Network IO (Host)|Usage Rate
  • Datastore IO|Outstanding IO requests
  • Datastore IO|Read Rate
  • Datastore IO|Write rate
  • Datastore IO|Reads per second
  • Datastore IO|Writes per second

Metric identifier: policy control to enable

  • Average Demand: Capacity/Time Remaining or Stress
  • Computed Demand: Capacity/Time Remaining or Stress
  • Current Size: Capacity/Time Remaining or Stress
  • Effective Demand: Capacity/Time Remaining or Stress
  • Is Oversized: Capacity/Time Remaining or Stress
  • Recommended Size: Capacity/Time Remaining or Stress
  • Stress Free Demand: Capacity/Time Remaining or Stress
  • Total Capacity: Capacity/Time Remaining or Stress
  • Under used: Capacity/Time Remaining or Stress
  • Usable Capacity: Capacity/Time Remaining
  • Idle time: Reclaimable Capacity
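
For scripted cross-checks, the policy-control column of the list above could be encoded as a lookup table. The sketch below is one possible encoding, not a product API: the constant and function names are hypothetical, and the resource containers are left out to keep the example short.

```python
CTR = "Capacity/Time Remaining"
STRESS = "Stress"
RECLAIMABLE = "Reclaimable Capacity"

# Metric identifiers that appear for every object type in the list.
COMMON_CONTROLS = {
    "Average Demand": (CTR, STRESS),
    "Computed Demand": (CTR, STRESS),
    "Current Size": (CTR, STRESS),
    "Effective Demand": (CTR, STRESS),
    "Is Oversized": (CTR, STRESS),
    "Recommended Size": (CTR, STRESS),
    "Stress Free Demand": (CTR, STRESS),
    "Total Capacity": (CTR, STRESS),
    "Under used": (CTR, STRESS),
    "Usable Capacity": (CTR,),
}

# Object types whose list also includes the Provisioned metric.
INFRASTRUCTURE_TYPES = (
    "Hosts", "Clusters", "Datacenters", "vCenter Servers",
    "Custom Datacenters", "Datastores",
)

POLICY_CONTROLS = {
    object_type: {**COMMON_CONTROLS, "Provisioned": (RECLAIMABLE,)}
    for object_type in INFRASTRUCTURE_TYPES
}
# Virtual machines list Idle time instead of Provisioned.
POLICY_CONTROLS["Virtual Machine"] = {**COMMON_CONTROLS, "Idle time": (RECLAIMABLE,)}


def policy_controls(object_type, metric_identifier):
    """Return the policy settings that can enable the metric, or None if it is not listed."""
    return POLICY_CONTROLS.get(object_type, {}).get(metric_identifier)


print(policy_controls("Hosts", "Recommended Size"))
print(policy_controls("Virtual Machine", "Idle time"))
```

A result of None means the metric is not in the list for that object type, so a missing value would have some other cause.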


Troubleshooting to determine if a missing metric is due to the optimization or another cause

 
The following example illustrates the symptom that impacted metrics may exhibit and the potential actions that can be taken to restore the metric's state.
 
Symptom
 
One of the views has a metric that shows no data after the upgrade to 6.3. The metric data was available prior to the upgrade.
 
Before upgrade to 6.3:
 
 
After upgrade to 6.3:
 
As illustrated, the metric Recommended Size for the resource container Network IO|Usage Rate for hosts was displaying a value before the 6.3 upgrade, but is not displaying data after the upgrade.
 
Verification of the cause
 
To verify the root cause of this symptom, perform one or more of the following steps:
 
  • Cross-check the impacted metric against the List of impacted metrics above. In this case, the metric Recommended Size for the resource container Network IO|Usage Rate is listed for host systems. This confirms that the symptom is due to the change in vRealize Operations Manager capacity analytics in 6.3. If the metric with missing values is not listed, the cause is something else that is not covered here.
  • Check the policy settings applied to the object for the state of the resource container associated with the metric. Which policy settings to check (Capacity/Time Remaining or Stress settings) depends on the impacted metric and is given in the List of impacted metrics above. In this case, the resource container is Network IO|Usage Rate, and the metric Recommended Size is controlled by the Capacity/Time Remaining or Stress settings.

    The following illustrations show the Capacity/Time Remaining settings and the Stress settings for the Host System object kind. The highlighted item shows that the resource container Network IO|Usage Rate is not enabled in either policy setting. This further confirms that the metric is not generating a value in 6.3 because the associated resource container is disabled in the policy settings that control the metric.

    Capacity/Time Remaining settings:



    Stress settings:

 
Fixing the issue
 
Before attempting the steps to restore the metric, be aware of the following:
 
  • Restoring the metric requires enabling the resource container in at least one of the policy settings. The resource container's usage/demand will then affect a badge and possibly the alerts associated with that badge. Using the above scenario as an example, enabling the Network IO|Usage Rate resource container in the Stress settings means that the capacity analytics performed on the resource container may increase the Stress badge score, leading to alerts that would not have been present while the resource container was disabled.
  • The larger the number of enabled resource containers, the larger the number of metrics that need to be computed and stored. Be aware that there is a cost associated with enabling resource containers that were previously disabled.
 
Overall, in the case of an impacted metric, it is important to ask the question:
 
Is this metric on a particular resource container really important and how is it used in capacity planning exercises?
 
If the resource container is an important part of capacity planning, enabling computation of metrics and badges based on the resource container (enabling it in policy) is the right configuration.
 
To restore the impacted metric, enable the associated resource container in at least one of the policy controls given in the List of impacted metrics above. In the case illustrated here, the metric Network IO|Usage Rate|Recommended Size can be restored by selecting the Network IO|Usage Rate container in either the Capacity/Time Remaining settings or the Stress settings and saving the policy.
 
Note: It may take up to 24 hours after the policy settings change for the metric to be computed and a value to be displayed.
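
The verification and restore flow in this article can be summarized as a small decision function. This is a sketch under stated assumptions rather than a tool shipped with the product: the function name, the result codes, and the dictionary shapes for the impacted list and the policy are all hypothetical.

```python
def diagnose(metric_identifier, resource_container, impacted, policy):
    """Classify a missing capacity metric using the two verification steps."""
    controls = impacted.get(metric_identifier)
    if controls is None:
        # Step 1 failed: the metric is not in the impacted list.
        return "OTHER_CAUSE"
    enabled = any(
        resource_container in policy.get(setting, set()) for setting in controls
    )
    # Step 2: enabled in a controlling setting means a value may still be pending.
    return "WAIT_UP_TO_24H" if enabled else "ENABLE_IN_POLICY"


# Hypothetical inputs modeled on the host example in this article.
impacted_for_hosts = {"Recommended Size": ("Capacity/Time Remaining", "Stress")}
host_policy = {
    "Capacity/Time Remaining": {"CPU|Demand"},
    "Stress": {"CPU|Demand"},
}

first = diagnose("Recommended Size", "Network IO|Usage Rate", impacted_for_hosts, host_policy)

# Apply the fix: enable the container in one controlling setting, then re-check.
host_policy["Capacity/Time Remaining"].add("Network IO|Usage Rate")
second = diagnose("Recommended Size", "Network IO|Usage Rate", impacted_for_hosts, host_policy)

print(first, second)
```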