Missing Resource Pool in vSphere results in CPI or service instance creation errors

Article ID: 373789

Products

VMware Tanzu Application Service

Issue/Introduction

If a Resource Pool (RP) is defined as part of an availability zone in the Tanzu Platform BOSH Director tile, and that RP is later deleted from vSphere, VM creation in that availability zone can fail. The problem surfaces when the platform tries to create or recreate VMs, and it can leave multiple VMs in a "failing" state.

These symptoms can occur when attempting to recreate a VM using BOSH. The vSphere CPI can throw an error indicating that a Resource Pool is missing. The Resource Pool name that BOSH expects to find is defined in the BOSH Director tile, on the "Create Availability Zones" tab.

Alternatively, if a service instance plan (the example below is for Redis) uses an Availability Zone (AZ) whose Resource Pool is missing, it may throw an error like this:

org.springframework.data.redis.RedisConnectionFailureException: Cannot get Jedis connection; nested exception is redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool

Environment

vSphere (all versions)

VMware Tanzu Application Service for VMs (all versions)

Cause

Most often, a Resource Pool is deleted through operator error. However, when ESXi hosts crash, we have also seen a Resource Pool removed from TAS configurations.

Resolution

To restore the platform to normal operation, recreate the Resource Pool in vSphere.

Find the cluster that represents the availability zone that is affected. Right-click on the cluster and choose "New Resource Pool".

Look up the settings used by the other resource pools and record them in a note or a screenshot, so that the new Resource Pool can be configured to match.

Name the Resource Pool as defined in the BOSH Director tile, "Create Availability Zones" tab.

Once the new (restored) Resource Pool has been created, BOSH operations that interact with that AZ should work as expected again.
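To double-check which Resource Pool name each AZ expects, the names can also be read from the BOSH cloud config. The following is a minimal sketch rather than an official tool. It assumes the output of `bosh cloud-config` has already been parsed into a Python dictionary (for example with a YAML parser) and that the AZ entries follow the usual vSphere CPI layout, where each cluster entry carries a `resource_pool` property. The AZ, cluster, and pool names in the sample data are hypothetical, the list of pools that currently exist in vSphere has to be collected separately, and comparing names as exact strings is an assumption.

```python
def expected_resource_pools(cloud_config):
    """Map each AZ name to the (cluster, resource_pool) pairs it references."""
    expected = {}
    for az in cloud_config.get("azs", []):
        pairs = []
        props = az.get("cloud_properties") or {}
        for datacenter in props.get("datacenters", []):
            for cluster in datacenter.get("clusters", []):
                # Each cluster entry is a one-key mapping: {cluster_name: {...}}
                for cluster_name, cluster_props in cluster.items():
                    pool = (cluster_props or {}).get("resource_pool")
                    if pool:
                        pairs.append((cluster_name, pool))
        expected[az["name"]] = pairs
    return expected


def missing_resource_pools(expected, existing):
    """Return the AZs whose expected pools are absent from vSphere.

    `existing` maps a cluster name to the set of Resource Pool names
    currently present in that cluster (gathered by hand or by script).
    """
    missing = {}
    for az, pairs in expected.items():
        gone = [(c, p) for c, p in pairs if p not in existing.get(c, set())]
        if gone:
            missing[az] = gone
    return missing


# Hypothetical sample data, for illustration only.
sample_cloud_config = {
    "azs": [
        {"name": "az1", "cloud_properties": {"datacenters": [
            {"name": "dc1", "clusters": [{"cluster-1": {"resource_pool": "rp-az1"}}]}]}},
        {"name": "az2", "cloud_properties": {"datacenters": [
            {"name": "dc1", "clusters": [{"cluster-2": {"resource_pool": "rp-az2"}}]}]}},
    ]
}
existing_pools = {"cluster-1": {"rp-az1"}, "cluster-2": set()}

expected = expected_resource_pools(sample_cloud_config)
print(missing_resource_pools(expected, existing_pools))
# prints {'az2': [('cluster-2', 'rp-az2')]}
```

An empty result suggests that every Resource Pool named in the cloud config is present. Any AZ that is listed points at the cluster and pool name to recreate.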