VIO Remove stale service records from old pods

Article ID: 321793

Products

VMware Integrated OpenStack

Issue/Introduction

Symptoms:
  • In some scenarios, such as scaling out, a service's pods are deleted and new pods are created. The service records from the old pods then become stale, remaining in the service list in a "down" state:

[root@vioadmin1-vioshim-79984c8557-fmtdg /]# openstack compute service list
+----+------------------+-----------------------------------+------------+---------+-------+----------------------------+
| ID | Binary           | Host                              | Zone       | Status  | State | Updated At                 |
+----+------------------+-----------------------------------+------------+---------+-------+----------------------------+
|  2 | nova-consoleauth | nova-consoleauth-67b45d6797-7xz2p | internal   | enabled | down  | 2020-01-10T01:27:17.000000 |
|  3 | nova-conductor   | nova-conductor-868f8946fc-7mj4d   | internal   | enabled | down  | 2020-01-10T01:27:16.000000 |
|  6 | nova-scheduler   | nova-scheduler-6d768857fc-znvrv   | internal   | enabled | down  | 2020-01-10T01:27:17.000000 |
| 10 | nova-compute     | compute-0c6d3d92-c51              | nova       | enabled | up    | 2020-01-10T03:08:02.000000 |
| 11 | nova-compute     | compute-82f9a047-c381682          | nova-1     | enabled | up    | 2020-01-10T03:07:58.000000 |
| 20 | nova-scheduler   | nova-scheduler-6d768857fc-w2hbh   | internal   | enabled | up    | 2020-01-10T03:07:59.000000 |
| 28 | nova-consoleauth | nova-consoleauth-67b45d6797-xnp8l | internal   | enabled | up    | 2020-01-10T03:08:01.000000 |
| 29 | nova-conductor   | nova-conductor-868f8946fc-vcjnz   | internal   | enabled | up    | 2020-01-10T03:07:58.000000 |
| 37 | nova-compute     | compute-82f9a047-c381684          | nova-sriov | enabled | up    | 2020-01-10T03:07:58.000000 |
+----+------------------+-----------------------------------+------------+---------+-------+----------------------------+

  • The stale nova-scheduler record is reported as down.
The nova-osapi logs also report the service as down, for example in nova-osapi/0.log:
2020-01-23T13:02:05Z 2020-01-23 13:02:05.429 18 DEBUG nova.servicegroup.drivers.db [req-b383f119-952d-4927-a5e7-36963063502f 5eb3ec71a78a43578502c34b92e2f7bb 431e8499f53444689b10190789682fbd - default default] Seems service nova-scheduler on host nova-scheduler-78bc5d98d6-8xwkk is down. Last heartbeat was 2019-11-28 11:13:32. Elapsed time is 4844913.42971 is_up /usr/lib/python2.7/site-packages/nova/servicegroup/drivers/db.py:80
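As an immediate, one-off alternative to the cron job described in the Resolution below, stale records can also be removed by hand with the OpenStack client. This is a minimal sketch; the service ID used here (6, the old nova-scheduler entry in the listing above) is illustrative and must be taken from your own output:

# Show only the service records still marked down
# (-f value is the standard cliff "value" output formatter)
openstack compute service list -f value | grep -w down

# Delete a stale record by its ID, using the IDs from your own listing
openstack compute service delete 6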


Environment

VMware Integrated OpenStack 7.x

Resolution

Set up a cron job that runs hourly to check whether stale nova services exist:
  1. SSH into the manager node.
  2. Get the list of resources.
viocli get nova

root@oms [ /var/log ]#  viocli get nova
NAME    CREATION DATE         VALIDATION
nova1   2020-02-20 18:31:05   Success
  3. Update the resource with the name from above and set the manifest parameter "cron_job_service_cleaner" to true:
viocli update nova nova1
 
conf:
  nova:
    neutron:
      metadata_proxy_shared_secret: .Secret:managedencryptedpasswords:data.metadata_proxy_shared_secret
    vmware:
      passthrough: "false"
      tenant_vdc: "false"
manifests: <-----
  cron_job_service_cleaner: true <-----
  4. Save the file.
Note: After approximately one hour, the cron job removes all service records that are in a "down" state.
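To confirm that the cleanup ran, the CronJob can be inspected from the manager node. This is a sketch only: the openstack namespace and the service-cleaner job name are assumptions and may differ in your deployment:

# Check that the service-cleaner CronJob exists and has scheduled jobs
kubectl -n openstack get cronjob | grep service-cleaner
kubectl -n openstack get job | grep service-cleaner

Afterwards, re-run openstack compute service list (from a pod with the OpenStack client, as in the Symptoms section above) and verify that the stale "down" records are gone.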

Additional Information

Impact/Risks:
This change should not impact functionality.