Metrics Server job fails with "`parse_vm_extensions': Duplicate vm extension name 'master_managed_identity'" on BOSH Director during Ops Manager upgrade


Article ID: 293841


Updated On:

Products

Operations Manager

Issue/Introduction

Stale cloud configs on the BOSH Director can cause failures during an Ops Manager upgrade. Leftover cloud configs, and conflicts between them, can appear when a tile is removed and, in some cases, added back.

In this case, cloud configs left behind by two Tanzu Kubernetes Grid Integrated Edition (TKGI) tiles that were no longer deployed each defined a vm_extensions entry named master_managed_identity, which produced a duplicate.

The duplicate caused the metrics_server job to fail while parsing the cloud configs:
bosh/0:~# tail -n 10 /var/vcap/sys/log/director/metrics_server.stderr.log 
	from /var/vcap/packages/director/bin/bosh-director-metrics-server:29:in `load'
	from /var/vcap/packages/director/bin/bosh-director-metrics-server:29:in `<main>'
/var/vcap/data/packages/director/ef11310572119bee83485acf274bcc0fa0166427/gem_home/ruby/2.6.0/gems/bosh-director-0.0.0/lib/bosh/director/deployment_plan/cloud_manifest_parser.rb:120:in `parse_vm_extensions': Duplicate vm extension name 'master_managed_identity' (Bosh::Director::DeploymentDuplicateVmExtensionName)
	from /var/vcap/data/packages/director/ef11310572119bee83485acf274bcc0fa0166427/gem_home/ruby/2.6.0/gems/bosh-director-0.0.0/lib/bosh/director/deployment_plan/cloud_manifest_parser.rb:16:in `parse'
	from /var/vcap/data/packages/director/ef11310572119bee83485acf274bcc0fa0166427/gem_home/ruby/2.6.0/gems/bosh-director-0.0.0/lib/bosh/director/metrics_collector.rb:131:in `populate_network_metrics'
	from /var/vcap/data/packages/director/ef11310572119bee83485acf274bcc0fa0166427/gem_home/ruby/2.6.0/gems/bosh-director-0.0.0/lib/bosh/director/metrics_collector.rb:105:in `populate_metrics'
	from /var/vcap/data/packages/director/ef11310572119bee83485acf274bcc0fa0166427/gem_home/ruby/2.6.0/gems/bosh-director-0.0.0/lib/bosh/director/metrics_collector.rb:51:in `start'
	from /var/vcap/data/packages/director/ef11310572119bee83485acf274bcc0fa0166427/gem_home/ruby/2.6.0/gems/bosh-director-0.0.0/bin/bosh-director-metrics-server:26:in `<top (required)>'
	from /var/vcap/packages/director/bin/bosh-director-metrics-server:29:in `load'
	from /var/vcap/packages/director/bin/bosh-director-metrics-server:29:in `<main>'
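
To check quickly whether a Director is hitting this specific error, you can search the log for the error message. The following is a minimal sketch, not an official tool: the function name duplicate_extension_errors is hypothetical, and the default path is the log file shown above.

```shell
# Print each distinct "Duplicate vm extension name" message found in a log.
# Usage: duplicate_extension_errors [LOG_FILE]
duplicate_extension_errors() {
  log_file="${1:-/var/vcap/sys/log/director/metrics_server.stderr.log}"
  # -o prints only the matching part of each line; sort -u removes repeats.
  grep -o "Duplicate vm extension name '[^']*'" "$log_file" | sort -u
}
```

If the function prints nothing, the log contains no duplicate vm extension error and the failure likely has a different cause.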

Running monit summary on the BOSH Director VM shows the 'metrics_server' process with the status Does not exist.
# monit summary
The Monit daemon 5.2.5 uptime: 3d 15h 15m 
 
Process 'nats'                      running
Process 'postgres'                  running
Process 'director'                  running
Process 'worker_1'                  running
Process 'worker_2'                  running
Process 'worker_3'                  running
Process 'worker_4'                  running
Process 'worker_5'                  running
Process 'worker_6'                  running
Process 'worker_7'                  running
Process 'worker_8'                  running
Process 'worker_9'                  running
Process 'worker_10'                 running
Process 'worker_11'                 running
Process 'worker_12'                 running
Process 'worker_13'                 running
Process 'worker_14'                 running
Process 'worker_15'                 running
Process 'worker_16'                 running
Process 'worker_17'                 running
Process 'worker_18'                 running
Process 'worker_19'                 running
Process 'worker_20'                 running
Process 'worker_21'                 running
Process 'worker_22'                 running
Process 'worker_23'                 running
Process 'worker_24'                 running
Process 'worker_25'                 running
Process 'worker_26'                 running
Process 'worker_27'                 running
Process 'worker_28'                 running
Process 'worker_29'                 running
Process 'worker_30'                 running
Process 'director_scheduler'        running
Process 'metrics_server'            Does not exist
Process 'director_sync_dns'         running
Process 'director_nginx'            running
Process 'health_monitor'            running
Process 'uaa'                       running
Process 'credhub'                   running
Process 'system-metrics-agent'      running
Process 'blobstore_nginx'           running
Process 'registry'                  running
System 'system_localhost'           running
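
With this many worker processes, the one failing entry is easy to miss. As a small sketch, the filter below keeps only the Process lines whose status is not running. The function name not_running is hypothetical, and it assumes the monit summary layout shown above.

```shell
# Read `monit summary` output on stdin and print processes that are not running.
not_running() {
  # Process lines start with the word "Process"; a healthy one ends in "running".
  awk '$1 == "Process" && $NF != "running" { print }'
}

# On the Director VM, this would be used as:
#   monit summary | not_running
```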


Environment

Product Version: 2.10

Resolution

1. Run the bosh configs command to check whether more than one cloud config is listed for the same tile.

For example, in the following output you'll see two pivotal-container-service configs with two different IDs.
bosh configs
Using environment '10.34.28.5' as client 'ops_manager'
 
ID    Type     Name                                                             Team                                            Created At  
875*  cloud    default                                                          -                                               2021-08-17 01:24:14 UTC  
74*   cloud    pivotal-container-service-4ba8680e84ae0a71a613                   pivotal-container-service-4ba8680e84ae0a71a613  2019-10-04 18:24:01 UTC  
70*   cloud    pivotal-container-service-b68de0ea40b16c5a2828                   pivotal-container-service-b68de0ea40b16c5a2828  2019-10-01 21:13:33 UTC  
868*  cpi      default                                                          -                                               2021-08-13 22:35:21 UTC  
871*  runtime  cf-4fbccf2af6d498c6cac6-bosh-dns-aliases                         -                                               2021-08-14 14:57:08 UTC  
867*  runtime  director_runtime                                                 -                                               2021-08-13 22:35:20 UTC  
131*  runtime  nessus-agent                                                     -                                               2019-12-10 19:39:19 UTC  
865*  runtime  ops_manager_dns_runtime                                          -                                               2021-08-13 22:35:19 UTC  
866*  runtime  ops_manager_system_metrics_runtime                               -                                               2021-08-13 22:35:19 UTC  
208*  runtime  p-healthwatch-a19eb4b56cca7a096ceb-indicator-registration-agent  -                                               2020-04-15 22:28:53 UTC  
 
(*) Currently active
Only showing active configs. To see older versions use the --recent=10 option.
 
10 configs
 
Succeeded
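
To pull only the candidate names out of this table, you can filter on the Type and Name columns. This is a sketch under the assumption that the output uses the column layout shown above (ID, Type, Name, ...). The function name extra_cloud_configs is hypothetical.

```shell
# Read `bosh configs` table output on stdin and print the names of
# cloud-type configs other than "default".
extra_cloud_configs() {
  # Field 2 is the Type column and field 3 is the Name column.
  awk '$2 == "cloud" && $3 != "default" { print $3 }'
}

# Example: bosh configs | extra_cloud_configs
```

Each name printed is a cloud config to inspect. A name does not have to be stale just because it appears here.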

 

bosh config

To inspect the contents of each cloud config and confirm that both define the same vm_extensions names, run bosh config against each one:
bosh config --name pivotal-container-service-4ba8680e84ae0a71a613 --type cloud
Using environment '10.34.28.5' as client 'ops_manager'
 
ID          74  
Type        cloud  
Name        pivotal-container-service-4ba8680e84ae0a71a613  
Created At  2019-10-04 18:24:01 UTC  
Content     vm_extensions:  
            - cloud_properties:  
                managed_identity:  
                  type: UserAssigned  
                  user_assigned_identity_name: pks-master  
              name: master_managed_identity  
            - cloud_properties:  
                managed_identity:  
                  type: UserAssigned  
                  user_assigned_identity_name: pks-worker  
              name: worker_managed_identity  
bosh config --name pivotal-container-service-b68de0ea40b16c5a2828 --type cloud
Using environment '10.34.28.5' as client 'ops_manager'
 
ID          70  
Type        cloud  
Name        pivotal-container-service-b68de0ea40b16c5a2828  
Created At  2019-10-01 21:13:33 UTC  
Content     vm_extensions:  
            - cloud_properties:  
                managed_identity:  
                  type: UserAssigned  
                  user_assigned_identity_name: xyz  
              name: master_managed_identity  
            - cloud_properties:  
                managed_identity:  
                  type: UserAssigned  
                  user_assigned_identity_name: xyz  
              name: worker_managed_identity  
              
 
1 config
 
Succeeded
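
Both configs above define master_managed_identity and worker_managed_identity, which is the conflict the Director reports. The sketch below finds such repeated names across several configs. It assumes each file passed in holds only the YAML content of one cloud config, in the layout shown above. The function name duplicate_vm_extensions is hypothetical.

```shell
# Print vm_extensions names that appear more than once across the given
# YAML files (one cloud config per file).
duplicate_vm_extensions() {
  awk '
    FNR == 1 { section = "" }                 # reset at the start of each file
    /^[A-Za-z_]+:/ { section = $1 }           # remember the current top-level key
    # Item-level "name:" keys sit at "- name:" or two-space indentation.
    section == "vm_extensions:" && /^(- |  )name:/ { print $NF }
  ' "$@" | sort | uniq -d
}
```

For the two configs in this article, the function would print both master_managed_identity and worker_managed_identity. The Director reports only the first duplicate it encounters.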


bosh

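To confirm that the deployments no longer exist, run bosh vms against each deployment name. A 404 response stating that the deployment doesn't exist confirms that the config is stale: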
bosh -d pivotal-container-service-4ba8680e84ae0a71a613 vms
Using environment '10.34.28.5' as client 'ops_manager'
 
Listing deployment 'pivotal-container-service-4ba8680e84ae0a71a613' vms infos:
  Director responded with non-successful status code '404' response '{"code":70000,"description":"Deployment 'pivotal-container-service-4ba8680e84ae0a71a613' doesn't exist"}'
 
Exit code 1
bosh -d pivotal-container-service-b68de0ea40b16c5a2828 vms
Using environment '10.34.28.5' as client 'ops_manager'
 
Listing deployment 'pivotal-container-service-b68de0ea40b16c5a2828' vms infos:
  Director responded with non-successful status code '404' response '{"code":70000,"description":"Deployment 'pivotal-container-service-b68de0ea40b16c5a2828' doesn't exist"}'
 
Exit code 1
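
When there are many configs to check, you can do the same comparison in bulk. The sketch below assumes you have saved the candidate cloud config names and the existing deployment names into two plain text files, one name per line. It also assumes a config carries the same name as its deployment, as it does here. The function name orphaned_configs and both file arguments are hypothetical.

```shell
# Print config names (file $1, one per line) that have no deployment
# with the same name (file $2, one per line).
orphaned_configs() {
  configs_file="$1"
  deployments_file="$2"
  # -F: fixed strings, -x: match whole lines, -v: keep lines NOT listed in $2.
  # grep exits non-zero when nothing is printed, so that status is ignored.
  grep -Fvx -f "$deployments_file" "$configs_file" || true
}
```

Leave the default cloud config out of the first file, because it does not belong to any single deployment.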


2. To resolve this issue, delete the cloud configs that are no longer in use. Before deleting anything, back up each config to a file:
bosh config --name pivotal-container-service-xxxxx --type cloud > backup_pivotal-container-service-xxxxx
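
A redirect creates the backup file even when the command fails, so it is worth checking that the backup actually contains data. The wrapper below is a minimal sketch that reuses the same bosh config command. The function name backup_cloud_config and the optional directory argument are hypothetical.

```shell
# Save one cloud config to a file and fail if the backup is empty.
# Usage: backup_cloud_config CONFIG_NAME [BACKUP_DIR]
backup_cloud_config() {
  name="$1"
  dir="${2:-.}"
  out="$dir/backup_$name"
  bosh config --name "$name" --type cloud > "$out" || return 1
  # -s is true only when the file exists and is not empty.
  [ -s "$out" ] || { echo "empty backup: $out" >&2; return 1; }
  echo "$out"
}
```

On success the function prints the path of the backup file. Any other result means the config should not be deleted yet.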

3. After completing the backup, delete the stale configs, then run Apply Changes on the BOSH Director again.
bosh delete-config --name pivotal-container-service-4ba8680e84ae0a71a613 --type cloud 
Using environment '10.34.28.5' as client 'ops_manager'
 
Continue? [yN]: y
 
Succeeded
bosh delete-config --name pivotal-container-service-b68de0ea40b16c5a2828 --type cloud
Using environment '10.34.28.5' as client 'ops_manager'
 
Continue? [yN]: y
 
Succeeded
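
To make the backup step harder to skip, you can wrap the delete in a guard that refuses to run without a backup file. This is a sketch only: the function name delete_cloud_config and its backup-path argument are hypothetical, and the bosh delete-config call is the same one shown above, including its confirmation prompt.

```shell
# Delete a cloud config only if a non-empty backup file exists for it.
# Usage: delete_cloud_config CONFIG_NAME BACKUP_FILE
delete_cloud_config() {
  name="$1"
  backup="$2"
  if [ ! -s "$backup" ]; then
    echo "no backup at '$backup'; refusing to delete '$name'" >&2
    return 1
  fi
  bosh delete-config --name "$name" --type cloud
}
```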