To confirm the external_id parameter is missing:
1. Collect the list of Container Cluster(s) from NSX-T Manager:
# curl -k -u admin -H "Content-Type: application/json" -X GET https://localhost/api/v1/fabric/container-clusters/
Enter host password for user 'admin':
{
"results" : [ {
"external_id" : "c9ef8d33-xxxx-xxxx-xxxx-6050028bfacf",
"cluster_type" : "PAS",
"infrastructure" : {
"infra_type" : "vSphere"
},
"origin_properties" : [ ],
"resource_type" : "ContainerCluster",
"display_name" : "TEST",
"_last_sync_time" : 1580396792638
} ],
"result_count" : 1,
"sort_by" : "display_name",
"sort_ascending" : true
}
2. Query the container applications for the Container Cluster external_id collected in step 1, and check whether the external_id parameter is missing from the response:
# curl -k -u admin -H "Content-Type: application/json" -X GET "https://localhost/api/v1/fabric/container-applications?container_cluster_id=c9ef8d33-xxxx-xxxx-xxxx-6050028bfacf"
Enter host password for user 'admin':
Note that the external_id parameter is missing from the response:
{
"results" : [ {
"container_cluster_id" : "c9ef8d33-xxxx-xxxx-xxxx-6050028bfacf",
"container_project_id" : "b04095af-xxxx-xxxx-xxxx-f86eb561a81c",
"origin_properties" : [ ],
"resource_type" : "ContainerApplication",
"_last_sync_time" : 0
}
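The check in step 2 can also be scripted. The following is a minimal Python sketch, assuming the response of the container-applications call has already been parsed as JSON. The function name entries_missing_external_id is hypothetical and not part of any NSX-T tooling, and the sample input is only modelled on the output above.

```python
import json


def entries_missing_external_id(response):
    """Return the entries in a parsed API response that lack an external_id key."""
    return [item for item in response.get("results", [])
            if "external_id" not in item]


# Example input modelled on the step 2 output above (IDs are placeholders).
sample = json.loads("""
{
  "results": [ {
    "container_cluster_id": "c9ef8d33-xxxx-xxxx-xxxx-6050028bfacf",
    "container_project_id": "b04095af-xxxx-xxxx-xxxx-f86eb561a81c",
    "origin_properties": [ ],
    "resource_type": "ContainerApplication",
    "_last_sync_time": 0
  } ]
}
""")

missing = entries_missing_external_id(sample)
print(len(missing), "entry(ies) without external_id")
```

A non-empty result indicates that the cluster is affected.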
Symptoms:
Symptoms will include all of the following:
NCP Log will contain events similar to the following in ./ncp/ncp.stdout.log:
./ncp/ncp.stdout.log.1:1 2020-02-09T21:41:24.287Z 7d6c57b9-xxxx-xxxx-xxxx-c57c63241782 NSX 16223 - [nsx@6876 comp="nsx-container-ncp" subcomp="ncp" level="CRITICAL"] nsx_ujo.ncp.main Failed to initialize container orchestrator adaptor: 'external_id'
./ncp/ncp.stdout.log.1:1 2020-02-09T21:42:26.037Z 7d6c57b9-xxxx-xxxx-xxxx-c57c63241782 NSX 16447 - [nsx@6876 comp="nsx-container-ncp" subcomp="ncp" level="CRITICAL"] nsx_ujo.ncp.main Failed to initialize container orchestrator adaptor: 'external_id'
./ncp/ncp.stdout.log.1:1 2020-02-09T21:43:27.073Z 7d6c57b9-xxxx-xxxx-xxxx-c57c63241782 NSX 16673 - [nsx@6876 comp="nsx-container-ncp" subcomp="ncp" level="CRITICAL"] nsx_ujo.ncp.main Failed to initialize container orchestrator adaptor: 'external_id'
Resolution:
This issue is resolved in a fresh installation of NSX-T 2.5.1.
If upgrading from NSX-T 2.5.0 to NSX-T 2.5.1, the following steps must also be performed after the upgrade to NSX-T 2.5.1, using either Option (1) or Option (2):
- Stop all NCP instances.
Option (1) Remove only those container_cluster entries with invalid container-application entries.
Identify the impacted cluster IDs with the following API calls.
(a) GET https://<NSX Manager IP>/api/v1/fabric/container-clusters/
For each item in the result of (a):
(b) GET https://<NSX Manager IP>/api/v1/fabric/container-applications?container_cluster_id=<container_cluster_id>
(c) Check whether any container-application entry in the result of (b) lacks the external_id field.
If so, remove each cluster entry whose container-application entries are missing the external_id field with the following API call.
(d) DELETE https://<NSX Manager IP>/api/v1/fabric/container-clusters/<container_cluster_id>
Option (2) Remove all container_cluster entries.
Identify all the cluster IDs with the following API call.
(a) GET https://<NSX Manager IP>/api/v1/fabric/container-clusters/
For each item in the result of (a), use the following API call to remove it.
(b) DELETE https://<NSX Manager IP>/api/v1/fabric/container-clusters/<container_cluster_id>
- Start all NCP instances.
At this point, if the steps in the Workaround section were applied, the changes made to ncp.ini can be undone.
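The identification steps (a) to (c) of Option (1) can be sketched in Python as follows. This is an illustrative sketch, not a supported tool: the helper names make_fetcher and find_impacted_clusters are hypothetical, and the use of basic authentication, the disabled certificate verification (mirroring curl -k), and the assumption that each result set fits in a single response are all assumptions. The sketch only lists the impacted cluster IDs; the DELETE call is left as a deliberate manual step.

```python
import base64
import json
import ssl
import urllib.parse
import urllib.request


def make_fetcher(manager, user, password):
    """Build a function that GETs an NSX-T API path and returns parsed JSON.

    Assumes basic authentication; certificate checks are disabled,
    mirroring the `curl -k` calls used earlier in this article.
    """
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    token = base64.b64encode(f"{user}:{password}".encode()).decode()

    def fetch(path):
        req = urllib.request.Request(
            f"https://{manager}{path}",
            headers={"Authorization": f"Basic {token}",
                     "Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req, context=ctx) as resp:
            return json.load(resp)

    return fetch


def find_impacted_clusters(fetch):
    """Return external_ids of clusters that have an application lacking external_id."""
    impacted = []
    # Step (a): list all container clusters.
    clusters = fetch("/api/v1/fabric/container-clusters/")
    for cluster in clusters.get("results", []):
        cluster_id = cluster["external_id"]
        # Step (b): list the container applications of this cluster.
        query = urllib.parse.urlencode({"container_cluster_id": cluster_id})
        apps = fetch(f"/api/v1/fabric/container-applications?{query}")
        # Step (c): flag the cluster if any application has no external_id.
        if any("external_id" not in app for app in apps.get("results", [])):
            impacted.append(cluster_id)
    return impacted
```

For example, find_impacted_clusters(make_fetcher("<NSX Manager IP>", "admin", "<password>")) would return the list of cluster IDs to pass to the DELETE call. Passing the fetch function in as a parameter keeps the selection logic separate from the network access, so it can be exercised against saved API responses before it is pointed at a live NSX-T Manager.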
Workaround:
This workaround removes the “external_id” requirement by disabling the inventory feature.
(1) Log in to each diego_database instance, for example (from OpsMgr):
bosh ssh -d <deployment-id> diego-database/xxxxxxxx
(2) sudo su
(3) cd /var/vcap/data/jobs/ncp/yyyyyyyy/config/
where yyyyyyyy should be the latest deployment ID
(4) vi ncp.ini
Under the [nsx_v3] section, add:
enable_inventory = False
(as shown, with a capital 'F' in False)
(5) Restart NCP:
# monit restart ncp
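To confirm that the edit in step (4) is in place, the file contents can be checked with a short script. The following Python sketch assumes ncp.ini can be parsed by the standard configparser module, which may not hold for every ncp.ini. The function name inventory_disabled is hypothetical. The comparison is case-sensitive so that it matches the capitalization called for above.

```python
import configparser


def inventory_disabled(ini_text):
    """Return True if [nsx_v3] contains exactly 'enable_inventory = False'."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    # fallback=None covers a missing section or a missing option.
    value = parser.get("nsx_v3", "enable_inventory", fallback=None)
    return value == "False"


# Example fragment; a real ncp.ini contains many more sections and options.
example = """
[nsx_v3]
enable_inventory = False
"""
print(inventory_disabled(example))
```

On a diego_database instance, the text would be read from the ncp.ini file edited in step (4) instead of the example string.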