App deployments and restages fail after upgrading NCP to v4.1.1.5 in a TAS environment where NCP uses the NSX-T Policy API and is configured with no_snat
Cell xxxx-xxxx-xxxx failed to create container for instance xxxx-xxxx-xxxx: external networker encountered an error running 'up' action: exit status 1
The NCP log on the Diego Database VM shows an error similar to:
NSX 127835 - [nsx@6876 comp="nsx-container-ncp" subcomp="ncp" level="ERROR" errorCode="NCP00251"] nsx_ujo.ncp.main Failed to initialize container orchestrator adaptor: Invalid configuration were found: No ip addresses to create domain group dg-CLUSTERNAME for domain CLUSTERNAME
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/nsx_ujo/ncp/main.py", line 299, in start_ncp
coe.initialize()
File "/usr/local/lib/python3.8/dist-packages/nsx_ujo/ncp/pcf/adaptor.py", line 173, in initialize
self._initialize_services()
The NCP job on the Diego Database VM has crashed and cannot be restarted.
TAS - Tanzu Platform for Cloud Foundry
NCP tile v4.1.1.5 running in NSX-T Policy API mode with the no_snat configuration
Workaround: Edit the file /var/vcap/jobs/ncp/config/ncp.ini on each of the diego_database instances. Change the "container_ip_blocks" value from the IP block name currently in use to the UUID of that IP block. Then restart the NCP job using the "monit restart" command.
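The file edit in the workaround could be scripted along the following lines. This is a minimal sketch, not a supported tool: the function name replace_ip_blocks, the sample file content, the [nsx_v3] section header, and the UUID are all illustrative assumptions. It rewrites only the uncommented "container_ip_blocks" line and leaves every other line untouched. Back up ncp.ini before applying anything like this to the real file.

```python
import re
from typing import List


def replace_ip_blocks(ini_text: str, block_ids: List[str]) -> str:
    """Return ini_text with the container_ip_blocks value set to block_ids.

    Hypothetical helper: only the first-class (uncommented) setting line is
    rewritten; comments and all other options are preserved as-is.
    """
    # [ \t]* instead of \s* so the match never spills onto a neighboring line.
    pattern = re.compile(r"^([ \t]*container_ip_blocks[ \t]*=[ \t]*).*$", re.MULTILINE)
    new_value = ",".join(block_ids)
    new_text, count = pattern.subn(lambda m: m.group(1) + new_value, ini_text)
    if count == 0:
        raise ValueError("no container_ip_blocks setting found")
    return new_text


# Illustrative content only; the real ncp.ini contains many more options.
sample = (
    "[nsx_v3]\n"
    "# IP blocks used for container networks\n"
    "container_ip_blocks = tas-container-block\n"
)

# Hypothetical UUID standing in for the real IP block ID from NSX.
updated = replace_ip_blocks(sample, ["11111111-2222-3333-4444-555555555555"])
print(updated)
```

Because the manual edit is made under /var/vcap/jobs, it is a workaround only; the tile configuration change is what keeps the value in place.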
As a permanent fix, modify the configuration of the NCP tile in Ops Manager.
- In the NCP pane "IP Blocks of Container Networks", delete the current IP block name, then add the IP block UUID. The value of this entry is the ID (or IDs) of the IP blocks used by the TAS tile, which can be retrieved from the NSX UI or API.
- Run Apply Changes for the NCP tile.
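When the IDs are retrieved through the NSX API rather than the UI, mapping block names to IDs might look like the sketch below. It assumes the input is the JSON body returned by the Policy API IP block listing (believed to be GET /policy/api/v1/infra/ip-blocks), where each entry under "results" carries "id" and "display_name". The function name resolve_ip_block_ids and the sample data are hypothetical, and no network call is made here.

```python
from typing import Dict, List


def resolve_ip_block_ids(names: List[str], listing: Dict) -> List[str]:
    """Map IP block display names to their IDs using a listing response.

    Hypothetical helper: raises LookupError when a name is missing or
    ambiguous, since guessing an ID would misconfigure the tile.
    """
    by_name: Dict[str, List[str]] = {}
    for block in listing.get("results", []):
        by_name.setdefault(block["display_name"], []).append(block["id"])

    ids = []
    for name in names:
        matches = by_name.get(name, [])
        if len(matches) != 1:
            raise LookupError(f"{name!r} matched {len(matches)} IP blocks")
        ids.append(matches[0])
    return ids


# Illustrative response body; real IDs come from the NSX environment.
sample_listing = {
    "results": [
        {"id": "11111111-2222-3333-4444-555555555555", "display_name": "tas-container-block"},
        {"id": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee", "display_name": "other-block"},
    ]
}

block_ids = resolve_ip_block_ids(["tas-container-block"], sample_listing)
print(",".join(block_ids))
```

Failing on zero or multiple matches is a deliberate choice in this sketch: display names are not guaranteed to be unique, which is one reason an ID is the safer value to configure.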