Operations (Ops) Manager Apply Changes fails after proxy settings are removed. The Changelog shows the following:
Updating instance 'bosh/0'... Finished (00:01:17)
Waiting for instance 'bosh/0' to be running... Failed (00:05:04)
Failed deploying (00:13:54)
Stopping registry... Finished (00:00:00)
Cleaning up rendered CPI jobs... Finished (00:00:00)
Deploying:
  Received non-running job state: 'starting'
Exit code 1
Further investigation shows that the BOSH Director UAA job fails to start with the following error:
/var/vcap/jobs/uaa/bin/configure_proxy: line 13: proxy_conf[1]: unbound variable
Newer versions of UAA enforce strict handling of environment variables in their job scripts. If a UAA script references a variable that has no value bound to it, the script fails with an unbound variable error.
When the BOSH Director is recreated during Apply Changes, the Director's UAA job uses values from the proxy_settings table in the tempest_production database located within Ops Manager.
The proxy_settings table includes three columns of interest: http_proxy, https_proxy, and no_proxy.
Initially, the values in these columns are null. If proxy settings are configured in Ops Manager, the new values are written to this table, as expected. However, if those values are later removed, blank strings are written instead of null. UAA therefore continues to believe it has proxy information to process, which triggers the following action in the configure_proxy script:
export HTTP_PROXY=''
export http_proxy=''
proxy_conf=(`echo $HTTP_PROXY | tr ":" " " | tr "\/" " "`)
HTTP_PROXY_JAVA_OPTIONS="$HTTP_PROXY_JAVA_OPTIONS -Dhttp.proxyHost=${proxy_conf[1]} -Dhttp.proxyPort=${proxy_conf[2]} "
This causes the BOSH Director UAA to fail during the deployment (Apply Changes): with HTTP_PROXY blank, the proxy_conf array is empty, so the reference to proxy_conf[1] fails under the new strict handling of environment variables.
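The failure mode can be reproduced outside of UAA with a few lines of bash. The following is a minimal sketch, not the actual configure_proxy script: it assumes the job script runs with `set -u` (nounset) enabled, and both the `try_proxy` helper and the `proxy.example.com` address are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: reproduces the failure mode, not the real configure_proxy script.
# Assumption: the job script runs with "set -u" (nounset) enabled.

# try_proxy is a hypothetical helper. It runs the parsing logic in a child
# bash process so that the nounset failure does not terminate the caller.
try_proxy() {
  HTTP_PROXY="$1" bash -c '
    set -u
    # Same splitting as in the article: turn ":" and "/" into spaces.
    proxy_conf=($(echo "$HTTP_PROXY" | tr ":" " " | tr "/" " "))
    echo "host=${proxy_conf[1]} port=${proxy_conf[2]}"
  ' 2>&1
}

# A populated value splits into scheme, host, and port.
try_proxy 'http://proxy.example.com:8080'

# A blank value leaves the array empty, so index 1 is unbound.
try_proxy '' || echo "child shell exited with an error"
```

With a blank value the array has no elements at all, so the first indexed reference is what aborts the script. A null value in the database would avoid this path entirely, because the proxy block would not be rendered in the first place.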
This is especially important because Ops Manager settings may have been exported and imported repeatedly across minor-version upgrades. For example, a proxy change made in Ops Manager 2.3.x could have carried blank strings all the way up to 2.7.x if the settings were exported and imported at each upgrade. The failure only appears once a newer UAA version that requires strict usage of environment variables is deployed.
This issue is seen in Ops Manager v2.7.x. A bug report has been submitted and a tracker created.
The workaround is to manually delete or update the particular row in the proxy_settings table in the tempest_production database.
On the Ops Manager VM, connect to the tempest_production database:
$ sudo -u postgres psql -d tempest_production
Query the proxy_settings table:
select * from proxy_settings;
tempest_production=# select * from proxy_settings;
 id | http_proxy | https_proxy | no_proxy |         created_at         |        updated_at
----+------------+-------------+----------+----------------------------+---------------------------
  1 |            |             |          | 2020-01-08 15:38:22.158225 | 2020-01-08 15:38:24.55358
(1 row)
Check whether the http_proxy, https_proxy and no_proxy columns are null:
SELECT * FROM proxy_settings WHERE https_proxy IS NULL AND http_proxy IS NULL AND no_proxy IS NULL;
If the query returns 0 rows, the values are not null.
tempest_production=# SELECT * FROM proxy_settings WHERE https_proxy IS NULL AND http_proxy IS NULL AND no_proxy IS NULL;
 id | http_proxy | https_proxy | no_proxy | created_at | updated_at
----+------------+-------------+----------+------------+------------
(0 rows)
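Because psql displays null and an empty string the same way by default, it can help to label each column explicitly. The sketch below is one possible way to do that; `proxy_state_sql` is a hypothetical helper name, and the query assumes the three columns are text-like so that they can be compared against ''.

```shell
#!/usr/bin/env bash
# Sketch only. proxy_state_sql is a hypothetical helper that prints a query
# classifying each proxy column as null, blank (empty string), or set.
proxy_state_sql() {
  cat <<'SQL'
SELECT id,
       CASE WHEN http_proxy IS NULL THEN 'null'
            WHEN http_proxy = '' THEN 'blank'
            ELSE 'set' END AS http_proxy_state,
       CASE WHEN https_proxy IS NULL THEN 'null'
            WHEN https_proxy = '' THEN 'blank'
            ELSE 'set' END AS https_proxy_state,
       CASE WHEN no_proxy IS NULL THEN 'null'
            WHEN no_proxy = '' THEN 'blank'
            ELSE 'set' END AS no_proxy_state
FROM proxy_settings;
SQL
}

# On the Ops Manager VM, the statement could be piped into psql:
#   proxy_state_sql | sudo -u postgres psql -d tempest_production
proxy_state_sql
```

Any column reported as blank holds an empty string rather than null and is a candidate for the cleanup described in this article.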
If there are blank (non-null) values in the http_proxy, https_proxy and no_proxy columns, then delete the row by executing the following query:
delete from proxy_settings;
Alternatively, update the row and set the affected columns back to null. The general form is:
UPDATE tablename SET column1 = NULL, column2 = NULL WHERE id IN (1);
For example, to set all three columns back to null, run:
UPDATE proxy_settings SET http_proxy = NULL, https_proxy = NULL, no_proxy = NULL WHERE id IN (1);
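The UPDATE above resets all three columns unconditionally. If some columns may still hold real proxy values, a more selective variant is to convert only empty strings to null with the standard SQL NULLIF function. This is a sketch under assumptions, not a documented Ops Manager procedure: `null_blank_proxies_sql` is a hypothetical helper name, and the statement assumes the columns are text-like.

```shell
#!/usr/bin/env bash
# Sketch only. null_blank_proxies_sql is a hypothetical helper that prints
# SQL turning empty-string proxy values into NULL and leaving others alone.
null_blank_proxies_sql() {
  cat <<'SQL'
BEGIN;
-- NULLIF(a, b) returns NULL when a equals b, otherwise it returns a.
UPDATE proxy_settings
   SET http_proxy  = NULLIF(http_proxy, ''),
       https_proxy = NULLIF(https_proxy, ''),
       no_proxy    = NULLIF(no_proxy, '');
-- Show the result before committing.
SELECT id,
       http_proxy IS NULL  AS http_proxy_is_null,
       https_proxy IS NULL AS https_proxy_is_null,
       no_proxy IS NULL    AS no_proxy_is_null
  FROM proxy_settings;
COMMIT;
SQL
}

# On the Ops Manager VM, the statements could be piped into psql:
#   null_blank_proxies_sql | sudo -u postgres psql -d tempest_production
null_blank_proxies_sql
```

Replacing COMMIT with ROLLBACK turns this into a dry run that prints the resulting state without changing the table.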