Deploy-all errand of credhub-service-broker deployment fails on "operation: bind"

Article ID: 436186

Updated On:

Products

VMware Tanzu Platform - Cloud Foundry

Issue/Introduction

  • During Apply Changes on the Credhub tile, the deploy-all errand fails.
  • Error messages in the Ops Manager change log show events like:

    Job (########-####-####-####-a037f88c52b3) failed: bind could not be completed: Service broker error: There was a problem completing your request. Please contact your operations team providing the following information: service: p-cloudcache, service-instance-guid: ########-####-####-####-ccd72d8d31a9, broker-request-id: ########-####-####-####-c655b7c4cb4b, operation: bind  
                 FAILED  
  • When checking the associated BOSH task, the following failure appears for the deploy-all errand:

    Instance   deploy-all/########-####-####-####-bbb8a5f92ff8  
    Exit Code  1  
    Stdout     cf version 8.16.0+4b92b73.2025-09-18  
              cf api api.system.<DOMAIN>.com  
              cf auth  
              cf target -o credhub-service-broker-org  
              cf target -s credhub-service-broker-space  
              cf push --redact-env credhub-service-broker -f /var/vcap/jobs/deploy-all/config/manifest.yml -s cflinuxfs4 --no-start  
              cf push --no-route -b binary_buildpack -p /tmp/setup -u process setup -c sleep infinity  
              Current State: FAILED  
              cf logs --recent setup  
                
    Stderr     Using cflinuxfs4 stack  
  • The above task runs on the deploy-all instance VM and performs a cf push against a setup manifest. The setup app runs a task that executes a seed.sh script, which issues credhub generate commands to create Credhub users for the brokered instances.
  • Check the output of the seed.sh script run by the setup task, in the credhub-service-broker-org org and credhub-service-broker-space space, with the command:

    # cf tasks setup

    Results will show 413 errors like:
     
    2026-04-07T12:13:32.50+0530 [APP/TASK/#####901/0] ERR curl: (22) The requested URL returned error: 413 
    2026-04-07T12:13:32.51+0530 [APP/TASK/#####901/0] OUT {"scope":["uaa.none"],"client_id":"credhub-service-broker","resource_ids":["none"],"authorized_grant_types":["client_credentials"],"autoapprove":[],"authorities":["credhub.write","credhub.read"],"lastModified":1568795472000,"required_user_groups":[]} {"status":"ok","message":"secret updated"} Exit status 22
  • On the credhub VM, /var/vcap/sys/log/credhub/credhub.log will show errors like:

    2026-04-07T05:26:26.405Z [https-jsse-nio-8844-exec-5] .... WARN — SqlExceptionHelper: SQL Error: 1452, SQLState: 23000 
    2026-04-07T05:26:26.405Z [https-jsse-nio-8844-exec-5] .... ERROR — SqlExceptionHelper: (conn=765380) Cannot add or update a child row: a foreign key constraint fails (`credhub`.`encrypted_value`, CONSTRAINT `encryption_key_uuid_fkey` FOREIGN KEY (`encryption_key_uuid`) REFERENCES `encryption_key_canary` (`uuid`) ON DELETE RESTRICT ON UPDATE RESTRICT)
    2026-04-07T05:26:26.411Z [https-jsse-nio-8844-exec-5] .... ERROR — ExceptionHandlers: Value exceeds the maximum size.
  • MySQL logs on /var/vcap/sys/log/pxc-mysql/mysql.err.log will show errors like:

    2026-04-08T13:14:28.725702Z 9 [ERROR] [MY-010584] [Repl] Replica SQL: Could not execute Write_rows event on table credhub.encrypted_value; Cannot add or update a child row: a foreign key constraint fails (credhub.encrypted_value, CONSTRAINT encryption_key_uuid_fkey FOREIGN KEY (encryption_key_uuid) REFERENCES encryption_key_canary (`uuid`) ON DELETE RESTRICT ON UPDATE RESTRICT), Error_code: 1452; handler error HA_ERR_NO_REFERENCED_ROW; the event's source log FIRST, end_log_pos 0, Error_code: MY-001452
  • If you encounter this on components other than Credhub, see Error: "Cannot add or update a child row: a foreign key constraint fails" after TAS/EAR tile upgrade for details.
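
To check quickly whether collected logs contain the symptoms listed above, the following is a minimal sketch that counts each failure signature quoted in this article. The function name count_signatures is hypothetical, and it assumes the relevant logs (for example credhub.log, mysql.err.log, or saved cf logs output) have already been copied to local files.

```shell
# Count the failure signatures quoted in this article across one or more log files.
# Usage: count_signatures FILE...
count_signatures() {
    for sig in \
        'returned error: 413' \
        'SQL Error: 1452' \
        'foreign key constraint fails' \
        'Value exceeds the maximum size'
    do
        # grep -F matches the signature as a fixed string; -c prints the number
        # of matching lines. "|| true" keeps a zero count from being an error.
        n=$(cat "$@" | grep -F -c -- "$sig" || true)
        printf '%s\t%s\n' "$n" "$sig"
    done
}
```

For example, count_signatures credhub.log mysql.err.log prints one tab-separated line per signature. Non-zero counts for both the 1452 and the foreign key signatures are consistent with the issue described here.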

Environment

This problem was observed in Elastic Application Runtime (TAS) 10.2.7 with Credhub tile 1.6.7. It may appear in any component that depends on the EAR/TAS MySQL database, specifically when the database is configured as a 3-node cluster.

 

NOTE: This problem has been observed in Cloud Controller components as well as in Credhub.

Cause

The TAS/EAR 3-node MySQL cluster is out of sync, even though mysql-diag reports the nodes as Synced. From a cluster perspective, Synced means the node is a healthy Galera member applying the cluster's transactions. It does not assert per-table consistency or per-foreign-key equality across nodes. When a single node holds a foreign key value that is inconsistent between the tables the constraint applies to, the errors noted in the Issue/Introduction occur: an INSERT on one node succeeds, but on replication one of the other members fails to apply it because the parent table row does not exist on that member.
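
To illustrate what a missing parent row on one member means in practice, here is a minimal sketch that compares parent-table keys exported from two nodes. It is not part of the documented procedure. It assumes each node's keys were exported to a text file, one per line, for example with a query such as SELECT HEX(uuid) FROM credhub.encryption_key_canary run locally on each node. That query, the file names, and the function name missing_parent_keys are all assumptions.

```shell
# Print parent keys present on the reference node but missing on another node.
# Both arguments are files with one key per line (hypothetical per-node exports).
missing_parent_keys() {
    reference=$1
    candidate=$2
    # -F fixed strings, -x whole-line match, -v invert:
    # keep the reference lines that do not appear in the candidate file.
    grep -F -x -v -f "$candidate" "$reference" || true
}
```

For example, missing_parent_keys node0.txt node1.txt lists keys that node1 lacks. Any output suggests that a child row referencing one of those keys could not be applied on node1, even while the node reports Synced.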

 

A manual INSERT request from the MySQL node will force a Primary vote, which exposes the problem node and moves it into an Inconsistent STATE with CLUSTER STATUS Disconnected. The problem node will not automatically restart or rejoin the cluster if a manual INSERT query is sent.

 

The cause of this inconsistent state is under investigation.

 

Resolution

See "Manually forcing a MySQL node to rejoin the HA cluster" for details and command syntax for the steps below.

 

Identify:

Run mysql-diag from an SSH session on the mysql_monitor instance to find the node currently serving traffic through the proxy.

  • The mysql-diag output reports, in yellow, the node to which traffic is currently proxied, like this:
NOTE: Proxies will currently attempt to direct traffic to "mysql/########-####-####-####-1088cab2fc02"
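
When the mysql-diag output is long, the proxied node can be pulled out of it with a small filter. This is a minimal sketch that assumes the NOTE line keeps the wording shown above. The function name proxied_node is hypothetical.

```shell
# Read mysql-diag output on stdin and print the instance named in the
# "Proxies will currently attempt to direct traffic to" NOTE line.
proxied_node() {
    # -n suppresses default output; the trailing p prints only lines where
    # the substitution matched, leaving just the quoted instance name.
    sed -n 's/.*Proxies will currently attempt to direct traffic to "\([^"]*\)".*/\1/p'
}
```

Usage on the mysql_monitor VM would be mysql-diag | proxied_node, which prints a value of the form mysql/<instance-guid>.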



Isolate:

bosh ssh into the node to which traffic is being proxied and run:

  • # sudo monit stop galera-init


Verify:

Check whether the "foreign key constraint" errors stop and the other 2 nodes form a healthy cluster (run mysql-diag again from the mysql_monitor VM to confirm the cluster is healthy). If both conditions are met, the node on which galera-init was stopped is the problem node.



Reset:

If the errors stopped and the other 2 nodes form a healthy cluster, move the local data on the problem node aside as a backup to force a fresh sync:

  • # sudo mv /var/vcap/store/pxc-mysql /var/vcap/store/pxc-mysql-backup

    NOTE: Do not perform this step unless you have ensured the MySQL cluster is running with 2 healthy nodes after stopping galera-init and confirmed the foreign key constraint errors have stopped.
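
Because mv places the source inside the target when the target directory already exists, a small guard can help avoid nesting or clobbering an earlier backup. The following is a minimal sketch, with the hypothetical function name backup_datadir. It does not replace the cluster health checks required by the NOTE above.

```shell
# Move SRC to DST only when SRC exists and DST does not, so an earlier
# backup is never overwritten or nested into.
backup_datadir() {
    src=$1
    dst=$2
    if [ ! -d "$src" ]; then
        echo "source directory not found: $src" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        echo "backup target already exists: $dst" >&2
        return 1
    fi
    mv "$src" "$dst"
}
```

From a root shell on the problem node, this would be called as backup_datadir /var/vcap/store/pxc-mysql /var/vcap/store/pxc-mysql-backup.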


Restore:

Restart the service:

  • # sudo monit start galera-init


Confirm:

  • Run mysql-diag to ensure all 3 nodes return to a Synced state.
  • Check that the "foreign key constraint" errors do not recur.
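
One way to confirm that the errors have not recurred is to look only at log entries written after the node rejoined. This is a minimal sketch that assumes log lines start with an ISO 8601 UTC timestamp, as in the mysql.err.log excerpt earlier in this article. The function name fk_errors_since is hypothetical.

```shell
# Print log lines at or after SINCE that contain the foreign key failure text.
# SINCE is an ISO 8601 UTC timestamp prefix, e.g. 2026-04-08T14:00:00.
# The log is read from stdin.
fk_errors_since() {
    since=$1
    awk -v since="$since" '
        # ISO 8601 timestamps in the same time zone sort lexicographically,
        # so a plain string comparison on the first field is sufficient.
        $1 >= since && index($0, "foreign key constraint fails") { print }
    '
}
```

For example, fk_errors_since 2026-04-08T14:00:00 < /var/vcap/sys/log/pxc-mysql/mysql.err.log should print nothing once the cluster is consistent again.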

 

 

Gather RCA Data:

To assist with investigation, if you encounter the ForeignKey Constraint failure on TAS/EAR 10.2.x, please gather the following from all 3 MySQL nodes, create a Support Request with the Tanzu by Broadcom team, then upload the logs and MySQL query outputs to the SR:

  1. Gather the my.cnf on each node from:  /var/vcap/jobs/pxc-mysql/config/my.cnf 

    # sudo cp /var/vcap/jobs/pxc-mysql/config/my.cnf /var/vcap/sys/log/my.cnf.bak

  2. Show MySQL Variables from MySQL VM SSH:
    • SHOW GLOBAL VARIABLES SQL output

      # sudo mysql --defaults-file=/var/vcap/jobs/pxc-mysql/config/mylogin.cnf -e "SHOW GLOBAL VARIABLES" > /var/vcap/sys/log/global_variables.txt

    • SHOW ENGINE INNODB STATUS SQL output:

      # sudo mysql --defaults-file=/var/vcap/jobs/pxc-mysql/config/mylogin.cnf -e "SHOW ENGINE INNODB STATUS" > /var/vcap/sys/log/global_engine_innodb_status.txt

  3. The files from Steps 1 and 2 will be included in the MySQL VMs' log bundle, gathered from the BOSH Director with the command below:

    # bosh logs -d <TAS_DEPLOYMENT_ID> mysql

  4. Gather Binlogs and affected tables from an SSH to the FAILED node:
    • If gathering this after running the corrective actions above, the $DATADIR noted below will be: /var/vcap/store/pxc-mysql-backup 
    • $DATADIR/GRA_*.log (binlogs of failed Galera transactions)
    • $DATADIR/*/*.ibd (affected tables only, on a failed node, if possible)
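
To bundle the GRA_*.log files for upload, something like the following could be used. This is a minimal sketch. The function name collect_gra_logs and the output path are hypothetical, and it assumes GRA file names contain no whitespace.

```shell
# Archive the GRA_*.log files found directly under DATADIR into OUT (a .tar.gz path).
collect_gra_logs() {
    datadir=$1
    out=$2
    # List matching file names relative to DATADIR; empty when none exist.
    files=$(cd "$datadir" && ls GRA_*.log 2>/dev/null || true)
    if [ -z "$files" ]; then
        echo "no GRA_*.log files in $datadir" >&2
        return 1
    fi
    # Word splitting of $files is intentional: one archive member per file name.
    tar -czf "$out" -C "$datadir" $files
}
```

After the corrective actions above, a call from a root shell might look like collect_gra_logs /var/vcap/store/pxc-mysql-backup /var/vcap/sys/log/gra_logs.tar.gz, where the output location is only an example.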