Windows Applications Fail Startup with "startup health check never passed" After Upgrading to Tanzu Platform for Cloud Foundry Windows 10.2.9

Article ID: 438877

Products

VMware Tanzu Platform - Cloud Foundry

Issue/Introduction

After upgrading Tanzu Platform for Cloud Foundry Windows to version 10.2.9, or following a Windows stemcell upgrade, applications fail to start.

The following error is observed in the application logs:

 [CELL/0] ERR Failed after xx.xx.xx: startup health check never passed.

[HEALTH/0] [ERR] instance proxy failed to start: Timed out after 2m0s (60 attempts) waiting for startup check to succeed: failed to make TCP connection to <IP>:<PORT>: timed out after 1.00 seconds

This issue typically occurs during a cf push, application restage, or when a stemcell upgrade triggers a container restart.
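To confirm that an application is hitting this specific failure, its recent logs can be searched for the two messages shown above. The following is a minimal sketch; the function name has_startup_check_failure is hypothetical, and APP_NAME is a placeholder for your application name.

```shell
# Returns 0 when the log text on stdin contains either startup check failure
# message quoted in this article; returns non-zero otherwise.
has_startup_check_failure() {
  grep -q -e "startup health check never passed" \
          -e "waiting for startup check to succeed"
}

# Example usage (requires a logged-in cf CLI):
#   cf logs APP_NAME --recent | has_startup_check_failure && echo "symptom present"
```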

Environment

  • Product: Tanzu Platform for Cloud Foundry Windows (formerly TAS for Windows)
  • Version: 10.2.9
  • Component: Windows Diego Cell / Garden-Windows
  • Platform: Windows Server Stemcells

Cause

The error indicates that the startup health check performed by the Diego cell did not receive a successful response from the application within the configured timeout period.

In version 10.2.9, changes to the Diego cell or underlying networking logic may result in stricter port-probing. If an application takes longer to bind to its port than the default 60-second timeout, or if the platform attempts to probe multiple ports that are not actively listening, the instance is marked as unhealthy and terminated.
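The retry behavior reported in the error message (repeated TCP connection attempts, each limited to 1 second) can be illustrated with a small loop. This is only a sketch of the general technique, not the Diego health check implementation. The function names, the default of 60 attempts, and the 2-second interval are assumptions inferred from the log line above, and the connection attempt assumes bash and the coreutils timeout command are available.

```shell
# One TCP connection attempt limited to 1 second.
# Assumes bash (for /dev/tcp) and the coreutils `timeout` command.
try_connect() {
  timeout 1 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Retry until the port accepts a connection or the attempts run out.
# Usage: wait_for_port HOST PORT [ATTEMPTS] [INTERVAL_SECONDS]
wait_for_port() {
  host="$1"; port="$2"; attempts="${3:-60}"; interval="${4:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if try_connect "$host" "$port"; then
      echo "port $port reachable after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "port $port not reachable after $attempts attempts" >&2
  return 1
}
```

An application that binds to its port only after the loop has exhausted its attempts would be reported as failed, which is the situation described above.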

Resolution

To resolve this, change the health check type from port (the default) to process and increase the startup timeout. This allows the platform to verify that the Windows process is running without waiting for a specific network response that may be delayed during initialization.

Option 1: Update via Application Manifest

Add or update the following properties in your manifest.yml:

applications:
- name: my-windows-app
  health-check-type: process
  timeout: 180  # Increase timeout to 180 seconds
 

Option 2: Update via CF CLI

If you prefer to apply the change to an existing application without a manifest update:

  1. Set the health check type: cf set-health-check APP_NAME process
  2. Restart the application so the new health check type takes effect: cf restart APP_NAME
  3. (Optional) Increase the start timeout during the next push: cf push APP_NAME -t 180
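The CLI steps can be wrapped in a small helper that also confirms the resulting setting. This is a minimal sketch, assuming a logged-in cf CLI that targets the correct org and space; the function name apply_process_health_check is hypothetical.

```shell
# Switch an app to the process health check, restart it so the change
# takes effect, then print the health check type now in use.
# Usage: apply_process_health_check APP_NAME
apply_process_health_check() {
  app="$1"
  cf set-health-check "$app" process &&
    cf restart "$app" &&
    cf get-health-check "$app"
}
```

For example, apply_process_health_check my-windows-app runs the three commands in order and stops at the first one that fails.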

 

Notes:

Review the following breaking change, documented under the section "Change: Apps are no longer accessible via the Diego Cell IP and Diego Cell host port by default":

https://techdocs.broadcom.com/us/en/vmware-tanzu/platform/elastic-application-runtime/10-2/eart/breaking-changes.html

 

If changing the health check type from port (the default) to process is not acceptable, and the application still fails to start after the timeout is increased to 180 seconds, check whether cybersecurity software such as CrowdStrike is installed.

In one reported case, uninstalling CrowdStrike resolved the port health check failure.

 
