SCDF Task Pods Not Auto-Deleting in Kubernetes

Article ID: 431547

Updated On:

Products

VMware Tanzu Spring Essentials

Issue/Introduction

In Kubernetes deployments of Spring Cloud Data Flow (SCDF), Pods created by manually launched tasks and composed tasks can accumulate in the cluster indefinitely. Scheduled tasks may behave as expected and retain only a limited number of completed executions, but directly launched tasks are not cleaned up automatically after completion.

Cause

The Pods accumulate because SCDF launches these tasks as bare Pods instead of Kubernetes Jobs. The TTL-based cleanup mechanism (ttlSecondsAfterFinished) is a property of Kubernetes Jobs, so it cannot be applied until Job creation is enabled at the server level. As a result, completed task Pods remain in the cluster.

Resolution

To enable automatic cleanup, SCDF must be configured to create Kubernetes Jobs instead of bare Pods. This is done by setting the following environment variables on the SCDF server:

 
SPRING_CLOUD_DATAFLOW_TASK_PLATFORM_KUBERNETES_ACCOUNTS_DEFAULT_CREATE_JOB=true
SPRING_CLOUD_DATAFLOW_TASK_PLATFORM_KUBERNETES_ACCOUNTS_DEFAULT_TTL_SECONDS_AFTER_FINISHED=60
 

Replace DEFAULT with your specific Kubernetes account name if you are using a custom task platform account.
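As a minimal sketch of how these variables might be applied, the helper below builds both variable names from an account name and prints a kubectl set env command for review. The function name scdf_job_env_cmd, the Deployment name scdf-server, and the namespace scdf are hypothetical; substitute the values used in your installation. The sketch also assumes a simple alphanumeric account name.

```shell
# Hypothetical helper: prints (does not run) a kubectl command that sets the
# two variables on the SCDF server Deployment.
# Assumptions: the caller supplies the real Deployment name and namespace,
# and the account name is a simple alphanumeric name.
scdf_job_env_cmd() {
  deployment="$1"          # assumed Deployment name, e.g. scdf-server
  namespace="$2"           # namespace where the SCDF server runs
  account="${3:-default}"  # task platform account name
  ttl="${4:-60}"           # seconds to keep a finished Job

  # The account name appears upper-cased inside the variable name.
  account_upper=$(printf '%s' "$account" | tr '[:lower:]' '[:upper:]')
  prefix="SPRING_CLOUD_DATAFLOW_TASK_PLATFORM_KUBERNETES_ACCOUNTS_${account_upper}"

  printf 'kubectl set env deployment/%s -n %s %s_CREATE_JOB=true %s_TTL_SECONDS_AFTER_FINISHED=%s\n' \
    "$deployment" "$namespace" "$prefix" "$prefix" "$ttl"
}

# Print the command for the default account so it can be reviewed first.
scdf_job_env_cmd scdf-server scdf default 60
```

Running the printed command changes the Deployment's Pod template, which causes Kubernetes to roll out a new SCDF server Pod. If the server is managed by Helm or another installer, setting the variables through that tool's configuration is likely the safer route, since a direct edit may be overwritten on the next upgrade.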

After applying these settings and restarting the SCDF server, manually launched and composed tasks will create Kubernetes Job resources. Once the task completes, the configured ttlSecondsAfterFinished value will trigger automatic deletion of the Job and its associated Pod after the specified number of seconds.
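To check that the change took effect, one option is to launch a task and then list the Jobs in the task namespace together with their TTL setting. The sketch below wraps that query in a hypothetical helper named show_job_ttls; the namespace passed to it is whatever namespace your SCDF tasks run in.

```shell
# Hypothetical helper: lists Jobs in a namespace with their
# ttlSecondsAfterFinished value, using kubectl's custom-columns output.
show_job_ttls() {
  namespace="$1"
  kubectl get jobs -n "$namespace" \
    -o custom-columns=NAME:.metadata.name,TTL:.spec.ttlSecondsAfterFinished
}

# Example call (namespace name is an assumption):
#   show_job_ttls scdf
```

A Job created after the restart should show the configured value in the TTL column. If no Job appears for a newly launched task, the server is probably still creating bare Pods and the variables should be rechecked.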

Please note that enabling this configuration does not retroactively clean up Pods that have already completed. Existing completed Pods must be removed manually using the following commands:

 
kubectl delete pods -n <namespace> --field-selector=status.phase==Succeeded
kubectl delete pods -n <namespace> --field-selector=status.phase==Failed
 

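Replace <namespace> with the Kubernetes namespace where SCDF tasks are running. Note that these commands delete every Pod in that namespace that is in the Succeeded or Failed phase, not only Pods created by SCDF tasks.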
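Before deleting anything, it may be worth listing the Pods that the field selectors match. The sketch below does this with a hypothetical helper named list_finished_pods, which runs kubectl get pods with the same selectors used by the delete commands.

```shell
# Hypothetical helper: lists the Pods that the delete commands would match,
# one query per phase, without deleting anything.
list_finished_pods() {
  namespace="$1"
  for phase in Succeeded Failed; do
    kubectl get pods -n "$namespace" --field-selector="status.phase==$phase"
  done
}

# Example call (namespace name is an assumption):
#   list_finished_pods scdf
```

If the list contains Pods that do not belong to SCDF tasks, delete the task Pods by name instead of using the phase-wide delete.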