Unable to Schedule GemFire Server and Locator Pods on Specific Worker Nodes in Kubernetes

Article ID: 404902

Updated On:

Products

VMware Tanzu Data Suite

Issue/Introduction

The user is deploying VMware GemFire (v10.x) in a Kubernetes environment.
Attempts to schedule server and locator pods on specific worker nodes by setting node affinity and tolerations in the GemFireCluster spec have failed.

Environment

VMware Tanzu GemFire on Kubernetes

Cause

In the original configuration, the GemFireCluster spec used the template field under locators and servers to define pod specifications. However, affinity and tolerations are only supported when specified inside the overrides.statefulSet.spec.template.spec path for both locators and servers.

Resolution

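To apply node affinity and tolerations correctly in a GemFireCluster deployment, define them in the overrides.statefulSet block. The working configuration below restricts locator and server pods to nodes that carry both the node-role.kubernetes.io/worker label and the workload-type=gemfire label. It also lets those pods tolerate the workload-type=gemfire:NoSchedule taint, and the node.kubernetes.io/disk-pressure NoExecute taint for up to 300 seconds.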

apiVersion: gemfire.vmware.com/v1
kind: GemFireCluster
metadata:
  name: gemfire-cluster-sample
  namespace: test-prod-gemfire
spec:
  image: <registry>/pivotal-gemfire/vmware-gemfire:10.1.2

  locators:
    replicas: 3
    overrides:
      statefulSet:
        spec:
          template:
            spec:
              terminationGracePeriodSeconds: 120
              affinity:
                nodeAffinity:
                  requiredDuringSchedulingIgnoredDuringExecution:
                    nodeSelectorTerms:
                      - matchExpressions:
                          - key: node-role.kubernetes.io/worker
                            operator: Exists
                          - key: workload-type
                            operator: In
                            values: ["gemfire"]
              tolerations:
                - key: "workload-type"
                  operator: "Equal"
                  value: "gemfire"
                  effect: "NoSchedule"
                - key: "node.kubernetes.io/disk-pressure"
                  operator: "Exists"
                  effect: "NoExecute"
                  tolerationSeconds: 300
              containers: []  # Must not be null

  servers:
    replicas: 3
    overrides:
      statefulSet:
        spec:
          template:
            spec:
              terminationGracePeriodSeconds: 120
              affinity:
                nodeAffinity:
                  requiredDuringSchedulingIgnoredDuringExecution:
                    nodeSelectorTerms:
                      - matchExpressions:
                          - key: node-role.kubernetes.io/worker
                            operator: Exists
                          - key: workload-type
                            operator: In
                            values: ["gemfire"]
              tolerations:
                - key: "workload-type"
                  operator: "Equal"
                  value: "gemfire"
                  effect: "NoSchedule"
                - key: "node.kubernetes.io/disk-pressure"
                  operator: "Exists"
                  effect: "NoExecute"
                  tolerationSeconds: 300
              containers: []  # Must not be null
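
The affinity and toleration rules above only take effect if the target worker nodes carry matching labels and taints. As a rough sketch, a node that satisfies this configuration might look like the excerpt below. The node name worker-node-1 is hypothetical, and the taint shown is an assumption inferred from the workload-type toleration in the spec, not something stated in the original case.

```yaml
# Hypothetical excerpt of a worker Node object (for example, from
# "kubectl get node <node-name> -o yaml"). Only the fields relevant to
# scheduling are shown; the node name is illustrative.
apiVersion: v1
kind: Node
metadata:
  name: worker-node-1                      # hypothetical node name
  labels:
    node-role.kubernetes.io/worker: ""     # satisfies the "Exists" expression
    workload-type: gemfire                 # satisfies the "In [gemfire]" expression
spec:
  taints:
    - key: workload-type                   # assumed taint, matched by the "Equal" toleration
      value: gemfire
      effect: NoSchedule
```

Both expressions sit in the same nodeSelectorTerms entry, so a node must carry both labels to be eligible. In practice these labels and taints are usually applied with kubectl label and kubectl taint rather than by editing the Node object directly.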