In-Cluster DNS lookup requires a Fully Qualified Domain Name (FQDN) in Windows Clusters

Article ID: 298736



Products

VMware Tanzu Kubernetes Grid Integrated Edition

Issue/Introduction

When a service is queried from within a pod running on a Windows worker node, the DNS lookup fails if only the short service name is used.

The lookup succeeds only when the Fully Qualified Domain Name (FQDN) of that service is provided. Below is an example of a failed lookup:
PS C:\> nslookup kubernetes
Server:  kube-dns.kube-system.svc.cluster.local
Address:  10.220.0.2

*** kube-dns.kube-system.svc.cluster.local can't find kubernetes: Non-existent domain

Below is an example of a successful lookup:
PS C:\> nslookup kubernetes.default.svc.cluster.local
Server:  kube-dns.kube-system.svc.cluster.local
Address:  10.220.0.2

Name:    kubernetes.default.svc.cluster.local
Address:  10.220.0.1

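This issue occurs because no primary DNS suffix is configured on the pod. The ipconfig output from the pod shows only a connection-specific DNS suffix (svc.cluster.local), which does not include the namespace of the service:
PS C:\> ipconfig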

Windows IP Configuration
Ethernet adapter vEthernet (92a23e18805188ef54a89a1d6c7caed0d024355140a15137d8e2ef3d8b4013a3_eth0):

   Connection-specific DNS Suffix  . : svc.cluster.local
   Link-local IPv6 Address . . . . . : fe80::38de:50d6:85c4:544c%25
   IPv4 Address. . . . . . . . . . . : 172.46.2.5
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Default Gateway . . . . . . . . . : 172.46.2.1
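
To illustrate why the short name fails, the following is a minimal Python sketch of suffix-based name expansion. It is a simplified model and not the actual Windows resolver: it assumes the resolver appends each configured suffix to a short name in turn. The function name candidate_names and the known_records dictionary are hypothetical and exist only for this illustration.

```python
def candidate_names(name, search_suffixes):
    """Return the names a resolver would try for a short `name`.

    Simplified model (assumption): every suffix in the search list
    is appended to the short name, in order.
    """
    return [f"{name}.{suffix}" for suffix in search_suffixes]


# Hypothetical record set holding only the service record shown in this article.
known_records = {"kubernetes.default.svc.cluster.local": "10.220.0.1"}

# The pod has only the connection-specific suffix reported by ipconfig.
tried = candidate_names("kubernetes", ["svc.cluster.local"])
resolved = [n for n in tried if n in known_records]

print(tried)     # ['kubernetes.svc.cluster.local']
print(resolved)  # []
```

Under this model, the only name tried is kubernetes.svc.cluster.local, which lacks the default namespace and therefore matches no service record.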


Environment

Product Version: 1.8
OS: Windows

Resolution

Workaround

To work around this problem, use the FQDN of the service when accessing it from pods in the same cluster.

For example:
PS C:\> nslookup kubernetes.default.svc.cluster.local
Server:  kube-dns.kube-system.svc.cluster.local
Address:  10.220.0.2

Name:    kubernetes.default.svc.cluster.local
Address:  10.220.0.1
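
A service FQDN follows the pattern <service>.<namespace>.svc.<cluster domain>. As a minimal sketch, an application could build the name as shown below instead of hard-coding it. The helper service_fqdn is hypothetical, and the default cluster domain cluster.local is taken from the output above; adjust it if the cluster uses a different domain.

```python
def service_fqdn(service, namespace="default", cluster_domain="cluster.local"):
    """Build the in-cluster FQDN of a Kubernetes service.

    `cluster_domain` defaults to the domain seen in this article;
    it is an assumption for any other cluster.
    """
    return f"{service}.{namespace}.svc.{cluster_domain}"


print(service_fqdn("kubernetes"))
# kubernetes.default.svc.cluster.local
print(service_fqdn("kube-dns", namespace="kube-system"))
# kube-dns.kube-system.svc.cluster.local
```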

Alternatively, a DNS suffix search order can be injected into the pods of each application with a postStart lifecycle hook. Here is an example of how this can be achieved:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-win
spec:
  replicas: 1
  progressDeadlineSeconds: 180
  selector:
    matchLabels:
      app: test-win
  template:
    metadata:
      labels:
        app: test-win
    spec:
      containers:
      - name: server
        image: mcr.microsoft.com/windows/servercore:ltsc2019
        ports:
        - containerPort: 80
        command:
        - powershell.exe
        - -command
        - "<#code used from https://gist.github.com/wagnerandrade/5424431#> ; $$listener = New-Object System.Net.HttpListener ; $$listener.Prefixes.Add('http://*:80/') ; $$listener.Start() ; $$callerCounts = @{} ; Write-Host('Listening at http://*:80/') ; while ($$listener.IsListening) { ;$$context = $$listener.GetContext() ;$$requestUrl = $$context.Request.Url ;$$clientIP = $$context.Request.RemoteEndPoint.Address ;$$response = $$context.Response ;Write-Host '' ;Write-Host('> {0}' -f $$requestUrl) ;  ;$$count = 1 ;$$k=$$callerCounts.Get_Item($$clientIP) ;if ($$k -ne $$null) { $$count += $$k } ;$$callerCounts.Set_Item($$clientIP, $$count) ;$$ip=(Get-NetAdapter | Get-NetIpAddress); $$header='<html><body><H1>Windows Container Web Server</H1>' ;$$callerCountsString='' ;$$callerCounts.Keys | % { $$callerCountsString+='<p>IP {0} callerCount {1} ' -f $$ip[1].IPAddress,$$callerCounts.Item($$_) } ;$$footer='</body></html>' ;$$content='{0}{1}{2}' -f $$header,$$callerCountsString,$$footer ;Write-Output $$content ;$$buffer = [System.Text.Encoding]::UTF8.GetBytes($$content) ;$$response.ContentLength64 = $$buffer.Length ;$$response.OutputStream.Write($$buffer, 0, $$buffer.Length) ;$$response.Close() ;$$responseStatus = $$response.StatusCode ;Write-Host('< {0}' -f $$responseStatus)  } ; "
        lifecycle:
          postStart:
            exec:
              command:
              - powershell.exe
              - -command
              - $arguments = @('nicconfig','call','SetDNSSuffixSearchOrder',"('default.svc.cluster.local','svc.cluster.local','cluster.local')") ;wmic $arguments

      nodeSelector:
        kubernetes.io/os: windows
      tolerations:
      - key: "windows"
        operator: "Equal"
        value: "2019"
        effect: "NoSchedule"

Note: The above example relies on the following lifecycle hook, which runs wmic inside the container to set the DNS suffix search order:
lifecycle:
  postStart:
    exec:
      command:
      - powershell.exe
      - -command
      - $arguments = @('nicconfig','call','SetDNSSuffixSearchOrder',"('default.svc.cluster.local','svc.cluster.local','cluster.local')") ;wmic $arguments
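
The first suffix in the hook above is specific to the default namespace, so a workload deployed in another namespace would likely need its own namespace in that position. The following Python sketch, offered under that assumption, generates the suffix list and the parenthesized argument string in the same format the hook passes to wmic. Both function names are hypothetical helpers, not part of any product tooling.

```python
def dns_suffix_search_order(namespace, cluster_domain="cluster.local"):
    """Return the suffix search order for pods in `namespace`.

    Mirrors the three-entry list used in the hook above, with the
    namespace substituted into the first entry.
    """
    return [
        f"{namespace}.svc.{cluster_domain}",
        f"svc.{cluster_domain}",
        cluster_domain,
    ]


def wmic_suffix_argument(suffixes):
    """Format suffixes as the quoted, parenthesized list used in the hook."""
    quoted = ",".join(f"'{s}'" for s in suffixes)
    return f"({quoted})"


print(wmic_suffix_argument(dns_suffix_search_order("default")))
# ('default.svc.cluster.local','svc.cluster.local','cluster.local')
```

For the default namespace the output matches the string used in the hook, which makes it easy to check the result for other namespaces before editing the manifest.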