VMware VeloCloud SD-WAN DNS cache aging

Article ID: 388606

Products

VMware VeloCloud SD-WAN

Issue/Introduction

When a DNS response passes through an SD-WAN Edge, the Edge intercepts the A/AAAA records in the response and caches them in its DNS cache. The Edge also learns hostname-IP mappings through DPI and inserts them into the same cache, so DPI entries and DNS entries share the DNS cache space. The DNS cache limit is hardcoded based on the device's memory. Customers who see a "DNS cache max limit" event may want to know how entries age out of the cache. This article explains how the aging works.

Environment

VMware VeloCloud SD-WAN edge

Cause

When a DNS response passes through an SD-WAN Edge, the Edge intercepts the A/AAAA records in the response and caches them in its DNS cache. The Edge also honors the TTL in the DNS response:

 

edge:b2-edge1:~# debug.py --dns_name_cache


Total Cache Entries: 1
NAME                    ADDRESS  TTL(s)  SOURCE
.www.zhongyu.com   172.16.100.2     300     DNS
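
When the cache is checked repeatedly, it can help to turn this output into structured records. The following is a minimal sketch that assumes the whitespace-separated column layout shown above (NAME, ADDRESS, TTL(s), SOURCE). The names `CacheEntry` and `parse_dns_name_cache` are illustrative and are not part of the Edge tooling.

```python
from typing import List, NamedTuple


class CacheEntry(NamedTuple):
    name: str
    address: str
    ttl: int
    source: str


def parse_dns_name_cache(output: str) -> List[CacheEntry]:
    """Parse text in the layout shown above into CacheEntry records."""
    entries = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        # Use the first four fields so trailing annotations are ignored.
        name, address, ttl, source = fields[:4]
        try:
            entries.append(CacheEntry(name, address, int(ttl), source))
        except ValueError:
            # The "Total Cache Entries" line and the header row have no
            # integer in the TTL position, so they are skipped here.
            continue
    return entries


sample = """Total Cache Entries: 1
NAME                    ADDRESS  TTL(s)  SOURCE
.www.zhongyu.com   172.16.100.2     300     DNS
"""
entries = parse_dns_name_cache(sample)
# Entries whose TTL has already gone negative but are not yet cleaned up.
expired = [e for e in entries if e.ttl < 0]
```

Filtering on `ttl < 0` gives the entries that are waiting for the next cleanup run.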

 

Once a cache entry is inserted, its TTL decreases by 1 every second. When the TTL reaches 0, the SD-WAN Edge does not delete the entry immediately. Instead, the TTL keeps decreasing to -1, -2, and so on:

Total Cache Entries: 1
NAME                    ADDRESS  TTL(s)  SOURCE
.www.zhongyu.com   172.16.100.2    -1     DNS
Total Cache Entries: 1
NAME                    ADDRESS  TTL(s)  SOURCE
.www.zhongyu.com   172.16.100.2    -2     DNS
Total Cache Entries: 1
NAME                    ADDRESS  TTL(s)  SOURCE
.www.zhongyu.com   172.16.100.2    -4     DNS
Total Cache Entries: 1
NAME                    ADDRESS  TTL(s)  SOURCE
.www.zhongyu.com   172.16.100.2    -5     DNS
Total Cache Entries: 1
NAME                    ADDRESS  TTL(s)  SOURCE
.www.zhongyu.com   172.16.100.2    -6     DNS
Total Cache Entries: 1
NAME                    ADDRESS  TTL(s)  SOURCE
.www.zhongyu.com   172.16.100.2    -8     DNS

How, then, does the SD-WAN Edge clean up cache entries? A periodic DNS cache cleanup timer runs with a default value of 600s. Each time the timer expires, the Edge deletes all cache entries with a negative TTL. The actual survival time of a cache entry is therefore its initial TTL plus a further 0-599 seconds, depending on when the entry was inserted relative to the timer. For example, a cache entry inserted with an initial TTL of 300s survives for 300-899 seconds.

Total Cache Entries: 1
NAME                    ADDRESS  TTL(s)  SOURCE
.www.zhongyu.com   172.16.100.2    -584     DNS


Total Cache Entries: 1
NAME                    ADDRESS  TTL(s)  SOURCE
.www.zhongyu.com   172.16.100.2    -585     DNS

Total Cache Entries: 1
NAME                    ADDRESS  TTL(s)  SOURCE
.www.zhongyu.com   172.16.100.2    -586     DNS


Total Cache Entries: 1
NAME                    ADDRESS  TTL(s)  SOURCE
.www.zhongyu.com   172.16.100.2    -588     DNS <----cleanup timer expires at this time

Total Cache Entries: 0 <----cache entry with negative TTL is deleted
NAME   ADDRESS  TTL(s)  SOURCE
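
The interaction between the TTL and the cleanup timer can be modelled with a short sketch. It assumes that the timer fires at fixed 600s intervals and that an entry is removed at the first run at which its TTL is strictly negative. The function `survival_time` and its `insert_offset` parameter are hypothetical names for this model, not the Edge's implementation. Under this strict-inequality model the bounds come out one second later than the 300-899 range quoted above, so treat the exact limits as approximate.

```python
CLEANUP_INTERVAL_S = 600  # default cleanup timer value from the text above


def survival_time(initial_ttl: int, insert_offset: int,
                  interval: int = CLEANUP_INTERVAL_S) -> int:
    """Return how many seconds an entry stays in the cache in this model.

    insert_offset is the number of seconds between the previous cleanup
    run and the insertion of the entry (0 <= insert_offset < interval).
    """
    if not 0 <= insert_offset < interval:
        raise ValueError("insert_offset must be within one timer interval")
    # Time from insertion until the first cleanup run.
    elapsed = interval - insert_offset
    # Keep waiting for further runs until the TTL is negative at a run.
    while initial_ttl - elapsed >= 0:
        elapsed += interval
    return elapsed


# Entry with TTL 300 inserted 312 s after the previous cleanup run.
example = survival_time(300, 312)
```

In this model, `survival_time(300, 312)` gives 888 seconds, which corresponds to removal at TTL -588 as in the capture above, assuming that entry started with a TTL of 300.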

 

Below is a brief summary:

1. When a cache entry's TTL becomes negative, the entry still works for hostname-based business policies.

2. When the SD-WAN Edge learns the same hostname-IP mapping again, it refreshes the TTL immediately, regardless of whether the existing entry's TTL was positive or negative:

Total Cache Entries: 1
NAME                    ADDRESS  TTL(s)  SOURCE
.www.zhongyu.com   172.16.100.2    -174     DNS
Total Cache Entries: 1
NAME                    ADDRESS  TTL(s)  SOURCE
.www.zhongyu.com   172.16.100.2    -176     DNS
Total Cache Entries: 1
NAME                    ADDRESS  TTL(s)  SOURCE
.www.zhongyu.com   172.16.100.2    -177     DNS
Total Cache Entries: 1
NAME                    ADDRESS  TTL(s)  SOURCE
.www.zhongyu.com   172.16.100.2    -178     DNS
Total Cache Entries: 1
NAME                    ADDRESS  TTL(s)  SOURCE
.www.zhongyu.com   172.16.100.2     298     DNS <----Refresh TTL
Total Cache Entries: 1
NAME                    ADDRESS  TTL(s)  SOURCE
.www.zhongyu.com   172.16.100.2     297     DNS

3. When a hostname maps to multiple IPs and the Edge learns one of those hostname-IP mappings again, it refreshes only the TTL of that specific IP.

4. DPI-learned cache entries have a TTL of 86400s by default. They age in the same way as DNS-sourced entries.
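
The four rules above can be illustrated with a toy model. This is a sketch under stated assumptions: the cache is keyed by the (hostname, IP) pair, time is passed in explicitly as `now`, and the class `DnsCacheModel`, its methods, and the example hostname and addresses are all hypothetical. It does not reflect the Edge's internal data structures.

```python
from typing import Dict, List, Tuple

DPI_DEFAULT_TTL_S = 86400  # default TTL for DPI-learned entries, per rule 4


class DnsCacheModel:
    """Toy model of the aging rules summarised above (not the Edge's code)."""

    def __init__(self) -> None:
        # (hostname, ip) -> absolute time in seconds at which the TTL reaches 0
        self._expiry: Dict[Tuple[str, str], int] = {}

    def learn(self, hostname: str, ip: str, ttl: int, now: int) -> None:
        # Rules 2 and 3: learning a mapping again refreshes only that
        # hostname-IP pair, whether its current TTL is positive or negative.
        self._expiry[(hostname, ip)] = now + ttl

    def ttl(self, hostname: str, ip: str, now: int) -> int:
        # Remaining TTL; it goes negative once the entry has expired.
        return self._expiry[(hostname, ip)] - now

    def resolve(self, hostname: str) -> List[str]:
        # Rule 1: entries with a negative TTL are still usable until cleanup.
        return sorted(ip for (name, ip) in self._expiry if name == hostname)

    def cleanup(self, now: int) -> None:
        # Periodic cleanup: drop every entry whose TTL is negative.
        self._expiry = {k: t for k, t in self._expiry.items() if t - now >= 0}


cache = DnsCacheModel()
cache.learn("www.example.com", "192.0.2.1", ttl=300, now=0)
cache.learn("www.example.com", "192.0.2.2", ttl=300, now=0)
# The first mapping is learned again at t=478; only its TTL is refreshed.
cache.learn("www.example.com", "192.0.2.1", ttl=300, now=478)
```

At t=480 the refreshed address has a TTL of 298 while the other address is at -180 and remains usable until the next cleanup run removes it. A DPI-learned entry would follow the same path with `ttl=DPI_DEFAULT_TTL_S`.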

 

 

 

Resolution

To flush the DNS cache manually, run the following command on the Edge:

debug.py --dns_ip_cache_flush

 

Alternatively, flush the DNS cache through Remote Diagnostics.

VMware VeloCloud SD-WAN has enhanced DNS cache management since R452-20240125-GA, tracked by bug#126520. When the DNS cache is full, the Edge now reclaims the least recently used (LRU) entry, provided that entry has not been used in the last 5 minutes, to make room for the new incoming entry. This enhancement increases the success rate of hostname-based business policies.
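
The reclaim rule can be sketched as a small LRU cache with an idle threshold. The class `LruDnsCache`, its methods, and the example hostnames are hypothetical. The sketch also assumes that an insert is rejected when the cache is full and the least recently used entry was used within the last 5 minutes. The article does not state what the Edge does in that case.

```python
from collections import OrderedDict
from typing import Optional

IDLE_THRESHOLD_S = 300  # "hasn't been used in the last 5 minutes"


class LruDnsCache:
    """Toy model of LRU reclaim with an idle threshold (illustrative only)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        # hostname -> (ip, last_used), ordered from least to most recently used
        self._entries: "OrderedDict[str, tuple]" = OrderedDict()

    def lookup(self, hostname: str, now: int) -> Optional[str]:
        if hostname not in self._entries:
            return None
        ip, _ = self._entries[hostname]
        self._entries[hostname] = (ip, now)
        self._entries.move_to_end(hostname)  # mark as most recently used
        return ip

    def insert(self, hostname: str, ip: str, now: int) -> bool:
        """Insert an entry; return False if nothing can be reclaimed."""
        if hostname not in self._entries and len(self._entries) >= self.capacity:
            oldest, (_, last_used) = next(iter(self._entries.items()))
            if now - last_used < IDLE_THRESHOLD_S:
                # Assumption: the LRU entry was used too recently to reclaim.
                return False
            del self._entries[oldest]
        self._entries[hostname] = (ip, now)
        self._entries.move_to_end(hostname)
        return True


cache = LruDnsCache(capacity=2)
cache.insert("a.example.com", "192.0.2.1", now=0)
cache.insert("b.example.com", "192.0.2.2", now=10)
cache.lookup("a.example.com", now=100)  # b.example.com is now the LRU entry
too_early = cache.insert("c.example.com", "192.0.2.3", now=200)
reclaimed = cache.insert("c.example.com", "192.0.2.3", now=400)
```

In this model the insert at t=200 fails because the LRU entry was used 190 seconds earlier, while the insert at t=400 succeeds by reclaiming it.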