Resolving a fully qualified domain name (FQDN) from within an app container fails with the error message:
connection timed out; no servers could be reached
DNS resolution succeeds on the first attempt but fails on subsequent attempts because of a problem with the cache handler.
The problematic FQDN maps to numerous IP addresses (20+), so the response payload is greater than 512 bytes, the maximum size of a DNS message over UDP unless EDNS is used.
For example:
$ nslookup fqdn.example.com
name: fqdn.example.com
address: 10.###.###.##
name: fqdn.example.com
address: 10.###.###.##
...
name: fqdn.example.com
address: 10.###.###.##

$ nslookup fqdn.example.com
connection timed out; no servers could be reached
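To get a feel for when a response with many A records crosses the 512-byte mark, the arithmetic can be sketched as follows. This is a rough estimate only: the function names are illustrative, the response is assumed to carry nothing but A records, and whether the server compresses repeated names is an assumption exposed as a parameter. Real responses may be larger.

```python
# Rough size estimate for a DNS response that carries only A records.
# Assumption: no authority or additional records are included.

HEADER_BYTES = 12          # fixed-size DNS header
CLASSIC_UDP_LIMIT = 512    # maximum DNS-over-UDP message size without EDNS


def encoded_name_length(name: str) -> int:
    """Wire length of a name: one length byte per label plus the root byte."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    return sum(1 + len(label) for label in labels) + 1


def estimate_a_response_size(name: str, record_count: int,
                             compressed: bool = True) -> int:
    """Estimate the response size in bytes for `record_count` A records."""
    question = encoded_name_length(name) + 4  # name + QTYPE + QCLASS
    # Owner name: a 2-byte pointer when compressed, else the full name.
    owner = 2 if compressed else encoded_name_length(name)
    # Fixed part per record: TYPE(2) + CLASS(2) + TTL(4) + RDLENGTH(2) + IPv4(4)
    per_record = owner + 14
    return HEADER_BYTES + question + per_record * record_count


def exceeds_classic_limit(name: str, record_count: int,
                          compressed: bool = True) -> bool:
    """True when the estimate is larger than the 512-byte UDP limit."""
    size = estimate_a_response_size(name, record_count, compressed)
    return size > CLASSIC_UDP_LIMIT
```

The number of records at which the limit is crossed depends on the length of the name and on whether names are compressed, so the estimate should be run with the real FQDN rather than the placeholder used in this article.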
Product Version: 2.5
This is a known issue with bosh-dns: its cache handler does not correctly handle truncated responses (the TC flag) or EDNS (Extension Mechanisms for DNS).
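For readers unfamiliar with the two mechanisms involved, the following minimal sketch shows where they live in a DNS message: the TC bit in the header flags, and the EDNS OPT pseudo-record that advertises a larger UDP payload size. It is not taken from bosh-dns; the function names and the query ID are hypothetical and chosen only for illustration.

```python
import struct
from typing import Optional

TC_FLAG = 0x0200   # truncation bit in the 16-bit header flags field
RD_FLAG = 0x0100   # recursion desired
TYPE_A = 1
TYPE_OPT = 41      # EDNS pseudo-record type
CLASS_IN = 1


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a root byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name: str, query_id: int = 0x1234,
                edns_payload_size: Optional[int] = None) -> bytes:
    """Build an A query; optionally append an EDNS OPT record."""
    arcount = 1 if edns_payload_size else 0
    header = struct.pack("!HHHHHH", query_id, RD_FLAG, 1, 0, 0, arcount)
    question = encode_name(name) + struct.pack("!HH", TYPE_A, CLASS_IN)
    opt = b""
    if edns_payload_size:
        # OPT record: root name, TYPE=41, CLASS holds the UDP payload size,
        # TTL holds extended RCODE/version/flags (all zero here), RDLENGTH=0.
        opt = b"\x00" + struct.pack("!HHIH", TYPE_OPT, edns_payload_size, 0, 0)
    return header + question + opt


def is_truncated(message: bytes) -> bool:
    """Return True when the TC bit is set in a DNS message header."""
    if len(message) < 12:
        raise ValueError("DNS message is shorter than the 12-byte header")
    (flags,) = struct.unpack("!H", message[2:4])
    return bool(flags & TC_FLAG)
```

A client that receives a response with the TC bit set is expected to retry the query over TCP, or to have advertised a larger payload size through EDNS in the first place. A resolver or cache that mishandles either path can leave the client without a usable answer.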
The recommended resolution is to upgrade bosh-dns to version 1.16 or higher. The latest patched releases of Operations (Ops) Manager 2.6, 2.7, and 2.8+ contain a bosh-dns release with the fix for this issue, so upgrading to the latest patch of any of these releases resolves it.