When everything breaks, DNS is probably the reason – and the fix.

Homelab DNS Troubleshooting


DNS failures are deceptive. Everything looks broken, but most issues are straightforward once you know where to look. Every issue here is something I’ve run into personally.

Misconfigured Records

The server is working - the data is wrong.

Symptoms

  • Wrong Destination: Resolves, but to the wrong IP.
  • Partial resolution: One record works, another doesn’t.

Common Causes

  • Typos: A single digit wrong in an A record.
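  • Missing Trailing Dots: A fully qualified name written without a trailing dot is treated as relative, so BIND appends the zone name again (docs.home.foundry81.com becomes docs.home.foundry81.com.home.foundry81.com.). This is one of the most common “everything looks right” mistakes in BIND.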
  • CNAME Loops: A CNAME pointing to a record that points back to the original CNAME.
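The trailing-dot rule is easier to see with a toy model. This is a minimal sketch in shell, not BIND's actual parser: `qualify` is a hypothetical helper that mimics how a name in a zone file is expanded against the zone origin, and the zone name is assumed from the hostnames used in this article.

```shell
# Toy model of how a zone file name is expanded (illustration only,
# not BIND's parser). `qualify` is a hypothetical helper.
#   $1 = name as written in the zone file
#   $2 = zone origin, including its trailing dot
qualify() {
    case "$1" in
        *.) printf '%s\n' "$1" ;;          # ends in a dot: already absolute
        *)  printf '%s.%s\n' "$1" "$2" ;;  # relative: the origin is appended
    esac
}

qualify "docs" "home.foundry81.com."
qualify "docs.home.foundry81.com." "home.foundry81.com."
qualify "docs.home.foundry81.com" "home.foundry81.com."   # the classic mistake
```

The first two calls print the same name. The third prints the doubled name, which is exactly what ends up in the zone when the dot is missing.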

Solutions

Query the authoritative server directly:

dig @ns1.home.foundry81.com docs.home.foundry81.com

Serial Not Updated

This is a synchronization failure between primary and secondary servers.

Symptoms

  • Inconsistent Answers: Some users (hitting the primary) see the new IP, while others (hitting the secondary) see the old IP.
  • “Ghost” Data: Changes exist on the primary but not the secondary.

Common Causes

  • Stale Serial Number: You updated a record in the zone file but forgot to increment the serial number in the SOA record. (I do this all the time.)
  • Why It Matters: Secondary servers only pull updates when the serial increases. If the number doesn’t go up, nothing happens.

Solutions

Increment Serial: Bump the SOA serial with every zone edit. Every time. No exceptions.

Force Transfer: On the secondary, use the BIND control utility to force a fresh transfer regardless of the serial: rndc retransfer <zone>.

Log Analysis: Check the secondary server’s logs for zone transfer errors to rule out a routing or firewall issue. Zone transfers run over TCP port 53, which is easy to block by accident.
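Forgetting the serial is easier to avoid if you never compute it by hand. The sketch below assumes the common YYYYMMDDNN serial convention (date plus a two-digit revision), which this article does not prescribe. `next_serial` is a hypothetical helper, not a BIND tool.

```shell
# next_serial CURRENT [TODAY]
# Prints the next SOA serial, ASSUMING the YYYYMMDDNN convention.
# TODAY defaults to the current date and is only a parameter to make
# the function easy to try out.
next_serial() (
    current=$1
    today=${2:-$(date +%Y%m%d)}
    base="${today}00"
    if [ "$current" -lt "$base" ]; then
        echo "$base"              # first change today: start at revision 00
    else
        echo $((current + 1))     # already changed today: bump the revision
    fi
)

next_serial 2024010101 20240101   # same day: prints 2024010102
next_serial 2023123105 20240101   # new day:  prints 2024010100
```

Whatever scheme you use, the only rule the secondaries care about is that the new number is higher than the old one.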

Propagation Delays

“Propagation” is really cache expiration across resolvers. Nothing is pushed out; old answers simply age out.

Symptoms

  • Inconsistent results: Different answers depending on where you query.

Common Causes

  • High TTL (Time to Live): Resolvers keep serving the cached answer until the old record’s TTL runs out.

Solutions

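Lower The TTL: Lower the TTL before making changes, not after. Resolvers that already cached the record keep honoring the old TTL, so lower it at least one full old-TTL period ahead of the change.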
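To see how long a cached answer has left, read the TTL column that dig prints. This is a rough sketch: `answer_ttl` is a hypothetical helper, and 192.0.2.53 is a placeholder for whatever caching resolver your clients actually use.

```shell
# answer_ttl NAME SERVER
# Prints the TTL (in seconds) of each answer record SERVER returns.
# `dig +noall +answer` prints lines like: name. TTL IN A address
answer_ttl() {
    dig +noall +answer "$1" @"$2" | awk '{ print $2 }'
}

# Usage (not run here):
#   answer_ttl docs.home.foundry81.com ns1.home.foundry81.com  # configured TTL
#   answer_ttl docs.home.foundry81.com 192.0.2.53              # time left in that cache
```

Against the authoritative server the number stays constant. Against a caching resolver it counts down on each query, and the old answer is only replaced once it reaches zero.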

Caching Issues

Caching speeds things up - until it doesn’t.

Symptoms

  • Local Stale Data: You can confirm that the record is correct on the server, but a specific machine or browser still sees the old IP.
  • “Everyone but Me”: Only one device is wrong - everything else resolves correctly. If one device is wrong, it’s almost never the DNS server itself.

Common Causes

  • OS Caching: Windows and macOS maintain a local DNS cache.
  • Browser Caching: Chrome and Firefox often cache DNS internally to speed up page loads.

Solutions

Flush the OS cache:

  • Windows: ipconfig /flushdns
  • macOS: sudo killall -HUP mDNSResponder

Clear Browser Cache: Use “Incognito Mode” or clear the browser’s internal DNS cache (chrome://net-internals/#dns in Chrome, about:networking#dns in Firefox).
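If you hop between machines, a small lookup saves remembering the flush commands. This is a sketch, not a complete list: `flush_cmd` is a hypothetical helper, the Linux entry is my addition and assumes the machine uses systemd-resolved, and the Windows entry only matches Git Bash, MSYS, or Cygwin shells. It prints the command rather than running it.

```shell
# flush_cmd [OS]
# Prints the DNS cache flush command for an OS name as reported by
# `uname -s`. Defaults to the current machine.
flush_cmd() {
    case "${1:-$(uname -s)}" in
        Darwin)               echo "sudo killall -HUP mDNSResponder" ;;
        Linux)                echo "resolvectl flush-caches" ;;  # assumes systemd-resolved
        MINGW*|MSYS*|CYGWIN*) echo "ipconfig /flushdns" ;;
        *) echo "unknown OS: flush the cache manually" >&2; return 1 ;;
    esac
}

flush_cmd Darwin   # prints: sudo killall -HUP mDNSResponder
```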

Slow Queries

Slow queries are harder to diagnose because everything eventually works.

Symptoms

  • Delayed Resolution: The browser status bar says “Looking up host…” for 2-5 seconds before the page suddenly loads.
  • Timeouts: Some requests fail entirely with a “DNS Timeout” error.

Common Causes

  • Dead Forwarders: The server waits for a dead forwarder to time out before trying the next one. Timeouts feel like slowness, but they’re usually failure + retry.
  • Recursive Loops: Two DNS servers are configured to forward to each other, creating a loop.
  • Poor Server Resources: The DNS server is running out of RAM or CPU, causing delays in processing requests.

Solutions

  • Optimize Forwarders: Ensure forwarders are fast and reliable, e.g. Quad9 (9.9.9.9).
  • Check Timeout Settings: Adjust the timeout and retry intervals in named.conf.
  • Monitor Resource Usage: Use top or htop to ensure the named process isn’t hitting CPU or memory limits.
  • Enable Caching: Ensure that the resolver has enough memory allocated to its cache (max-cache-size in BIND) to reduce the need for recursive lookups.
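A quick way to spot a dead or slow forwarder is to time each one separately. The sketch below is one possible approach: `query_ms` and `time_forwarders` are hypothetical helpers, and the 2-second timeout with a single try is an arbitrary choice so that a dead server fails fast.

```shell
# query_ms: reads dig output on stdin and prints the reported query
# time in milliseconds (from the ";; Query time: N msec" line).
query_ms() {
    awk '/Query time:/ { print $4 }'
}

# time_forwarders NAME SERVER...
# Queries NAME against each SERVER once and reports how long it took.
time_forwarders() (
    name=$1; shift
    for server in "$@"; do
        ms=$(dig +time=2 +tries=1 "$name" @"$server" | query_ms)
        if [ -n "$ms" ]; then
            echo "$server: $ms ms"
        else
            echo "$server: no response within 2s"
        fi
    done
)

# Usage (not run here), with 192.0.2.1 standing in for a second forwarder:
#   time_forwarders grafana.home.foundry81.com 9.9.9.9 192.0.2.1
```

A forwarder that reports no response here is the one making every uncached lookup wait.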

Tools

When diagnosing DNS issues, you need visibility into how queries are being answered. Two tools are essential:

dig (Domain Information Groper)

dig is the most powerful and precise DNS troubleshooting tool available. It allows you to:

  • Query specific servers directly
  • See full responses
  • Inspect TTL and caching
  • Debug propagation and replication

Example:

dig @ns1.home.foundry81.com grafana.home.foundry81.com

This bypasses all intermediate resolvers and asks your authoritative server directly.

If you want to understand what DNS is actually doing, use dig.

nslookup

nslookup is simpler and more accessible.

It allows you to:

  • Perform basic DNS queries
  • Test name resolution
  • Query specific servers

Example:

nslookup grafana.home.foundry81.com 192.168.122.15

While it lacks the depth of dig, it’s quick, accessible, and useful for basic checks.

Which Should You Use?

  • Use dig when you need detail and accuracy
  • Use nslookup when you need a quick answer

If you’re troubleshooting a real issue, start with dig.

Troubleshooting Workflow

When DNS breaks, don’t guess - follow a process.

1. Check if the record exists

Query the authoritative server directly:

dig @ns1.home.foundry81.com service.home.foundry81.com

If it’s not here, it doesn’t exist.

2. Query the correct server

Verify which server you’re querying. Assumptions here waste the most time. Clients may still be using:

  • old DNS settings
  • cached results
  • a different resolver entirely
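On a Linux or other Unix-like client, the configured resolvers can be read straight from the resolver config. This is a minimal sketch assuming the client uses /etc/resolv.conf; `configured_resolvers` is a hypothetical helper.

```shell
# configured_resolvers [FILE]
# Prints the nameserver addresses listed in a resolv.conf-style file.
# FILE defaults to /etc/resolv.conf.
configured_resolvers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Usage (not run here):
#   configured_resolvers
```

If this prints 127.0.0.53, the machine is using the systemd-resolved stub, and `resolvectl status` shows the upstream servers behind it. dig also reports which server answered on the `;; SERVER:` line of its output.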

3. Compare answers across servers

Query:

  • primary
  • secondary
  • client-configured resolver
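Asking each of these servers the same question takes one loop. This is a sketch under assumptions: `compare_answers` is a hypothetical helper, and in the usage line ns2.home.foundry81.com and 192.0.2.53 are placeholders for your secondary and your client's resolver.

```shell
# compare_answers NAME SERVER...
# Prints the answer each SERVER gives for NAME, one server per line.
# Answers are sorted so multi-record replies compare cleanly.
compare_answers() (
    name=$1; shift
    for server in "$@"; do
        answer=$(dig +short "$name" @"$server" | sort | paste -s -d ' ' -)
        printf '%s -> %s\n' "$server" "${answer:-(no answer)}"
    done
)

# Usage (not run here):
#   compare_answers docs.home.foundry81.com \
#       ns1.home.foundry81.com ns2.home.foundry81.com 192.0.2.53
```

If every line shows the same addresses, the servers agree and the problem is further downstream.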

Differences usually mean:

  • replication issues
  • stale zones
  • serial number problems
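Serial problems in particular can be confirmed by comparing the SOA serial on both servers. The helpers below are a hypothetical sketch: `soa_serial` and `check_sync` are my names, and the zone name and ns2 hostname in the usage line are assumed from the hostnames used in this article.

```shell
# soa_serial ZONE SERVER
# Prints the SOA serial SERVER reports for ZONE. `dig +short ZONE SOA`
# prints one line: mname rname serial refresh retry expire minimum
soa_serial() {
    dig +short "$1" SOA @"$2" | awk '{ print $3 }'
}

# check_sync ZONE PRIMARY SECONDARY
# Succeeds only if both servers report the same, non-empty serial.
check_sync() (
    p=$(soa_serial "$1" "$2")
    s=$(soa_serial "$1" "$3")
    if [ -n "$p" ] && [ "$p" = "$s" ]; then
        echo "in sync: serial $p"
    else
        echo "out of sync: primary=${p:-none} secondary=${s:-none}"
        exit 1
    fi
)

# Usage (not run here):
#   check_sync home.foundry81.com ns1.home.foundry81.com ns2.home.foundry81.com
```

A lower serial on the secondary means the transfer has not happened yet. Equal serials with different answers means a record changed without the serial being bumped.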

4. Eliminate caching

If everything looks correct on the server but wrong on the client:

  • flush OS cache
  • test in incognito
  • try another device

DNS caching is often the culprit when “everything looks right.” Always prove it’s not cache before going deeper.

5. Check TTL and Timing

If changes aren’t showing up:

  • the old record’s TTL may not have expired yet
  • resolvers may still be serving cached data

Follow this process and DNS stops being guesswork.

DNS has a reputation for being unpredictable, but most of that comes from not being able to see what it’s doing. Once you know where to look - the authoritative server, the cache, the path a query takes - it becomes far more mechanical than mysterious.

DNS is one of those systems that fades into the background when it’s working and takes the blame when it’s not. The difference now is that you can see what it’s doing - and more importantly, why. And with that, everything built on top of it becomes a lot easier to trust.
