Fix HostAuthorization blocking internal k8s cluster traffic #239

Merged
ldraney merged 2 commits from fix/host-authorization-internal-ips into main 2026-06-16 12:50:14 +00:00
Owner

Summary

  • PR #232 excluded /up from HostAuthorization, fixing health probes — but all other internal cluster requests arrive with Host: <pod-ip> and get 403'd
  • Production logs are wall-to-wall Blocked hosts: 10.42.0.127:3000, triggering AlertManager
  • Fix: exclude IP-based Host headers from authorization — internal k8s traffic never uses hostnames, so DNS rebinding protection still covers all external traffic through Tailscale

Closes #238

Changes

  • config/environments/production.rb: Broaden the host_authorization exclude lambda to also skip requests whose Host header is an IPv4 address (request.host matches /\A\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\z/)
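
A minimal sketch of what the broadened exclude lambda might look like, runnable outside Rails. The diff itself is not shown on this page, so the exact path comparison (`request.path == "/up"`), the constant names, and the `FakeRequest` stand-in are assumptions for illustration; the regex is the one quoted in the review.

```ruby
# Hypothetical stand-in for ActionDispatch::Request; the lambda only reads #path and #host.
FakeRequest = Struct.new(:path, :host)

# Anchored dotted-quad pattern as quoted in the review of this PR.
IPV4_HOST = /\A\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\z/

# Skip host authorization for the health check path, or when Host is an IPv4 literal.
# The cheap path comparison runs first; the regex only runs if it fails.
EXCLUDE_FROM_HOST_AUTH = lambda do |request|
  request.path == "/up" || request.host.match?(IPV4_HOST)
end

# In config/environments/production.rb this would be wired up roughly as:
#   config.host_authorization = { exclude: EXCLUDE_FROM_HOST_AUTH }

# "/jobs" is a made-up path used only to exercise the non-health-check branch.
puts EXCLUDE_FROM_HOST_AUTH.call(FakeRequest.new("/jobs", "10.42.0.127"))
puts EXCLUDE_FROM_HOST_AUTH.call(FakeRequest.new("/jobs", "landscaping-assistant.tail5b443a.ts.net"))
```

A request whose Host is a pod IP is excluded from the check, while a request carrying the Tailscale hostname still goes through normal host authorization.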

Test Plan

  • Pipeline passes
  • Deploy to production
  • Verify Blocked hosts log spam stops
  • Verify external access still works via landscaping-assistant.tail5b443a.ts.net
  • Verify health probes still pass (pod stays 1/1 Running)

Review Checklist

  • Passed automated review-fix loop
  • No secrets committed
  • No unnecessary file changes
  • Commit messages are descriptive
  • Feature flag needed? No — bug fix for infrastructure config

Related Notes

  • ldraney/landscaping-assistant #238: HostAuthorization blocks internal k8s cluster traffic
  • landscaping-assistant: project
Fix HostAuthorization blocking internal k8s cluster traffic
Some checks are pending
ci/woodpecker/push/woodpecker Pipeline was successful
CI / scan_ruby (pull_request) Waiting to run
CI / scan_js (pull_request) Waiting to run
CI / lint (pull_request) Waiting to run
ci/woodpecker/pr/woodpecker Pipeline was successful
e5e334a301
The /up path exclude from PR #232 fixed health probes, but all other
internal cluster requests arrive with Host: <pod-ip> and get 403'd.
Exclude IP-based Host headers from authorization since internal k8s
traffic never uses hostnames — DNS rebinding protection still covers
all external traffic through Tailscale.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Author
Owner

PR #239 Review

DOMAIN REVIEW

Stack: Ruby on Rails 8.1 (production config), k8s infrastructure concern.

Regex correctness:
The regex /\A\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\z/ is anchored at both ends (\A and \z prevent partial matches) and is applied to request.host, which returns the Host value without the port. A Host header of 10.42.0.127:3000 is therefore matched as 10.42.0.127 -- correct behavior.

Minor note: the regex does not validate octets are 0-255 (e.g., 999.999.999.999 would match). This is acceptable -- no legitimate HTTP client sends impossible IPs as Host headers, and the security boundary here is Tailscale Funnel, not the regex itself.
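
To make the anchoring and the octet caveat concrete, here is a small sketch. `DOTTED_QUAD` repeats the regex from the review; `valid_ipv4_host?` is a hypothetical helper, not part of the PR, showing one way the 0-255 range could be enforced if that were ever wanted.

```ruby
# The pattern under review: four groups of 1-3 digits, anchored with \A and \z.
DOTTED_QUAD = /\A\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\z/

# Hypothetical stricter variant: same shape check, plus each octet must be <= 255.
def valid_ipv4_host?(host)
  return false unless host.match?(DOTTED_QUAD)

  host.split(".").all? { |octet| octet.to_i <= 255 }
end

# Anchoring rejects a hostname that merely starts with an IP-like prefix.
puts DOTTED_QUAD.match?("10.42.0.127.evil.example")  # false
# The shape-only regex accepts out-of-range octets...
puts DOTTED_QUAD.match?("999.999.999.999")           # true
# ...while the stricter helper rejects them.
puts valid_ipv4_host?("999.999.999.999")             # false
puts valid_ipv4_host?("10.42.0.127")                 # true
```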

Security analysis -- DNS rebinding protection:
The key question is whether excluding IP-based Host headers weakens DNS rebinding protection. Answer: no. Per docs/networking.md, all external traffic arrives through Tailscale Funnel, which always sets proper hostname-based Host headers (landscaping-assistant.tail5b443a.ts.net, etc.). The only source of IP-based Host headers is internal k8s traffic (kubelet probes, service-to-pod, pod-to-pod). Excluding these is safe because they never originate from the public internet.

Lambda structure:
Clean short-circuit evaluation. The /up path check runs first (cheapest), IP regex only evaluates if needed. Good.

BLOCKERS

None.

NITS

  1. docs/networking.md line 92 is now stale. It currently reads: "The /up health check endpoint is excluded from host authorization so k8s liveness/readiness probes always pass, even when kubelet sends requests using the pod IP as the Host header." This should be updated to reflect that ALL requests with IP-based Host headers are now excluded, not just /up. The current wording implies /up is the only exclusion, which is no longer true after this PR.

  2. Regex precision (non-blocking, informational only): If you ever wanted stricter validation, you could match RFC 1918 ranges specifically (/\A(10\.|172\.(1[6-9]|2\d|3[01])\.|192\.168\.)/). But the current approach is simpler and just as safe given the Tailscale boundary. Not recommending a change -- just noting the option.
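
For reference, the RFC 1918 option from nit 2 could be sketched as below. The prefix regex is the one quoted above; note that it is anchored only at the start, so on its own it would also accept a hostname such as `10.evil.example`. Combining it with the dotted-quad check, as the hypothetical `private_ipv4_host?` helper does, is an assumption about how it would be used, not something the PR implements.

```ruby
# Shape check from the PR: anchored dotted quad.
DOTTED_QUAD_RE = /\A\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\z/

# Prefix check from the review: 10/8, 172.16/12, 192.168/16. Start-anchored only.
RFC1918_PREFIX = /\A(10\.|172\.(1[6-9]|2\d|3[01])\.|192\.168\.)/

# Hypothetical helper: the host must look like an IPv4 literal AND fall in a private range.
def private_ipv4_host?(host)
  host.match?(DOTTED_QUAD_RE) && host.match?(RFC1918_PREFIX)
end

puts private_ipv4_host?("10.42.0.127")      # true  (pod IP from the production logs)
puts private_ipv4_host?("172.32.0.1")       # false (just outside 172.16/12)
puts private_ipv4_host?("8.8.8.8")          # false (public address)
puts RFC1918_PREFIX.match?("10.evil.example") # true: why the prefix alone is not enough
puts private_ipv4_host?("10.evil.example")  # false
```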

SOP COMPLIANCE

  • PR body has: Summary, Changes, Test Plan, Related -- all present and well-written
  • No secrets committed
  • No unnecessary file changes (single file, tightly scoped)
  • Commit messages are descriptive
  • Scope matches issue #238

PROCESS OBSERVATIONS

  • This is a follow-up to PR #232, which fixed /up but missed other internal traffic. The pattern of incremental fixes is fine for a production-impacting bug (wall-to-wall log spam triggering AlertManager), but the docs gap (nit #1) should be addressed to prevent a third iteration.
  • Test plan is solid -- includes both the positive case (log spam stops) and the negative case (external access still works).
  • Change failure risk: low. The lambda is additive (broader exclusion), and the security boundary (Tailscale Funnel) is unaffected.

VERDICT: APPROVED

Update networking docs to reflect IP-based host authorization exclude
Some checks failed
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
CI / scan_ruby (pull_request) Has been cancelled
CI / scan_js (pull_request) Has been cancelled
CI / lint (pull_request) Has been cancelled
bfb0c712fd
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ldraney deleted branch fix/host-authorization-internal-ips 2026-06-16 12:50:15 +00:00
Reference
ldraney/landscaping-assistant!239