Feature: Split-horizon DNS — prevent intra-cluster TLS hairpin through DERP relays #138
Type
Feature
Lineage
plan-pal-e-platform: discovered during CI pipeline investigation (#133)
Repo
forgejo_admin/pal-e-platform
User Story
As the platform operator
I want intra-cluster traffic to Tailscale funnel hostnames to stay inside the cluster
So that server-to-server communication (OAuth, API calls, image pushes) is reliable and doesn't hairpin through the public internet
Context
All Tailscale funnel hostnames (e.g., forgejo.tail5b443a.ts.net) resolve to public DERP relay IPs (208.111.35.209, 208.111.34.11) even from inside the cluster. This causes intra-cluster HTTPS traffic to hairpin through the public internet, resulting in ~66% TLS failure rate.
Proven impact:
Post "https://forgejo.tail5b443a.ts.net/login/oauth/access_token": EOF) → dead token → can't read .woodpecker.yaml from private repos → "no steps" on PR events
Root cause proven via DNS test from inside cluster:
nslookup forgejo.tail5b443a.ts.net returns 208.111.35.209 (public DERP IP), not the internal ClusterIP 10.43.106.198.
File Targets
Files to modify:
terraform/main.tf: add Forgejo internal HTTPS service, cert secret, Helm values for TLS
terraform/main.tf: CoreDNS ConfigMap customization resource
terraform/main.tf: CronJob for cert renewal via tailscale cert
Files NOT to touch:
.woodpecker.yaml: clone fix is separate (PR #134)
terraform/network-policies.tf: NetworkPolicies are correct, not the issue
Acceptance Criteria
nslookup forgejo.tail5b443a.ts.net returns internal ClusterIP
Test Expectations
kubectl run dns-test --rm -it --image=alpine -- nslookup forgejo.tail5b443a.ts.net returns internal IP
kubectl run tls-test --rm -it --image=alpine/curl -- curl -sI https://forgejo.tail5b443a.ts.net succeeds 5/5
https://forgejo.tail5b443a.ts.net
Constraints
forgejo.tail5b443a.ts.net hostname for TLS verification
tailscale cert generates Let's Encrypt certs for Tailscale hostnames; use this
Checklist
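The DNS and TLS checks under Test Expectations could be scripted roughly as follows. This is a minimal sketch, not part of the planned change: the helper names classify_ip and count_successes are hypothetical, and treating every 10.43.* address as cluster-internal is an assumption inferred from the single ClusterIP quoted in Context.

```shell
#!/bin/sh
# Sketch only: helpers for the DNS and TLS checks listed under Test Expectations.

# classify_ip IP
# Prints "derp" for the two public DERP relay IPs named in this issue,
# "cluster" for addresses starting with 10.43. (ASSUMPTION: the Service
# range is 10.43.0.0/16, inferred from the ClusterIP 10.43.106.198),
# and "unknown" for anything else.
classify_ip() {
  case "$1" in
    208.111.35.209|208.111.34.11) echo "derp" ;;
    10.43.*) echo "cluster" ;;
    *) echo "unknown" ;;
  esac
}

# count_successes N CMD [ARGS...]
# Runs CMD N times and prints how many of those runs exited with status 0.
count_successes() {
  attempts=$1
  shift
  ok=0
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      ok=$((ok + 1))
    fi
    i=$((i + 1))
  done
  echo "$ok"
}
```

Inside a pod started from the alpine/curl image, count_successes 5 curl -sfI --max-time 10 https://forgejo.tail5b443a.ts.net would be expected to print 5 once the fix is in place (the -f and --max-time flags are additions here so that HTTP errors and hangs count as failures). Passing the address reported by nslookup to classify_ip should then print cluster rather than derp.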
Related
.woodpecker.yaml)
plan-pal-e-platform (should become new phase: Split-Horizon DNS)
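One possible shape for the CoreDNS customization listed under File Targets is a rewrite rule that answers queries for the funnel hostname with an in-cluster Service name. The sketch below only prints such a rule. Whether this issue intends the CoreDNS rewrite plugin (rather than, say, a static hosts entry) is an assumption, and the helper name coredns_rewrite_rule is hypothetical.

```shell
#!/bin/sh
# Sketch only: print a CoreDNS "rewrite" rule for one hostname.
# ASSUMPTION: the split-horizon override is done with the rewrite plugin;
# the issue itself only says "CoreDNS ConfigMap customization".

# coredns_rewrite_rule FROM TO
# Prints a rule that resolves queries for exactly FROM as if TO had been asked.
coredns_rewrite_rule() {
  printf 'rewrite name exact %s %s\n' "$1" "$2"
}
```

For example, coredns_rewrite_rule forgejo.tail5b443a.ts.net forgejo-https.forgejo.svc.cluster.local prints a single rewrite line for the Corefile. The Service name forgejo-https.forgejo.svc.cluster.local is made up for illustration, and where the line is placed (main Corefile or a distribution-specific override ConfigMap) depends on how CoreDNS is deployed in the cluster.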