Tailscale Connector — subnet router for k8s service CIDR #175
Reference
forgejo_admin/pal-e-platform#175
Type
Infra
Lineage
project-pal-e-platform → Board item. Blocker for #174 (Mac build agent).
Repo
forgejo_admin/pal-e-platform
User Story
As an external tailnet peer (Mac build agent)
I want to reach k8s ClusterIP services directly over Tailscale
So that the Woodpecker agent can connect to gRPC port 9000 without funnels or hacks
Context
External agents can't reach k8s services (10.43.0.0/16) because those IPs are cluster-internal. Tailscale funnels don't work for gRPC (HTTP/1.1 proxy breaks HTTP/2). The Tailscale operator provides a Connector CRD with subnetRouter that advertises CIDRs to the tailnet. This creates a subnet router pod that bridges the k8s service network to the Tailscale mesh.
After this, any tailnet peer can reach any ClusterIP service. The Mac agent connects to 10.43.50.207:9000 (Woodpecker server) or any other service.
File Targets
Files to modify:
- terraform/main.tf -- add kubernetes_manifest for Tailscale Connector CRD
Files to verify:
- terraform/main.tf -- existing tailscale_acl may need a grant for subnet route approval (check if tag:k8s auto-approves)
Files NOT to touch:
Acceptance Criteria
- tofu plan shows 1 new resource (Connector)
- tofu apply creates the subnet router pod
- tailscale status shows the subnet router advertising 10.43.0.0/16
- ping 10.43.50.207 succeeds (Woodpecker server ClusterIP)
- […] 10.43.50.207:9000 successfully
Test Expectations
- tofu validate passes
- kubectl get connector shows the resource
Constraints
- kubernetes_manifest resource in Terraform (not kubectl apply)
- 10.43.0.0/16 (service CIDR), NOT 10.42.0.0/16 (pod CIDR) -- services are stable, pod IPs aren't
- […] autoApprovers or manual approval)
- tofu plan -lock=false per SOP
Checklist
- tofu plan output included
Related
- project-capacitor-mobile -- consumer of the Mac build agent
Ticket #175 Scope Review
TEMPLATE COMPLIANCE (template-issue)
All required sections present and well-structured:
- […] forgejo_admin/pal-e-platform
- […] tofu plan -lock=false per SOP
TRACEABILITY TRIANGLE
- […] project-pal-e-platform user stories table: "I can deploy infrastructure changes via tofu plan/apply and see them succeed in Woodpecker CI without manual intervention."
- […] convention-architecture-ids under Deployment Components: "Ingress + TLS." However, see note below about whether a new arch ID is needed.
TECHNICAL REVIEW
1. Connector CRD spec is correct but underspecified in File Targets
The Tailscale Connector CRD is apiVersion: tailscale.com/v1alpha1, kind: Connector. The required field is spec.subnetRouter.advertiseRoutes (array of CIDRs, minimum 1). The issue correctly identifies 10.43.0.0/16 as the target CIDR and correctly excludes 10.42.0.0/16 (pod CIDR). Using kubernetes_manifest is the right Terraform resource for CRDs.
Optional fields the agent should know about:
- spec.tags -- defaults to [tag:k8s] from the operator config, which matches the existing setup
- spec.hostname or spec.hostnamePrefix -- one should be specified for a readable tailnet device name (e.g., hostnamePrefix: "k8s-subnet-router")
- spec.replicas -- defaults to nil (single replica), fine for this use case
Recommendation: Add a note in File Targets or Constraints that the agent should set hostnamePrefix for clarity in tailscale status output.
2. ACL autoApprovers is a REQUIRED change, not just a "verify" step
The issue says to verify whether tag:k8s auto-approves subnet routes. It does NOT -- the current ACL (main.tf lines 39-98) has no autoApprovers block. Without it, the subnet route will require manual approval in the Tailscale admin console every time tofu apply runs.
The fix is to add an autoApprovers block to the tailscale_acl resource, with routes mapping 10.43.0.0/16 to ["tag:k8s"].
This should be promoted from "Files to verify" to "Files to modify" in the File Targets section, and an acceptance criterion should cover it.
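A minimal sketch of that ACL change, assuming the policy in tailscale_acl.this is built with jsonencode (the issue does not show how main.tf constructs it). Only the autoApprovers entry comes from this review; the rest of the policy is left as a placeholder comment and must stay as it is in main.tf.

```hcl
# Sketch only: assumes the policy is passed as jsonencode(...) to the
# `acl` argument. The real resource at main.tf lines 39-98 may be
# structured differently.
resource "tailscale_acl" "this" {
  acl = jsonencode({
    # ... existing policy keys stay unchanged here. tag:k8s must already be
    # declared in the policy's tagOwners for the entry below to be accepted.

    # New: routes advertised by tag:k8s devices inside 10.43.0.0/16
    # (the k8s service CIDR) are approved without an admin console step.
    autoApprovers = {
      routes = {
        "10.43.0.0/16" = ["tag:k8s"]
      }
    }
  })
}
```

If this matches the real resource, tofu plan -lock=false should report an in-place update to tailscale_acl.this rather than a new resource.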
3. Existing gRPC funnel creates a scope question
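main.tf lines 825-854 already define a kubernetes_ingress_v1.woodpecker_grpc_funnel resource that creates a Tailscale funnel for Woodpecker gRPC on port 9000. The issue Context says "Tailscale funnels don't work for gRPC (HTTP/1.1 proxy breaks HTTP/2)" -- but this existing funnel is live in the codebase.
Two possibilities: either the funnel does not actually work for gRPC (in which case the ticket should say whether removing it is in scope or deferred), or funnels do work and the Context claim needs correcting.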
Either way, the relationship between the new Connector and the existing gRPC funnel resource must be addressed. This is not scope creep -- it is essential context for the agent to avoid confusion.
4. NetworkPolicy gap
The existing network-policies.tf allows traffic from the tailscale namespace to service namespaces. The subnet router pod will live in the tailscale namespace, so it should be able to reach ClusterIP services. However, the agent should verify that the subnet router traffic flows through the tailscale namespace (not directly from external) -- if k8s routes the traffic differently, a NetworkPolicy update may be needed. This should be mentioned in Constraints or as a test expectation.
5. Dependency verification
Issue #174 (Mac build agent) is confirmed open and is correctly identified as a downstream consumer. The blocker relationship is correct -- #174 needs subnet routing before the Mac agent can reach the Woodpecker gRPC endpoint.
ITEMS TO FIX
- ACL autoApprovers -- Promote from "verify" to explicit file modification target. Add autoApprovers.routes block for 10.43.0.0/16 to the tailscale_acl resource. Add an acceptance criterion: "ACL autoApprovers auto-approves the subnet route (no manual admin console step)."
- Existing gRPC funnel -- Address the relationship with kubernetes_ingress_v1.woodpecker_grpc_funnel (main.tf:827). Either: (a) note it doesn't work for gRPC and state whether removal is in-scope or deferred, or (b) correct the Context claim if funnels do work.
- hostnamePrefix -- Add to Constraints: agent should set spec.hostnamePrefix (e.g., "k8s-subnet-router") so the device is identifiable in tailscale status.
- arch: label fit -- arch:tailscale-funnel maps to "Ingress + TLS" in the architecture IDs convention. A subnet router is not an ingress/funnel -- it is internal network bridging. Consider whether this warrants a new arch ID (e.g., arch:tailscale-subnet) or whether arch:tailscale-funnel is intentionally broad enough to cover all Tailscale networking. Minor, but worth a conscious decision.
VERDICT: NEEDS WORK
The ticket is well-written and close to ready. Four items need addressing before an agent can execute cleanly:
- autoApprovers must be an explicit modification target, not a "verify" footnote
- […] hostnamePrefix constraint for tailnet device naming
Ticket Fix (QA review remediation)
1. ACL autoApprovers (promoted to File Targets)
terraform/main.tf line ~40 (tailscale_acl.this) needs an autoApprovers block added: "routes": { "10.43.0.0/16": ["tag:k8s"] }.
This auto-approves subnet routes advertised by tag:k8s nodes (which the Connector gets via the operator's defaultTags). Without this, every tofu apply would require manual admin console approval.
Moved from "Files to verify" to "Files to modify."
2. gRPC funnel contradiction (clarified)
The woodpecker_grpc_funnel resource (PR #173, merged today) IS non-functional for gRPC -- Tailscale funnels proxy as HTTP/1.1, which breaks gRPC's HTTP/2 framing. Confirmed by proxy logs: "http: proxy error: malformed HTTP response".
Removal of the non-functional funnel is OUT OF SCOPE for this ticket. It should be cleaned up in a follow-up. This ticket adds the working solution (subnet router). The funnel is harmless -- it just doesn't serve the purpose we created it for.
3. hostnamePrefix (added to Constraints)
The Connector should specify hostname: "k8s-subnet-router" so tailscale status shows a readable name instead of a generated hash.
4. Arch label (corrected)
Changed from arch:tailscale-funnel to arch:tailscale-subnet. This is a new arch component -- subnet routing is distinct from ingress/TLS. Will add arch:tailscale-subnet to the pal-e-platform architecture docs as discovered scope.
Issue #175 Re-Review
Re-review of ticket scope after fix comment (comment #7556) addressed the 4 findings from the initial QA review (comment #7554).
FINDING 1: ACL autoApprovers -- RESOLVED
Original finding: autoApprovers block missing from tailscale_acl.this (main.tf lines 39-98). Promoted from "Files to verify" to "Files to modify."
Fix comment response: Provides the exact JSON snippet to add. Confirms tag:k8s comes from the operator's defaultTags. Notes manual admin console approval would be required without this.
Verification: Confirmed tailscale_acl.this (main.tf:39-98) has zero autoApprovers configuration. The proposed fix is correct -- "routes": { "10.43.0.0/16": ["tag:k8s"] } will auto-approve the subnet route. This is a required change, not optional.
Status: RESOLVED.
FINDING 2: gRPC Funnel Contradiction -- RESOLVED
Original finding: Issue claims "funnels don't work for gRPC" but kubernetes_ingress_v1.woodpecker_grpc_funnel (main.tf:827-854) exists in the codebase (merged in PR #173).
Fix comment response: Confirms the funnel IS non-functional for gRPC -- HTTP/1.1 proxy breaks HTTP/2 framing. Proxy logs show "http: proxy error: malformed HTTP response". Removal deferred as out-of-scope. Funnel is harmless.
Verification: Confirmed the resource exists at main.tf:827-854. The deferral is correct -- removing a non-functional but harmless resource is cleanup, not a prerequisite for the subnet router. A follow-up ticket should be created for the cleanup.
Status: RESOLVED.
FINDING 3: hostnamePrefix -- RESOLVED (with note)
Original finding: Recommended hostnamePrefix: "k8s-subnet-router" for readable tailscale status output.
Fix comment response: Says hostname: "k8s-subnet-router" (not hostnamePrefix).
Verification: Both fields are valid in the Connector CRD. hostname sets an exact name; hostnamePrefix appends a generated suffix (useful for replicas). For a single-replica connector, hostname is simpler and appropriate. The intent (readable device name in tailscale status) is preserved either way.
Status: RESOLVED.
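For reference, a hedged sketch of what the Connector could look like as a kubernetes_manifest, combining the apiVersion, kind, advertised CIDR, and hostname discussed in this thread. The Terraform resource name and metadata.name are illustrative placeholders, not taken from main.tf, and the sketch assumes the Tailscale operator's CRDs are already installed in the cluster, since kubernetes_manifest needs the CRD schema at plan time.

```hcl
# Sketch only: the resource name "tailscale_subnet_router" and metadata.name
# are hypothetical. Connector is a cluster-scoped resource, so no namespace
# is set.
resource "kubernetes_manifest" "tailscale_subnet_router" {
  manifest = {
    apiVersion = "tailscale.com/v1alpha1"
    kind       = "Connector"
    metadata = {
      name = "k8s-subnet-router"
    }
    spec = {
      # Exact tailnet device name; hostnamePrefix would be the alternative
      # if multiple replicas were needed.
      hostname = "k8s-subnet-router"
      subnetRouter = {
        # Service CIDR only, not the pod CIDR (10.42.0.0/16).
        advertiseRoutes = ["10.43.0.0/16"]
      }
      # spec.tags is omitted on the assumption that the operator default
      # (tag:k8s) applies, which is what the autoApprovers entry keys on.
    }
  }
}
```

With this in place, kubectl get connector and tailscale status are the checks the ticket already names for confirming the resource exists and the route is advertised.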
FINDING 4: Arch Label -- RESOLVED
Original finding: arch:tailscale-funnel ("Ingress + TLS") doesn't fit subnet routing. Recommended either a new arch ID or a conscious decision.
Fix comment response: Changed to arch:tailscale-subnet. Will add to convention-architecture-ids as discovered scope.
Verification: Board item #394 confirmed updated to labels: "story:superuser-deploy,arch:tailscale-subnet,type:infra". The convention-architecture-ids note currently lists arch:tailscale-funnel under Deployment Components but does NOT yet include arch:tailscale-subnet. The fix comment correctly flags this as discovered scope -- the convention doc update should happen during /update-docs after this ticket's PR merges.
Status: RESOLVED.
TRACEABILITY VERIFICATION
- […] project-pal-e-platform user stories: "I can deploy infrastructure changes via tofu plan/apply and see them succeed in Woodpecker CI without manual intervention." The subnet router enables the Mac build agent to reach the Woodpecker gRPC endpoint, which is a prerequisite for the deploy pipeline to work with external agents. Traceability is correct.
- […] arch:tailscale-funnel. Convention doc update pending as discovered scope.
PROCESS OBSERVATION
The issue body has NOT been edited to incorporate the fixes (issue created_at and updated_at are identical: 2026-03-26T17:14:53Z). The fixes exist only in comment #7556. When the dev agent executes this ticket, it must read both the issue body AND the comments to get the complete spec. This is fine -- agents do read comments -- but for maximum clarity, consider editing the issue body to incorporate the 4 changes so the agent has a single coherent spec to execute against. This is a nit, not a blocker.
SOP COMPLIANCE
- […] template-issue format (all required sections present)
- […] project-pal-e-platform -- board item, not plan-driven)
- […] tofu plan -lock=false per SOP
- […] story:superuser-deploy,arch:tailscale-subnet,type:infra)
VERDICT: APPROVED
All 4 findings from the initial review are resolved. The ticket is ready for agent execution. One nit: consider editing the issue body to incorporate the fix comment changes for a single coherent spec.