feat: dual auth — accept Keycloak JWT alongside API key #269

Merged
forgejo_admin merged 1 commit from 267-dual-auth-middleware into main 2026-04-13 14:41:01 +00:00

Summary

Add JWT validation via PyJWT + JWKS cache so the backend can identify WHO is making a request (Keycloak user) while preserving the existing API key auth for MCP agents. Adds GET /me endpoint and get_current_user() dependency for downstream identity features.

Changes

  • pyproject.toml — added PyJWT[crypto]>=2.8 dependency
  • src/pal_e_docs/config.py — added keycloak_url and keycloak_realm settings (env: PALDOCS_KEYCLOAK_URL, PALDOCS_KEYCLOAK_REALM)
  • src/pal_e_docs/auth.py — added UserContext dataclass, _JWKSCache (lazy init, cached keys), _decode_jwt(), _extract_user_context(), get_current_user() FastAPI dependency. get_is_authenticated() is untouched
  • src/pal_e_docs/routes/users.py (new) — GET /me returns {sub, email, name, roles} from JWT or 401
  • src/pal_e_docs/main.py — registered users_router
  • tests/test_dual_auth.py (new) — 15 tests covering unit + integration: valid JWT, expired JWT, missing auth, JWKS failure, API key regression

Test Plan

  • pytest tests/ -v — 700 passed (all 144 existing + 15 new dual-auth tests)
  • ruff check and ruff format clean
  • Manual: curl -H "Authorization: Bearer {jwt}" /me returns claims
  • Manual: curl -H "X-PALEDocs-Token: {key}" /notes still works

Review Checklist

  • get_is_authenticated() unchanged — 20 endpoints unaffected
  • JWKS keys cached (lazy init, cache_keys=True)
  • Invalid/expired JWT returns None, no crash
  • No unrelated changes
  • All 700 tests pass
  • ruff clean

None — standalone feature ticket.

Closes #267

## Summary Add JWT validation via PyJWT + JWKS cache so the backend can identify WHO is making a request (Keycloak user) while preserving the existing API key auth for MCP agents. Adds `GET /me` endpoint and `get_current_user()` dependency for downstream identity features. ## Changes - `pyproject.toml` — added `PyJWT[crypto]>=2.8` dependency - `src/pal_e_docs/config.py` — added `keycloak_url` and `keycloak_realm` settings (env: `PALDOCS_KEYCLOAK_URL`, `PALDOCS_KEYCLOAK_REALM`) - `src/pal_e_docs/auth.py` — added `UserContext` dataclass, `_JWKSCache` (lazy init, cached keys), `_decode_jwt()`, `_extract_user_context()`, `get_current_user()` FastAPI dependency. `get_is_authenticated()` is **untouched** - `src/pal_e_docs/routes/users.py` (new) — `GET /me` returns `{sub, email, name, roles}` from JWT or 401 - `src/pal_e_docs/main.py` — registered `users_router` - `tests/test_dual_auth.py` (new) — 15 tests covering unit + integration: valid JWT, expired JWT, missing auth, JWKS failure, API key regression ## Test Plan - `pytest tests/ -v` — 700 passed (all 144 existing + 15 new dual-auth tests) - `ruff check` and `ruff format` clean - Manual: `curl -H "Authorization: Bearer {jwt}" /me` returns claims - Manual: `curl -H "X-PALEDocs-Token: {key}" /notes` still works ## Review Checklist - [x] `get_is_authenticated()` unchanged — 20 endpoints unaffected - [x] JWKS keys cached (lazy init, `cache_keys=True`) - [x] Invalid/expired JWT returns None, no crash - [x] No unrelated changes - [x] All 700 tests pass - [x] ruff clean ## Related Notes None — standalone feature ticket. ## Related Closes #267
Add JWT validation via PyJWT + JWKS cache so the backend can identify
WHO is making a request (Keycloak user) while preserving the existing
API key auth for MCP agents. Adds GET /me endpoint returning user claims
and a get_current_user() FastAPI dependency for downstream use.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Author
Owner

PR #269 Review

DOMAIN REVIEW

Stack: Python / FastAPI / PyJWT / SQLAlchemy (pal-e-docs API)

auth.py additions (lines +1 to +87 in diff)

  1. verify_aud: False -- Audience validation is disabled in jwt.decode(). This is acceptable when the API is not registered as a specific OIDC client and accepts tokens issued to any Keycloak client in the realm. However, this means ANY valid Keycloak JWT for the pal-e realm will authenticate, regardless of which client it was issued to. Document this decision if it is intentional. If this API should only accept tokens from specific clients, add audience validation.

  2. Overly broad exception catch -- except (jwt.ExpiredSignatureError, jwt.InvalidTokenError, Exception) is redundant. The Exception base class subsumes both PyJWT exceptions. This reads as if specific exceptions matter, but they do not in this catch. Simplify to except Exception or keep the specific ones and drop the base class (logging at DEBUG level could differentiate them):

    # Current (misleading):
    except (jwt.ExpiredSignatureError, jwt.InvalidTokenError, Exception) as exc:
    # Cleaner:
    except Exception as exc:
    
  3. Empty sub silently allowed -- _extract_user_context uses claims.get("sub", ""). A JWT without a sub claim is malformed per OIDC spec. Returning a UserContext(sub="") could cause downstream bugs when identity features consume this. Consider raising or returning None for missing sub.

  4. JWKS cache has no TTL -- PyJWKClient(jwks_url, cache_keys=True) caches keys indefinitely. If Keycloak rotates signing keys, the cache will hold stale keys until the process restarts or _jwks_cache.reset() is called. PyJWT's cache_jwk_set + lifespan parameters (available in PyJWT >=2.8) offer TTL-based cache expiry. Worth adding for production resilience.

config.py additions

  1. Hardcoded Tailscale URL as default -- keycloak_url: str = "https://keycloak.tail5b443a.ts.net" is a production infrastructure URL baked as a default. This works because env_prefix allows override, but it means running tests or local dev without setting the env var will attempt to reach production Keycloak during JWKS fetch. The lazy init in _JWKSCache mitigates this (no fetch until first JWT arrives), so it is not a runtime bug, but a "" default with validation would be more explicit.

routes/users.py

  1. No Pydantic response model -- GET /me returns a raw dict. Adding a response_model (or returning a Pydantic model) gives OpenAPI schema generation, response validation, and prevents accidental data leakage if UserContext grows fields later.

tests/test_dual_auth.py

  1. Test coverage is solid -- 15 tests covering: valid JWT decode, expired JWT, JWKS failure, UserContext extraction (full claims, missing name fallback, missing realm_access), get_current_user (valid, none, non-bearer, expired), GET /me integration (valid, no auth, invalid token, expired), and API key regression. This meets the coverage bar.

  2. client fixture dependency is correct -- Integration tests use the client fixture from conftest.py, which provides TestClient(app) with in-memory SQLite. The fixture is not imported in the test file because pytest discovers it automatically.

get_is_authenticated preservation -- Verified against current main. The function at lines 13-24 of the existing auth.py is completely unchanged in the diff. The 20 existing endpoints are unaffected.

BLOCKERS

None.

  • Tests exist (15 new tests covering unit + integration paths).
  • No unvalidated user input -- JWT validation delegated to PyJWT library with RS256 algorithm pinning.
  • No secrets or credentials in code (Keycloak URL is infrastructure, not a secret).
  • No DRY violation in auth paths -- get_is_authenticated (API key) and get_current_user (JWT) are separate, independent auth dependencies with different responsibilities.

NITS

  1. Simplify exception catch in _decode_jwt -- the Exception base class makes ExpiredSignatureError and InvalidTokenError redundant in the tuple.
  2. Add a Pydantic UserResponse model for GET /me response -- improves OpenAPI docs and prevents accidental field leakage.
  3. Consider validating sub is non-empty in _extract_user_context rather than defaulting to empty string.
  4. Consider adding lifespan parameter to PyJWKClient for JWKS cache TTL (key rotation resilience).
  5. Consider setting keycloak_url default to "" with a runtime check, rather than hardcoding the production Tailscale URL.

SOP COMPLIANCE

  • Branch named after issue (267-dual-auth-middleware matches issue #267)
  • PR body follows template (Summary, Changes, Test Plan, Related present)
  • Related references issue (Closes #267)
  • No plan slug (standalone feature ticket -- acceptable)
  • No secrets committed
  • No scope creep -- all changes serve the dual auth feature
  • Commit messages are descriptive (PR title follows conventional commits)

PROCESS OBSERVATIONS

  • Change failure risk: Low. The critical constraint (20 endpoints depending on get_is_authenticated) is satisfied -- that function is untouched. The new get_current_user dependency is opt-in (only GET /me uses it). No existing behavior changes.
  • Deployment frequency: This unblocks the user attribution feature (#268) and permission filtering (#250), both of which depend on knowing WHO is making requests. Good sequencing.
  • Documentation: The verify_aud: False decision should be documented somewhere (inline comment or ADR) so future reviewers understand it is intentional, not an oversight.

VERDICT: APPROVED

## PR #269 Review ### DOMAIN REVIEW **Stack**: Python / FastAPI / PyJWT / SQLAlchemy (pal-e-docs API) **auth.py additions (lines +1 to +87 in diff)** 1. **`verify_aud: False`** -- Audience validation is disabled in `jwt.decode()`. This is acceptable when the API is not registered as a specific OIDC client and accepts tokens issued to any Keycloak client in the realm. However, this means ANY valid Keycloak JWT for the `pal-e` realm will authenticate, regardless of which client it was issued to. Document this decision if it is intentional. If this API should only accept tokens from specific clients, add `audience` validation. 2. **Overly broad exception catch** -- `except (jwt.ExpiredSignatureError, jwt.InvalidTokenError, Exception)` is redundant. The `Exception` base class subsumes both PyJWT exceptions. This reads as if specific exceptions matter, but they do not in this catch. Simplify to `except Exception` or keep the specific ones and drop the base class (logging at DEBUG level could differentiate them): ```python # Current (misleading): except (jwt.ExpiredSignatureError, jwt.InvalidTokenError, Exception) as exc: # Cleaner: except Exception as exc: ``` 3. **Empty `sub` silently allowed** -- `_extract_user_context` uses `claims.get("sub", "")`. A JWT without a `sub` claim is malformed per OIDC spec. Returning a `UserContext(sub="")` could cause downstream bugs when identity features consume this. Consider raising or returning None for missing `sub`. 4. **JWKS cache has no TTL** -- `PyJWKClient(jwks_url, cache_keys=True)` caches keys indefinitely. If Keycloak rotates signing keys, the cache will hold stale keys until the process restarts or `_jwks_cache.reset()` is called. PyJWT's `cache_jwk_set` + `lifespan` parameters (available in PyJWT >=2.8) offer TTL-based cache expiry. Worth adding for production resilience. **config.py additions** 5. **Hardcoded Tailscale URL as default** -- `keycloak_url: str = "https://keycloak.tail5b443a.ts.net"` is a production infrastructure URL baked as a default. This works because `env_prefix` allows override, but it means running tests or local dev without setting the env var will attempt to reach production Keycloak during JWKS fetch. The lazy init in `_JWKSCache` mitigates this (no fetch until first JWT arrives), so it is not a runtime bug, but a `""` default with validation would be more explicit. **routes/users.py** 6. **No Pydantic response model** -- `GET /me` returns a raw dict. Adding a `response_model` (or returning a Pydantic model) gives OpenAPI schema generation, response validation, and prevents accidental data leakage if `UserContext` grows fields later. **tests/test_dual_auth.py** 7. **Test coverage is solid** -- 15 tests covering: valid JWT decode, expired JWT, JWKS failure, `UserContext` extraction (full claims, missing name fallback, missing realm_access), `get_current_user` (valid, none, non-bearer, expired), `GET /me` integration (valid, no auth, invalid token, expired), and API key regression. This meets the coverage bar. 8. **`client` fixture dependency is correct** -- Integration tests use the `client` fixture from `conftest.py`, which provides `TestClient(app)` with in-memory SQLite. The fixture is not imported in the test file because pytest discovers it automatically. **`get_is_authenticated` preservation** -- Verified against current `main`. The function at lines 13-24 of the existing `auth.py` is completely unchanged in the diff. The 20 existing endpoints are unaffected. ### BLOCKERS None. - Tests exist (15 new tests covering unit + integration paths). - No unvalidated user input -- JWT validation delegated to PyJWT library with RS256 algorithm pinning. - No secrets or credentials in code (Keycloak URL is infrastructure, not a secret). - No DRY violation in auth paths -- `get_is_authenticated` (API key) and `get_current_user` (JWT) are separate, independent auth dependencies with different responsibilities. ### NITS 1. Simplify exception catch in `_decode_jwt` -- the `Exception` base class makes `ExpiredSignatureError` and `InvalidTokenError` redundant in the tuple. 2. Add a Pydantic `UserResponse` model for `GET /me` response -- improves OpenAPI docs and prevents accidental field leakage. 3. Consider validating `sub` is non-empty in `_extract_user_context` rather than defaulting to empty string. 4. Consider adding `lifespan` parameter to `PyJWKClient` for JWKS cache TTL (key rotation resilience). 5. Consider setting `keycloak_url` default to `""` with a runtime check, rather than hardcoding the production Tailscale URL. ### SOP COMPLIANCE - [x] Branch named after issue (`267-dual-auth-middleware` matches issue #267) - [x] PR body follows template (Summary, Changes, Test Plan, Related present) - [x] Related references issue (`Closes #267`) - [x] No plan slug (standalone feature ticket -- acceptable) - [x] No secrets committed - [x] No scope creep -- all changes serve the dual auth feature - [x] Commit messages are descriptive (PR title follows conventional commits) ### PROCESS OBSERVATIONS - **Change failure risk**: Low. The critical constraint (20 endpoints depending on `get_is_authenticated`) is satisfied -- that function is untouched. The new `get_current_user` dependency is opt-in (only `GET /me` uses it). No existing behavior changes. - **Deployment frequency**: This unblocks the user attribution feature (#268) and permission filtering (#250), both of which depend on knowing WHO is making requests. Good sequencing. - **Documentation**: The `verify_aud: False` decision should be documented somewhere (inline comment or ADR) so future reviewers understand it is intentional, not an oversight. ### VERDICT: APPROVED
forgejo_admin deleted branch 267-dual-auth-middleware 2026-04-13 14:41:01 +00:00
Sign in to join this conversation.
No description provided.