Add SessionStart hook to detect missing MCP servers #76

Closed
opened 2026-03-13 06:40:03 +00:00 by forgejo_admin · 0 comments
Contributor

Lineage

bug-mcp-silent-load-failure (no plan ancestry — bug fix / operational improvement)

Repo

forgejo_admin/claude-custom

User Story

As Betty Sue (main session agent)
I want to be warned at session start if any configured MCP server failed to load
So that I don't discover missing tools mid-session after wasting context

Context

Claude Code silently drops MCP servers that fail to initialize during session startup. In a 2026-03-13 session, the pal-e-docs and forgejo MCP servers (the two most critical tools) were both absent, while notion and woodpecker loaded fine. No error was surfaced. The only way to discover the failure was to try the tools and get empty results from ToolSearch.

The root cause is unknown (likely a transient timeout during uv run startup while resolving the private PyPI index). The fix is observability: detect and warn.

File Targets

Files to create:

  • hooks/check-mcp-servers.sh — SessionStart hook that checks expected MCP servers are loaded

Files to modify:

  • settings.json — add the new hook to the SessionStart hooks array
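
The registration step could be scripted instead of edited by hand. The following is a minimal sketch, assuming jq is installed and that settings.json keeps hooks under hooks.SessionStart as a list of entries that each carry their own hooks array of type/command objects. The helper name add_session_start_hook is hypothetical, and the command path should follow whatever form the existing SessionStart entries use.

```shell
#!/bin/sh
# Sketch: print a copy of a settings file with one more SessionStart hook entry.
# Assumes jq is installed; add_session_start_hook is a hypothetical helper name.
add_session_start_hook() {
    settings_file=$1
    hook_command=$2
    # "+=" also works when .hooks.SessionStart does not exist yet (null + array).
    jq --arg cmd "$hook_command" \
        '.hooks.SessionStart += [{"hooks": [{"type": "command", "command": $cmd}]}]' \
        "$settings_file"
}
```

The function writes to stdout rather than editing in place, so the result can be reviewed before it replaces the original, for example: add_session_start_hook settings.json hooks/check-mcp-servers.sh > settings.json.new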

Acceptance Criteria

  • Hook script reads expected MCP server names from project or global settings
  • Hook compares expected servers against what's actually available in the session
  • If any configured server is missing, hook outputs a clear WARNING message listing which servers failed
  • Warning message suggests restarting the session
  • Hook does NOT block the session (uses permissionDecision: allow or outputs warning only)
  • Hook is registered in settings.json under SessionStart
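
The compare-and-warn criteria above can be sketched as follows. This is an illustration under stated assumptions, not the final hook: all function names are hypothetical, the expected names are assumed to live in a JSON file with a top-level mcpServers object, and how the hook obtains the list of servers that actually loaded is left open (see Constraints).

```shell
#!/bin/sh
# Sketch of the compare-and-warn logic. All function names are hypothetical.

# List the server names configured in a JSON file with a top-level "mcpServers"
# object (assumed layout). Requires jq; prints nothing if the key is absent.
expected_servers() {
    jq -r '(.mcpServers // {}) | keys[]' "$1"
}

# Print each name from $1 that does not appear in $2.
# Both arguments are newline-separated lists of server names.
missing_servers() {
    expected_list=$1
    available_list=$2
    printf '%s\n' "$expected_list" | while IFS= read -r name; do
        [ -n "$name" ] || continue
        # -F fixed string, -x whole-line match, -q quiet
        if ! printf '%s\n' "$available_list" | grep -Fxq -e "$name"; then
            printf '%s\n' "$name"
        fi
    done
}

# Print a warning for a newline-separated list of missing servers.
# Always returns 0 so the hook never blocks the session.
warn_missing() {
    missing_list=$1
    if [ -n "$missing_list" ]; then
        joined=$(printf '%s\n' "$missing_list" | paste -s -d ',' - | sed 's/,/, /g')
        printf 'WARNING: configured MCP servers failed to load: %s\n' "$joined"
        printf 'Restart the session to retry loading them.\n'
    fi
    return 0
}
```

A hook built on these helpers would call warn_missing "$(missing_servers "$expected" "$available")" and exit 0 regardless of the result, which keeps it non-blocking.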

Test Expectations

  • Manual test: temporarily misconfigure an MCP server (wrong path), start a session, verify the warning appears
  • Verify hook doesn't break when all servers load correctly

Constraints

  • The hook runs as a shell script and receives session context via stdin (JSON)
  • Look at existing SessionStart hooks (session-start-context.sh, check-claude-custom-clean.sh) for the pattern
  • The hook should be lightweight — no network calls, just local file reads and string comparison
  • The challenge is that the hook runs in the shell, not in the Claude session — it may not have direct access to the MCP tool registry. Consider alternative approaches:
    • Check if MCP server processes are running (pgrep or ps)
    • Add a check-mcp directive to the session-start-context.sh output that tells Betty Sue to verify tools
    • Create a small test script that tries to connect to each MCP server's stdio
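
The first two alternatives above might look roughly like this. It is a hedged sketch: the function names are hypothetical, the wording of the check-mcp directive is invented for illustration, and it assumes the hook's stdout reaches the session context the same way the existing SessionStart hooks' output does.

```shell
#!/bin/sh
# Sketch of two of the alternatives listed above. Function names are hypothetical.

# Alternative 1: succeed if any running process has a command line matching $1.
# pgrep -f matches the full command line, so a short pattern such as "notion"
# may also match unrelated processes; a more specific pattern is safer.
server_process_running() {
    pgrep -f "$1" >/dev/null 2>&1
}

# Alternative 2: print a directive asking the agent to verify the servers itself.
# Arguments are the expected server names; prints nothing when none are given.
emit_check_mcp_directive() {
    [ "$#" -gt 0 ] || return 0
    printf 'check-mcp: verify that tools from these MCP servers are available:\n'
    for name in "$@"; do
        printf '  - %s\n' "$name"
    done
    printf 'If any are missing, warn the user and suggest restarting the session.\n'
}
```

The process check stays within the "no network calls" constraint but only shows that a process exists, not that its tools registered. The directive moves the actual verification into the session, where the tool registry is visible.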

Checklist

  • PR opened
  • Tests pass (if applicable)
  • No unrelated changes

Related

  • ai-agency — project
  • bug-mcp-silent-load-failure — investigation note
forgejo_admin 2026-03-14 14:06:41 +00:00
Reference
ldraney/claude-custom#76