Replace Anthropic SDK with Ollama and add purpose-built read-only tools #29

Open
opened 2026-04-03 17:38:04 +00:00 by forgejo_admin · 2 comments

Type

Feature

Lineage

Supersedes #27 (read-only SQL tool — decided against direct SQL, using REST via basketball-api instead). Related to Nemo being dead since 2026-03-31 due to Anthropic credit exhaustion.

Repo

forgejo_admin/westside-ai-assistant

User Story

As an admin (Marcus)
I want to ask Nemo questions about players, rosters, and payments in GroupMe and get accurate answers
So that I can manage the program without logging into the admin dashboard or asking Lucas

Context

Nemo currently uses the Anthropic SDK with a generic api_get tool that lets the model pick any basketball-api endpoint. This is too open-ended for qwen3.5:4b — a 4B model needs purpose-built tools with simple inputs and pre-formatted string outputs.

The current architecture:

  • app/ai.py — Anthropic SDK, 1 generic read tool + 7 write tools, system prompt with endpoint list
  • app/basketball.py — Keycloak-authed HTTP client for basketball-api (KEEP THIS)
  • app/groupme.py — GroupMe webhook handler (KEEP THIS)
  • app/confirmation.py — Write confirmation state machine (REMOVE — V1 is read-only)

The new architecture:

  • Replace anthropic.messages.create() with Ollama HTTP API (POST /api/chat with tools)
  • Replace generic api_get with 4 purpose-built tools
  • Remove all 7 write tools (V1 is read-only)
  • Update system prompt for qwen3.5:4b capabilities
  • Keep BasketballClient (Keycloak auth + HTTP) — tools call it internally

File Targets

Files to modify:

  • app/ai.py — replace Anthropic SDK with Ollama HTTP calls, replace TOOLS list with 4 purpose-built tools, update SYSTEM_PROMPT, update process_message to use Ollama chat format
  • app/config.py — add ollama_base_url and ollama_model settings, remove anthropic_api_key and anthropic_model
  • requirements.txt — remove anthropic dependency, add ollama (or use raw httpx)

Files to create:

  • app/tools.py — 4 tool handler functions, each calls BasketballClient and returns formatted string:
    • search_player(name: str) -> str — search players by name, return player + parent contact info
    • list_roster(team_name: str) -> str — list all players on a team with positions and parent names
    • payment_status(name: str | None) -> str — check payment/subscription status, optionally filtered by name
    • program_stats() -> str — dashboard numbers (registered, active, overdue, team counts)

Files to delete:

  • app/confirmation.py — write confirmation state machine not needed for V1 read-only mode

Files NOT to touch:

  • app/basketball.py — Keycloak client stays as-is
  • app/groupme.py — webhook handler stays as-is (it calls process_message which we're changing internally)

Tool Schemas

search_player:

Input:  { "name": "string" }
Output: "Aleiyah Apaisa — Parent: Sandra Apaisa — Phone: (385) 232-9501 — Email: apaisasandra@gmail.com — Team: 17U Elite Queens"
        (multiple results separated by newlines if partial match returns multiple)

list_roster:

Input:  { "team_name": "string" }
Output: "17U Elite Queens (9 players):\n1. Aleiyah Apaisa — SF — Parent: Sandra Apaisa\n2. Ayvah Apaisa — ..."

payment_status:

Input:  { "name": "string" (optional — omit for full overview) }
Output: "Sandra Apaisa: Active (paid through April)\n" or "Overview: 49/58 current (84%), 9 overdue"

program_stats:

Input:  {} (no params)
Output: "Registered: 58 | Active: 49 | Teams: 7 | Overdue: 9 | Collection rate: 84%"

Acceptance Criteria

  • Nemo responds to GroupMe messages using qwen3.5:4b via cluster Ollama
  • "What's the phone number for Apaisa?" returns Sandra Apaisa's phone (385) 232-9501
  • "Who's on the 17U Elite Queens?" returns the roster
  • "How many players are registered?" returns program stats
  • No write operations available (V1 read-only)
  • Conversation history still works (multi-turn context)
  • Anthropic SDK fully removed from dependencies
  • Error handling: if Ollama is unreachable, respond "I'm having trouble connecting right now"

Test Expectations

  • Unit test: search_player returns formatted string for known player
  • Unit test: list_roster returns formatted roster for known team
  • Unit test: program_stats returns formatted dashboard numbers
  • Integration test: process_message with mocked Ollama returns tool call + formatted response
  • Run command: pytest tests/

Constraints

  • Use Ollama HTTP API directly (httpx POST to /api/chat) — avoid adding heavy dependencies
  • Tool handlers call BasketballClient (existing Keycloak-authed HTTP client) — do NOT add direct Postgres access
  • Keep response length under 1000 chars for GroupMe
  • System prompt should include 2-3 examples of questions → tool mappings for qwen3.5:4b
  • Tools return pre-formatted strings — model should NOT need to parse JSON

Checklist

  • PR opened
  • Tests pass
  • No unrelated changes
  • Anthropic SDK removed from dependencies
  • project-westside-ai-assistant
  • Supersedes: #27 (read-only SQL tool)
  • Depends on: #30 (qwen3.5:4b deployed in cluster + env vars set)
  • Future V2: extract tools to basketball-mcp server when second consumer exists
### Type Feature ### Lineage Supersedes #27 (read-only SQL tool — decided against direct SQL, using REST via basketball-api instead). Related to Nemo being dead since 2026-03-31 due to Anthropic credit exhaustion. ### Repo `forgejo_admin/westside-ai-assistant` ### User Story As an **admin (Marcus)** I want to **ask Nemo questions about players, rosters, and payments in GroupMe and get accurate answers** So that I can **manage the program without logging into the admin dashboard or asking Lucas** ### Context Nemo currently uses the Anthropic SDK with a generic `api_get` tool that lets the model pick any basketball-api endpoint. This is too open-ended for qwen3.5:4b — a 4B model needs purpose-built tools with simple inputs and pre-formatted string outputs. The current architecture: - `app/ai.py` — Anthropic SDK, 1 generic read tool + 7 write tools, system prompt with endpoint list - `app/basketball.py` — Keycloak-authed HTTP client for basketball-api (KEEP THIS) - `app/groupme.py` — GroupMe webhook handler (KEEP THIS) - `app/confirmation.py` — Write confirmation state machine (REMOVE — V1 is read-only) The new architecture: - Replace `anthropic.messages.create()` with Ollama HTTP API (`POST /api/chat` with tools) - Replace generic `api_get` with 4 purpose-built tools - Remove all 7 write tools (V1 is read-only) - Update system prompt for qwen3.5:4b capabilities - Keep BasketballClient (Keycloak auth + HTTP) — tools call it internally ### File Targets Files to modify: - `app/ai.py` — replace Anthropic SDK with Ollama HTTP calls, replace TOOLS list with 4 purpose-built tools, update SYSTEM_PROMPT, update process_message to use Ollama chat format - `app/config.py` — add `ollama_base_url` and `ollama_model` settings, remove `anthropic_api_key` and `anthropic_model` - `requirements.txt` — remove `anthropic` dependency, add `ollama` (or use raw httpx) Files to create: - `app/tools.py` — 4 tool handler functions, each calls BasketballClient and returns formatted string: - `search_player(name: str) -> str` — search players by name, return player + parent contact info - `list_roster(team_name: str) -> str` — list all players on a team with positions and parent names - `payment_status(name: str | None) -> str` — check payment/subscription status, optionally filtered by name - `program_stats() -> str` — dashboard numbers (registered, active, overdue, team counts) Files to delete: - `app/confirmation.py` — write confirmation state machine not needed for V1 read-only mode Files NOT to touch: - `app/basketball.py` — Keycloak client stays as-is - `app/groupme.py` — webhook handler stays as-is (it calls process_message which we're changing internally) ### Tool Schemas **search_player:** ``` Input: { "name": "string" } Output: "Aleiyah Apaisa — Parent: Sandra Apaisa — Phone: (385) 232-9501 — Email: apaisasandra@gmail.com — Team: 17U Elite Queens" (multiple results separated by newlines if partial match returns multiple) ``` **list_roster:** ``` Input: { "team_name": "string" } Output: "17U Elite Queens (9 players):\n1. Aleiyah Apaisa — SF — Parent: Sandra Apaisa\n2. Ayvah Apaisa — ..." ``` **payment_status:** ``` Input: { "name": "string" (optional — omit for full overview) } Output: "Sandra Apaisa: Active (paid through April)\n" or "Overview: 49/58 current (84%), 9 overdue" ``` **program_stats:** ``` Input: {} (no params) Output: "Registered: 58 | Active: 49 | Teams: 7 | Overdue: 9 | Collection rate: 84%" ``` ### Acceptance Criteria - [ ] Nemo responds to GroupMe messages using qwen3.5:4b via cluster Ollama - [ ] "What's the phone number for Apaisa?" returns Sandra Apaisa's phone (385) 232-9501 - [ ] "Who's on the 17U Elite Queens?" returns the roster - [ ] "How many players are registered?" returns program stats - [ ] No write operations available (V1 read-only) - [ ] Conversation history still works (multi-turn context) - [ ] Anthropic SDK fully removed from dependencies - [ ] Error handling: if Ollama is unreachable, respond "I'm having trouble connecting right now" ### Test Expectations - [ ] Unit test: search_player returns formatted string for known player - [ ] Unit test: list_roster returns formatted roster for known team - [ ] Unit test: program_stats returns formatted dashboard numbers - [ ] Integration test: process_message with mocked Ollama returns tool call + formatted response - Run command: `pytest tests/` ### Constraints - Use Ollama HTTP API directly (httpx POST to /api/chat) — avoid adding heavy dependencies - Tool handlers call BasketballClient (existing Keycloak-authed HTTP client) — do NOT add direct Postgres access - Keep response length under 1000 chars for GroupMe - System prompt should include 2-3 examples of questions → tool mappings for qwen3.5:4b - Tools return pre-formatted strings — model should NOT need to parse JSON ### Checklist - [ ] PR opened - [ ] Tests pass - [ ] No unrelated changes - [ ] Anthropic SDK removed from dependencies ### Related - `project-westside-ai-assistant` - Supersedes: #27 (read-only SQL tool) - Depends on: #30 (qwen3.5:4b deployed in cluster + env vars set) - Future V2: extract tools to basketball-mcp server when second consumer exists
Author
Owner

Scope Review: NEEDS_REFINEMENT

Review note: review-749-2026-04-03
Template complete but scope is borderline oversized.

  • [BODY] Add app/confirmation.py to File Targets as "File to delete"
  • [BODY] Fix "requirements.txt / pyproject.toml" → just "requirements.txt" (no pyproject.toml in repo)
  • [DECOMPOSE] 5 file ops, 8 AC — recommend split into (1) SDK swap + config, (2) purpose-built tools
  • [SCOPE] Create architecture note arch-A2
## Scope Review: NEEDS_REFINEMENT Review note: `review-749-2026-04-03` Template complete but scope is borderline oversized. - [BODY] Add `app/confirmation.py` to File Targets as "File to delete" - [BODY] Fix "requirements.txt / pyproject.toml" → just "requirements.txt" (no pyproject.toml in repo) - [DECOMPOSE] 5 file ops, 8 AC — recommend split into (1) SDK swap + config, (2) purpose-built tools - [SCOPE] Create architecture note arch-A2
Author
Owner

Re-review: APPROVED (review note: review-749-2026-04-03-r2)

Fixes applied to issue body:

  1. Added app/confirmation.py to File Targets as "Files to delete" — was only in Context before
  2. Fixed requirements.txt / pyproject.tomlrequirements.txt (no pyproject.toml in this repo)
  3. Made #30 dependency explicit in Related section

Decomposition decision: keeping as one ticket — 8 AC are all testable, tool schemas are pre-defined, and splitting would force a second ticket to re-modify ai.py. One logical feature, one agent pass.

Remaining discovered scope: arch-A2 note missing (tracked separately, not a blocker).

Ticket is ready for dispatch.

**Re-review: APPROVED** (review note: `review-749-2026-04-03-r2`) Fixes applied to issue body: 1. Added `app/confirmation.py` to File Targets as "Files to delete" — was only in Context before 2. Fixed `requirements.txt / pyproject.toml` → `requirements.txt` (no pyproject.toml in this repo) 3. Made #30 dependency explicit in Related section Decomposition decision: **keeping as one ticket** — 8 AC are all testable, tool schemas are pre-defined, and splitting would force a second ticket to re-modify ai.py. One logical feature, one agent pass. Remaining discovered scope: arch-A2 note missing (tracked separately, not a blocker). Ticket is ready for dispatch.
Sign in to join this conversation.
No labels
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
forgejo_admin/westside-ai-assistant#29
No description provided.