Swap Anthropic SDK for Ollama (local qwen2.5:7b) #14

Closed
opened 2026-03-28 23:31:46 +00:00 by forgejo_admin · 0 comments

Type

Feature

Lineage

Replaces Anthropic API dependency. Lucas has no Anthropic tokens — must use local Ollama instance with qwen2.5:7b in the cluster.

Repo

forgejo_admin/westside-ai-assistant

User Story

As Marcus (admin)
I want the AI assistant to use the local Ollama model instead of Anthropic
So that the service runs at zero cost using our cluster GPU

Context

The AI engine was built with the Anthropic Python SDK (anthropic). We're swapping to the OpenAI Python SDK (openai) pointing at the cluster-internal Ollama endpoint, which exposes an OpenAI-compatible /v1/chat/completions API with tool_use support. Model: qwen2.5:7b running on the GTX 1070 in the Ollama pod. The tool definitions stay the same shape (OpenAI tool format), the system prompt stays the same, conversation history stays the same. The main changes are the SDK swap and response parsing.

Ollama cluster-internal URL: http://ollama.ollama.svc.cluster.local:11434/v1

File Targets

Files the agent should modify:

  • app/ai.py — swap anthropic SDK → openai SDK. Change client init to point at Ollama URL. Change tool definitions from Anthropic format to OpenAI format. Change response parsing from content_block.type == "tool_use" to choice.message.tool_calls. Change tool result submission to OpenAI format. Keep system prompt, conversation history, confirmation signals, and GroupMe posting unchanged.
  • app/config.py — replace anthropic_api_key and anthropic_model with ollama_url (default: http://ollama.ollama.svc.cluster.local:11434/v1) and ollama_model (default: qwen2.5:7b). Keep anthropic_api_key as optional fallback.
  • requirements.txt — add openai>=1.0 (keep anthropic for now as optional)
  • tests/test_ai.py — update mocks from Anthropic response format to OpenAI response format
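
The tool-definition change in `app/ai.py` is mechanical: Anthropic puts the JSON Schema under `input_schema`, while OpenAI nests the whole definition under `function` and calls the schema `parameters`. A sketch of the conversion (the tool name `get_schedule` is a hypothetical placeholder, not necessarily a real tool in this repo):

```python
def anthropic_tool_to_openai(tool: dict) -> dict:
    """Convert one Anthropic-format tool definition to OpenAI function format."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            # The JSON Schema body is identical; only the key name changes.
            "parameters": tool["input_schema"],
        },
    }

# Example: a hypothetical read tool in Anthropic format.
anthropic_tool = {
    "name": "get_schedule",
    "description": "Read the upcoming game schedule",
    "input_schema": {"type": "object", "properties": {}},
}
openai_tool = anthropic_tool_to_openai(anthropic_tool)
```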

Files the agent should NOT touch:

  • app/basketball.py — unchanged, same interface
  • app/confirmation.py — unchanged, same interface
  • app/groupme.py — unchanged, same interface

Acceptance Criteria

  • app/ai.py uses openai.OpenAI(base_url=settings.ollama_url, api_key="ollama") client
  • Tool definitions use OpenAI format: {"type": "function", "function": {"name": ..., "parameters": ...}}
  • Response parsing handles message.tool_calls (OpenAI format)
  • Tool results submitted as {"role": "tool", "tool_call_id": ..., "content": ...}
  • Model defaults to qwen2.5:7b via OLLAMA_MODEL env var
  • All existing tests pass (updated for new mock format)
  • System prompt, conversation history, confirmation flow all unchanged
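
The parsing and tool-result criteria above can be sketched as follows. This is illustrative only: `SimpleNamespace` stands in for the SDK's response objects, and the tool name `get_roster` is a hypothetical placeholder:

```python
import json
from types import SimpleNamespace

def extract_tool_calls(response):
    """Return the tool calls from an OpenAI-style chat completion (may be empty)."""
    return response.choices[0].message.tool_calls or []

def tool_result_message(tool_call, result) -> dict:
    """Format a tool's return value as an OpenAI tool-result message."""
    return {
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": json.dumps(result),
    }

# Mock response shaped like the OpenAI SDK's chat completion object.
call = SimpleNamespace(
    id="call_0",
    function=SimpleNamespace(name="get_roster", arguments="{}"),
)
resp = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(tool_calls=[call]))]
)
```

The tool-result message is appended to the conversation history before the follow-up completion request, replacing Anthropic's `tool_result` content block.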

Test Expectations

  • Unit test: mock OpenAI-format response with tool_calls, verify basketball function called
  • Unit test: mock read tool → immediate execution
  • Unit test: mock write tool → confirmation signal returned
  • Unit test: conversation history still capped at 20
  • Run command: pytest tests/ -v
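
For the updated mocks in `tests/test_ai.py`, a small builder can produce OpenAI-shaped responses without the SDK installed. A sketch (the helper name and the tool name are assumptions, not existing code in this repo):

```python
from types import SimpleNamespace

def make_tool_call_response(name: str, arguments: str = "{}"):
    """Build a minimal OpenAI-shaped mock response carrying one tool call."""
    tool_call = SimpleNamespace(
        id="call_0",
        type="function",
        function=SimpleNamespace(name=name, arguments=arguments),
    )
    message = SimpleNamespace(content=None, tool_calls=[tool_call])
    return SimpleNamespace(choices=[SimpleNamespace(message=message)])
```

A test can patch the client's `chat.completions.create` to return this and then assert the corresponding basketball function was called.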

Constraints

  • Use openai Python SDK, NOT raw HTTP — cleaner tool_use handling
  • Ollama's OpenAI endpoint: http://ollama.ollama.svc.cluster.local:11434/v1
  • API key for Ollama is any non-empty string (e.g. "ollama")
  • qwen2.5:7b tool_use is less reliable than Claude — system prompt should be explicit about tool selection
  • Keep Anthropic as optional fallback: if ANTHROPIC_API_KEY is set and OLLAMA_URL is not, use Anthropic
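
The fallback rule in the last constraint reduces to one predicate. A sketch, with the function name as an assumption (the issue does not prescribe where this check lives):

```python
import os

def select_backend(env=None) -> str:
    """Ollama by default; Anthropic only if ANTHROPIC_API_KEY is set and OLLAMA_URL is not."""
    env = os.environ if env is None else env
    if env.get("ANTHROPIC_API_KEY") and not env.get("OLLAMA_URL"):
        return "anthropic"
    return "ollama"
```

Note that setting `OLLAMA_URL` explicitly always wins, even when an Anthropic key is present.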

Checklist

  • PR opened
  • Tests pass
  • No unrelated changes

Related

  • project-westside-ai-assistant — parent project
  • story-westside-ai-assistant-read-ops — must still work with local model