Infra: deploy qwen3.5:4b to cluster Ollama and update Nemo env vars #30
Type
Feature
Lineage
Supersedes #16 (Anthropic/Ollama feature flag — no longer want a flag, want full replacement). Related to Nemo being dead since 2026-03-31 due to Anthropic credit exhaustion.
Repo
Multiple: forgejo_admin/pal-e-deployments (kustomize), forgejo_admin/pal-e-platform (NetworkPolicy if needed)
User Story
As an admin (Marcus)
I want Nemo to run on a local model that doesn't depend on paid API credits
So that the AI assistant is always available without billing constraints
Context
Nemo has been dead since March 31 because the Anthropic API key has $0 credits. The pod is running but every message returns "Sorry, I'm having trouble right now." The cluster Ollama has qwen2.5:7b but we want qwen3.5:4b (3.4 GB, downloaded locally on archbox, better reasoning per parameter). This ticket prepares the infrastructure so the app ticket (#29) can swap the SDK.
File Targets
Files to modify:
pal-e-deployments/overlays/westside-ai-assistant/prod/deployment-patch.yaml: remove ANTHROPIC_MODEL env var, add OLLAMA_BASE_URL=http://ollama.ollama.svc:11434 and OLLAMA_MODEL=qwen3.5:4b
Files to check (modify only if needed):
pal-e-platform/terraform/network-policies.tf: verify or add egress rule (westside-ai-assistant → ollama namespace on port 11434)
Manual steps (not file changes):
kubectl exec -n ollama deploy/ollama -- ollama pull qwen3.5:4b (pull model to cluster)
Acceptance Criteria
kubectl exec -n ollama deploy/ollama -- ollama list shows qwen3.5:4b
OLLAMA_BASE_URL and OLLAMA_MODEL env vars
ANTHROPIC_MODEL env var removed from deployment patch
http://ollama.ollama.svc:11434/api/tags successfully
Test Expectations
kubectl exec -n westside-ai-assistant deploy/westside-ai-assistant -- curl -s http://ollama.ollama.svc:11434/api/tags returns JSON with qwen3.5:4b listed
Constraints
tofu plan -lock=false
Checklist
Related
project-westside-ai-assistant
Scope Review: APPROVED
Review note: review-748-2026-04-03
Ticket is well-scoped. All file targets verified (deployment-patch.yaml exists with ANTHROPIC_MODEL, NetworkPolicy rule already in main from #246). 5 AC, <5 min estimate. No decomposition needed.
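The manual pull step and the Test Expectations check in this ticket could be wrapped in a small script along these lines. This is a minimal sketch, not something the ticket prescribes: the function names (model_listed, pull_model, check_from_assistant) are hypothetical, and it assumes that curl is available in the assistant container and that the /api/tags response carries each model under a "name" field.

```shell
#!/usr/bin/env bash
# Sketch only: the helper names below are hypothetical, not part of the ticket.

MODEL="qwen3.5:4b"

# Succeeds if the /api/tags JSON read from stdin lists the given model.
# Assumes each model appears as "name":"<model>"; whitespace is stripped
# first so pretty-printed JSON also matches. No jq required.
model_listed() {
  local model="$1"
  tr -d ' \n' | grep -qF "\"name\":\"${model}\""
}

# Manual step from the ticket: pull the model into the cluster Ollama.
pull_model() {
  kubectl exec -n ollama deploy/ollama -- ollama pull "$MODEL"
}

# Test expectation from the ticket: the assistant pod reaches Ollama
# and the tags listing includes the model.
check_from_assistant() {
  kubectl exec -n westside-ai-assistant deploy/westside-ai-assistant -- \
    curl -s http://ollama.ollama.svc:11434/api/tags | model_listed "$MODEL"
}
```

With the functions sourced, something like `pull_model && check_from_assistant && echo "model reachable"` would cover the pull and the connectivity check in one pass. The exact-match on the closing quote keeps a tag such as qwen3.5:4b from matching a longer name that merely starts with it.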