feat: olla mocking skills for claude#174
Conversation
…haiku for token efficiency
WalkthroughThis PR adds an agent-driven end-to-end validation harness: SKILL orchestration, per-area validation checklists, a multi-protocol mock LLM backend (ollamock) with runtime fault injection and streaming support, test suite and harness configs, and documentation + mkdocs navigation. ChangesEnd-to-End Validation Harness
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested labels
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (2)
test/cmd/ollamock/streaming.go (2)
580-601: 💤 Low valueConsider using context.Context directly rather than interface.
The functions
applyTTFTandapplyTPSacceptinterface{ Done() <-chan struct{} }to access theDone()channel. Sincecontext.Contextis the standard interface for cancellation and all callers passr.Context(), usingcontext.Contextdirectly would be clearer and more idiomatic.♻️ Proposed change
-func applyTTFT(ctx interface{ Done() <-chan struct{} }, ttftMS int) { +func applyTTFT(ctx context.Context, ttftMS int) { if ttftMS <= 0 { return } select { case <-ctx.Done(): case <-time.After(time.Duration(ttftMS) * time.Millisecond): } } -func applyTPS(ctx interface{ Done() <-chan struct{} }, tps int) { +func applyTPS(ctx context.Context, tps int) { if tps <= 0 { return } delay := time.Duration(1000/tps) * time.Millisecond select { case <-ctx.Done(): case <-time.After(delay): } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/cmd/ollamock/streaming.go` around lines 580 - 601, Change the parameter type for applyTTFT and applyTPS from the generic interface{ Done() <-chan struct{} } to the concrete context.Context: update func signatures applyTTFT(ctx context.Context, ttftMS int) and applyTPS(ctx context.Context, tps int), import the context package, and ensure all callers passing r.Context() keep working (no other behavior changes needed since context.Context exposes Done()); this makes the code idiomatic and clearer about cancellation semantics.
18-32: ⚡ Quick winRemove dead code and misleading comment.
Lines 20–26 contain unused code (two decoder declarations that are never used). The comment on lines 27–28 mentions "re-parse permissively" but there's only a single parse operation. This appears to be leftover experimental code.
♻️ Proposed cleanup
func parseInferenceRequest(r *http.Request) (inferenceRequest, error) { var req inferenceRequest - dec := json.NewDecoder(r.Body) - dec.DisallowUnknownFields() - - // We only need the two top-level fields; unknown fields come from real - // clients that send messages, tools, temperature etc. Allow them silently. - dec2 := json.NewDecoder(r.Body) - _ = dec2 - // Re-parse permissively — DisallowUnknownFields was too strict for real - // client payloads. Use a map-based approach instead. if err := json.NewDecoder(r.Body).Decode(&req); err != nil && err != io.EOF { return inferenceRequest{}, err } return req, nil }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/cmd/ollamock/streaming.go` around lines 18 - 32, In parseInferenceRequest: remove the dead/unused variables dec and dec2 and the misleading comment about "re-parse permissively"; instead keep a single permissive decode using json.NewDecoder(r.Body).Decode(&req) (with the same err != io.EOF handling) and drop the DisallowUnknownFields call so the function only performs one decode pass into inferenceRequest.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.claude/skills/olla-validate/areas/observability.md:
- Around line 13-15: Update the endpoint name list used by the GET
/internal/status/endpoints check to match the SKILL.md topology: replace
mock-vllm-a with mock-vllm-e, mock-litellm-b with mock-litellm-f, and
mock-llamacpp-d with mock-llamacpp-g so the expected seven endpoints
(mock-openai-a/b, mock-vllm-e, mock-litellm-f, mock-ollama-c, mock-lmstudio-d,
mock-llamacpp-g) match the actual /internal/status/endpoints output.
In @.claude/skills/olla-validate/areas/resilience.md:
- Around line 89-91: Update the "Final state assertion" text so the mock count
matches the topology: replace the phrase "all four mocks" with "all seven mocks"
(the assertion that all mocks report default behaviour via `GET
/_mock/behaviour` and that all 7 endpoints are healthy and `/internal/status`
returns 200 must check seven mocks). Ensure any nearby references in the same
paragraph to mock count are updated to "seven" so the final assertion
consistently verifies all seven mocks.
In @.claude/skills/olla-validate/SKILL.md:
- Line 178: Documentation and test instructions incorrectly reference "four
mocks" for the reset and verification steps; update all occurrences of the reset
instruction and final assertion that say "four mocks" (e.g., the POST
/_mock/reset step and the final confirmation sentence) to reflect seven mocks
(ports 19431–19437) so both the orchestration and area validation steps reset
and verify all seven mocks instead of four. Ensure the wording for POST
/_mock/reset and the final assertions explicitly state "seven mocks (ports
19431–19437)".
---
Nitpick comments:
In `@test/cmd/ollamock/streaming.go`:
- Around line 580-601: Change the parameter type for applyTTFT and applyTPS from
the generic interface{ Done() <-chan struct{} } to the concrete context.Context:
update func signatures applyTTFT(ctx context.Context, ttftMS int) and
applyTPS(ctx context.Context, tps int), import the context package, and ensure
all callers passing r.Context() keep working (no other behavior changes needed
since context.Context exposes Done()); this makes the code idiomatic and clearer
about cancellation semantics.
- Around line 18-32: In parseInferenceRequest: remove the dead/unused variables
dec and dec2 and the misleading comment about "re-parse permissively"; instead
keep a single permissive decode using json.NewDecoder(r.Body).Decode(&req) (with
the same err != io.EOF handling) and drop the DisallowUnknownFields call so the
function only performs one decode pass into inferenceRequest.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: fa233c9a-e858-43ca-97c8-30aa4102245a
📒 Files selected for processing (19)
.claude/skills/olla-validate/SKILL.md.claude/skills/olla-validate/areas/anthropic.md.claude/skills/olla-validate/areas/core-routing.md.claude/skills/olla-validate/areas/limits-failures.md.claude/skills/olla-validate/areas/observability.md.claude/skills/olla-validate/areas/openai-api.md.claude/skills/olla-validate/areas/resilience.mdCLAUDE.mddocs/content/development/testing.mddocs/content/development/validation.mddocs/mkdocs.ymltest/cmd/ollamock/README.mdtest/cmd/ollamock/behaviour.gotest/cmd/ollamock/handlers.gotest/cmd/ollamock/main.gotest/cmd/ollamock/ollamock_test.gotest/cmd/ollamock/streaming.gotest/validate/config.validate.limits.yamltest/validate/config.validate.yaml
There was a problem hiding this comment.
🧹 Nitpick comments (1)
.claude/skills/olla-validate/SKILL.md (1)
188-190: 💤 Low valueAwkward phrasing: "return to healthy".
"return to healthy" is grammatically awkward. Consider "return to health" or "become healthy again".
✏️ Suggested rewrite
-After wave 2: reset all mock behaviours, re-confirm all endpoints return to -healthy within 60s (this is itself a recovery assertion - record it; health +After wave 2: reset all mock behaviours, re-confirm all endpoints return to +health within 60s (this is itself a recovery assertion - record it; health probes tick globally every 30s regardless of per-endpoint check_interval).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.claude/skills/olla-validate/SKILL.md around lines 188 - 190, The phrase "return to healthy" in the sentence starting with "After wave 2: reset all mock behaviours, re-confirm all endpoints return to healthy within 60s..." is awkward; change it to a clearer phrasing such as "return to health", "become healthy again", or "are healthy again" and update that sentence in SKILL.md (the line beginning "After wave 2: reset all mock behaviours...") so it reads smoothly while preserving the timing/health-probe details.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In @.claude/skills/olla-validate/SKILL.md:
- Around line 188-190: The phrase "return to healthy" in the sentence starting
with "After wave 2: reset all mock behaviours, re-confirm all endpoints return
to healthy within 60s..." is awkward; change it to a clearer phrasing such as
"return to health", "become healthy again", or "are healthy again" and update
that sentence in SKILL.md (the line beginning "After wave 2: reset all mock
behaviours...") so it reads smoothly while preserving the timing/health-probe
details.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 71acfc22-8d7a-4ec5-b1ea-b787ba13b64b
📒 Files selected for processing (14)
.claude/skills/olla-validate/SKILL.md.claude/skills/olla-validate/areas/anthropic.md.claude/skills/olla-validate/areas/core-routing.md.claude/skills/olla-validate/areas/limits-failures.md.claude/skills/olla-validate/areas/observability.md.claude/skills/olla-validate/areas/openai-api.md.claude/skills/olla-validate/areas/resilience.mdAGENTS.mdCLAUDE.mdtest/cmd/ollamock/README.mdtest/cmd/ollamock/behaviour.gotest/cmd/ollamock/handlers.gotest/cmd/ollamock/ollamock_test.gotest/cmd/ollamock/streaming.go
✅ Files skipped from review due to trivial changes (6)
- AGENTS.md
- .claude/skills/olla-validate/areas/anthropic.md
- .claude/skills/olla-validate/areas/observability.md
- .claude/skills/olla-validate/areas/core-routing.md
- test/cmd/ollamock/README.md
- .claude/skills/olla-validate/areas/resilience.md
🚧 Files skipped from review as they are similar to previous changes (4)
- .claude/skills/olla-validate/areas/openai-api.md
- test/cmd/ollamock/ollamock_test.go
- test/cmd/ollamock/behaviour.go
- test/cmd/ollamock/streaming.go
Our nightly / regression scripts used in tensorfoundry.io products for testing ported to Olla - and tweaked.
/olla-validate --quick- 5–10 min gate after major changes/olla-validate --nightly- multi-hour pre-release gate (chaos, soak, Sherpa pass, forced-translation pass, benchmarks)Summary by CodeRabbit
Documentation
New Features
Tests
Chores