feat(ai-lakera-guard): scan LLM responses (direction output/both, non-streaming + streaming)#13606
janiussyafiq wants to merge 7 commits into
…t and both directions; update documentation and tests
…ce test coverage for output direction
… mode and fail-open support; update documentation and tests
…sts for fail-open behavior; update documentation
…d readability and maintainability
if not ctx.var.llm_request_done then
    -- Withhold this chunk until end-of-stream, replacing it with an SSE
    -- keep-alive comment. Not "" (nginx treats an empty body as nothing
    -- to flush) and not nil (which would let the original chunk reach
    -- the client) -- the keep-alive holds the content back while keeping
    -- the connection open.
    return nil, ":\n\n"
end

local text = ctx.var.llm_response_text
if not text or text == "" then
    if conf.fail_open then
        core.log.warn("ai-lakera-guard: streamed response ended without ",
                      "an assembled completion (no upstream usage event?); ",
                      "fail_open=true, releasing unscanned")
        return nil, concat(buffer)
    end
    core.log.error("ai-lakera-guard: streamed response ended without ",
                   "an assembled completion (no upstream usage event?); ",
                   "fail_open=false, blocking response")
    return ngx.OK, deny_message(ctx, conf, conf.response_failure_message)
end

local code, message = moderate_response(ctx, conf, text)
if code then
    return ngx.OK, message
end

-- Clean: release the buffered stream verbatim, preserving SSE framing.
return nil, concat(buffer)
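The keep-alive returned while the stream is still in flight is an SSE comment: a line starting with a colon, which a conforming client ignores. A minimal sketch of why withholding chunks this way is invisible to the client, using a simplified reader (`parse_sse` is a hypothetical helper, not part of the plugin; it handles only `data` fields and `\n` line endings):

```lua
-- Hypothetical, simplified SSE reader used only to illustrate the point.
-- Returns the list of data payloads for every dispatched event.
local function parse_sse(stream)
    local events, data = {}, {}
    for line in stream:gmatch("(.-)\n") do
        if line == "" then
            -- Blank line ends an event; dispatch only if data was collected.
            if #data > 0 then
                events[#events + 1] = table.concat(data, "\n")
                data = {}
            end
        elseif line:sub(1, 1) == ":" then
            -- Comment line (the keep-alive): ignored, nothing is collected.
        else
            local value = line:match("^data: ?(.*)$")
            if value then
                data[#data + 1] = value
            end
        end
    end
    return events
end

-- Two withheld chunks, each replaced by a keep-alive comment ...
local withheld = ":\n\n:\n\n"
-- ... followed by the buffered stream released verbatim at end-of-stream.
local released = "data: hello\n\ndata: [DONE]\n\n"

local events = parse_sse(withheld .. released)
print(#parse_sse(withheld), #events, events[1], events[2])
```

Here the keep-alives alone produce no events, and the client sees only the two events from the released buffer.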
Block-mode streaming can duplicate or drop the response when a protocol converter is active (e.g. Anthropic client → OpenAI upstream via ai-proxy-multi). The same-protocol path tested here is fine.
Root cause: release is gated only on llm_request_done. In ai-providers/base.lua that flag is set before the dispatch loop, and with a converter the loop calls lua_body_filter once per converted chunk. On [DONE], anthropic-messages-to-openai-chat emits two chunks (message_delta + message_stop) — both dispatched with llm_request_done == true.
Two symptoms:
- Upstream without include_usage: both terminal chunks hit return nil, concat(buffer) → response sent twice + Lakera scanned twice.
- Upstream with include_usage: the converter defers the terminal events to the usage chunk (type usage, doesn't set llm_request_done), then [DONE] converts to nothing → buffer never released, clean response dropped (client gets only keep-alives).
Fix: gate scan+release behind a one-shot flag (e.g. ctx.lakera_response_released) so it runs exactly once, and ensure the buffer is flushed at EOF even when the final dispatch produced no chunk (an end-of-stream filter pass in base.lua, like the abort path this PR adds). A block-mode ai-proxy-multi regression test (Anthropic client + OpenAI streaming upstream) would lock it down.
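A rough, self-contained model of the suggested gate, for illustration only. It is not the plugin's or base.lua's code: `body_filter`, `is_flagged`, `ctx.request_done`, `ctx.buffer`, `ctx.scans`, and `ctx.blocked` are hypothetical stand-ins, and only the flag name `ctx.lakera_response_released` comes from the review. Passing post-release chunks through when the response was clean is an assumption of this sketch, not something the PR specifies.

```lua
local concat = table.concat

-- Stand-in for the Lakera call: true means the text must be blocked.
local function is_flagged(text)
    return text:find("FORBIDDEN", 1, true) ~= nil
end

-- Simplified filter: returns the bytes sent to the client for this call.
-- chunk may be nil, modelling an end-of-stream pass with no converted chunk.
local function body_filter(ctx, chunk)
    if ctx.lakera_response_released then
        -- One-shot gate: never scan or release the buffer a second time.
        -- Assumption: later chunks pass through only if the scan was clean.
        if ctx.blocked or not chunk then
            return ""
        end
        return chunk
    end
    if chunk then
        ctx.buffer[#ctx.buffer + 1] = chunk
    end
    if not ctx.request_done then
        return ":\n\n" -- keep-alive while the stream is still in flight
    end
    ctx.lakera_response_released = true
    ctx.scans = ctx.scans + 1
    local body = concat(ctx.buffer)
    ctx.blocked = is_flagged(body)
    if ctx.blocked then
        return "data: [blocked]\n\n"
    end
    return body
end

local function new_ctx()
    return { buffer = {}, scans = 0, request_done = false }
end

-- Scenario 1: the converter emits two terminal chunks after "done" is set.
local ctx = new_ctx()
local out = {}
out[#out + 1] = body_filter(ctx, "data: hello\n\n")
ctx.request_done = true
out[#out + 1] = body_filter(ctx, "event: message_delta\n\n")
out[#out + 1] = body_filter(ctx, "event: message_stop\n\n")
local sent = concat(out)

-- Scenario 2: the final dispatch produces no chunk; an explicit
-- end-of-stream pass (chunk == nil) still flushes the buffer.
local ctx2 = new_ctx()
body_filter(ctx2, "data: hi\n\n")
ctx2.request_done = true
local flushed = body_filter(ctx2, nil)

print(ctx.scans, ctx2.scans)
```

In both scenarios the scan runs once per response: the buffered content reaches the client a single time in the first, and is still released in the second even though the last dispatch carried no chunk.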
…for blocking and clean responses; add tests for output direction
…prove fail-closed behavior; add tests for streaming scenarios
ab6cc45
Description
PR-2 of ai-lakera-guard, following the input-scanning MVP (#13570). Adds response (output) scanning for non-streaming and streaming (SSE) traffic. Back-compatible: direction still defaults to input.
- direction extended to input/output/both; adds response_failure_message.
- … lua_body_filter, the same dispatch ai-aliyun-content-moderation uses):
- … ctx.var.llm_response_text; a flagged response is replaced with a provider-compatible deny.
- … [DONE] (flagged). Because the stream's 200 / text/event-stream headers are already committed when buffering begins, a streamed block is delivered as the deny body; deny_code does not apply to streams.

Which issue(s) this PR fixes:
Part of #13291.
Checklist