feat(ai-lakera-guard): scan LLM responses (direction output/both, non-streaming + streaming) by janiussyafiq · Pull Request #13606 · apache/apisix

janiussyafiq · 2026-06-25T03:56:13Z

Description

PR-2 of ai-lakera-guard, following the input-scanning MVP (#13570). Adds response (output) scanning for non-streaming and streaming (SSE) traffic. Back-compatible: direction still defaults to input.

Schema: direction extended to input/output/both; adds response_failure_message.
Response path (lua_body_filter, the same dispatch ai-aliyun-content-moderation uses):
- Non-streaming: scans ctx.var.llm_response_text; a flagged response is replaced with a provider-compatible deny.
- Streaming: buffers the SSE response, scans the assembled completion once, then releases it verbatim (clean) or replaces it with a deny SSE terminated by [DONE] (flagged). Because the stream's 200/text/event-stream headers are already committed when buffering begins, a streamed block is delivered as the deny body — deny_code does not apply to streams.
Docs (en + zh) and tests added.

Which issue(s) this PR fixes:

Part of #13291.

Checklist

I have explained the need for this PR and the problem it solves
I have explained the changes or the new features added to this PR
I have added tests corresponding to this change
I have updated the documentation to reflect this change
I have verified that this change is backward compatible

…t and both directions; update documentation and tests

…ce test coverage for output direction

… mode and fail-open support; update documentation and tests

…sts for fail-open behavior; update documentation

…d readability and maintainability

AlinsRan · 2026-06-26T02:44:03Z

+        if not ctx.var.llm_request_done then
+            -- Withhold this chunk until end-of-stream, replacing it with an SSE
+            -- keep-alive comment. Not "" (nginx treats an empty body as nothing
+            -- to flush) and not nil (which would let the original chunk reach
+            -- the client) -- the keep-alive holds the content back while keeping
+            -- the connection open.
+            return nil, ":\n\n"
+        end
+
+        local text = ctx.var.llm_response_text
+        if not text or text == "" then
+            if conf.fail_open then
+                core.log.warn("ai-lakera-guard: streamed response ended without ",
+                              "an assembled completion (no upstream usage event?); ",
+                              "fail_open=true, releasing unscanned")
+                return nil, concat(buffer)
+            end
+            core.log.error("ai-lakera-guard: streamed response ended without ",
+                           "an assembled completion (no upstream usage event?); ",
+                           "fail_open=false, blocking response")
+            return ngx.OK, deny_message(ctx, conf, conf.response_failure_message)
+        end
+
+        local code, message = moderate_response(ctx, conf, text)
+        if code then
+            return ngx.OK, message
+        end
+
+        -- Clean: release the buffered stream verbatim, preserving SSE framing.
+        return nil, concat(buffer)


Block-mode streaming can duplicate or drop the response when a protocol converter is active (e.g. Anthropic client → OpenAI upstream via ai-proxy-multi). The same-protocol path tested here is fine.

Root cause: release is gated only on llm_request_done. In ai-providers/base.lua that flag is set before the dispatch loop, and with a converter the loop calls lua_body_filter once per converted chunk. On [DONE], anthropic-messages-to-openai-chat emits two chunks (message_delta + message_stop) — both dispatched with llm_request_done == true.

Two symptoms:

Upstream without include_usage: both terminal chunks hit return nil, concat(buffer) → response sent twice + Lakera scanned twice.

Upstream with include_usage: the converter defers the terminal events to the usage chunk (type usage, doesn't set llm_request_done), then [DONE] converts to nothing → buffer never released, clean response dropped (client gets only keep-alives).

Fix: gate scan+release behind a one-shot flag (e.g. ctx.lakera_response_released) so it runs exactly once, and ensure the buffer is flushed at EOF even when the final dispatch produced no chunk (an end-of-stream filter pass in base.lua, like the abort path this PR adds). A block-mode ai-proxy-multi regression test (Anthropic client + OpenAI streaming upstream) would lock it down.

…for blocking and clean responses; add tests for output direction

…prove fail-closed behavior; add tests for streaming scenarios

janiussyafiq added 2 commits June 24, 2026 17:03

feat(ai-lakera-guard): enhance scanning capabilities to support outpu…

4b535f9

…t and both directions; update documentation and tests

feat(ai-lakera-guard): implement multi-chunk streaming mock and enhan…

caf8500

…ce test coverage for output direction

dosubot Bot added size:XL This PR changes 500-999 lines, ignoring generated files. enhancement New feature or request labels Jun 25, 2026