fix: multipart parsing for HTTP/2 support in ClamAV plugin#147
fix: multipart parsing for HTTP/2 support in ClamAV plugin#147rayshoo wants to merge 1 commit intobunkerity:mainfrom
Conversation
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
ClamAV pluginSecurity impact: Fixes HTTP/2 multipart file upload handling to prevent scans being bypassed. Previous implementation used streaming multipart parsing incompatible with HTTP/2 request structure; redesigned to manually parse multipart sections and scan each file part via separate TCP connections using the INSTREAM protocol with 4-byte big-endian length framing. Plugin version: Bumped to 1.9 in Version drift: ClamAV 1.9 is NOT recorded in Code changes:
Settings: No schema changes; configuration remains: Testing: Functional test suite ( Documentation: README.md provides setup instructions for Docker, Swarm, and Kubernetes deployments with environment variable examples. TODO items noted (clustered mode testing, custom ClamAV configuration, Linux integration documentation). Note: Full implementation assisted by ChatGPT and Claude AI per author disclosure. WalkthroughThe scanning mechanism has been redesigned to replace streaming multipart upload handling with a buffered approach: extract the multipart boundary from the Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 11
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@clamav/clamav.lua`:
- Around line 152-171: The extract_filename function (clamav:extract_filename)
should sanitize the extracted filename before returning it for logging:
strip/control characters (e.g., remove characters with byte < 32 and CR/LF),
trim surrounding whitespace, enforce a maximum length (e.g., 255 chars) and, if
resulting name is empty, return nil or a safe placeholder; also ensure the
method does not trigger an unused-self warning by explicitly acknowledging self
(either use function clamav.extract_filename(self, headers) and reference local
_ = self or change the signature to (_, headers)). Update the function to
perform these sanitisation steps and return the cleaned filename for use in
logs.
- Around line 173-186: In clamav:api replace the unused variable from the
has_variable call by either discarding it or using it: change the call site
where you currently do "local enabled, err = has_variable(...)" to "local
enabled, _ = has_variable(...)" to avoid the unused-variable warning, or
alternatively capture and log the error (e.g. use process or self logger to log
err) before returning; target the has_variable call inside the clamav:api
function to implement the change.
- Line 235: The loop uses an unused index variable `i` in the `for i, part in
ipairs(parts) do` iteration, which triggers static-analysis warnings; replace it
with the conventional `_` placeholder (i.e., `for _, part in ipairs(parts) do`)
inside the same scope (in the file clamav/clamav.lua where that loop appears)
and confirm there are no references to `i` within the loop body—if any exist,
either restore a meaningful name or use the actual name where needed instead of
`_`.
- Around line 23-31: In clamav:socket(), the CLAMAV_TIMEOUT fallback (used in
sock:settimeout) is 5000ms but plugin.json default is 1000ms; change the call to
apply tonumber only to self.variables["CLAMAV_TIMEOUT"] so the fallback of 1000
is used (e.g., use tonumber(self.variables["CLAMAV_TIMEOUT"]) or 1000) and pass
that to sock:settimeout to align the default timeout; update the CLAMAV_TIMEOUT
reference accordingly in clamav:socket().
- Around line 275-277: The log at self.logger:log(ERR, "[clamav] ClamAV result
for " .. part.filename .. ": " .. tostring(result)) is using the ERR level for
normal/clean scan results; change it to use the NOTICE level (which is declared
but unused) or ngx.INFO for successful scans, and reserve ERR only for malware
detections or explicit errors (i.e., update the logging call in the ClamAV
result handling to use NOTICE/ngx.INFO for clean results and log at ERR only
when result indicates malware or an error).
- Around line 116-147: The loop declares an unused variable `i` in `for i,
section in ipairs(sections) do`; change it to use the conventional
unused-variable name by replacing `i` with `_` (i.e., `for _, section in
ipairs(sections) do`) so static analysis no longer flags the unused binding,
leaving the rest of the loop body (checks, header parsing, extract_filename
call, and table.insert into `parts`) unchanged.
- Around line 267-270: Check the return value of sock:send when sending the
stream termination marker in the function that calls
sock:send(string.char(0,0,0,0)); if send returns nil (failure, err) handle it by
closing the socket (sock:close()), propagate or return an error instead of
proceeding to sock:receive, and only call sock:receive("*l") when send
succeeded; update any callers that rely on the local result variable to handle
the error path accordingly.
- Around line 48-67: The extract_boundary function has an unreachable fallback
pattern and triggers an unused `self` linter warning; replace the two-step match
with a single pattern that handles both quoted and unquoted values (e.g. match
'boundary=%s*"?([^";]+)"?' to capture the boundary without quotes) and then
return "--" .. boundary when found, and suppress the unused `self` warning by
either renaming the first parameter to _ (function clamav:extract_boundary(_)
...) or adding a luacheck directive (--luacheck: ignore 212/self) at the top of
the function.
- Around line 69-113: In function parse_multipart: remove the redundant initial
assignment current_pos = 1 (it's immediately overwritten after finding
first_boundary_pos) and introduce a named constant for the magic number used to
filter tiny sections (replace the literal 10 with e.g. local MIN_SECTION_LEN =
10 declared near the top of the function and use MIN_SECTION_LEN in the two
checks), optionally add a short comment documenting why that minimum length
exists; reference parse_multipart, current_pos, first_boundary_pos, and the two
section length checks.
- Around line 10-12: Remove the unused local constants NOTICE and HTTP_FORBIDDEN
from the top of clamav.lua, or if they were intended to be used, replace the
current ngx.ERR logging used for scan result messages (the scan-result logging
site currently using ERR) with ngx.log(NOTICE, ...) and replace any hard-coded
status usages with the HTTP_FORBIDDEN constant where appropriate; ensure only
the symbols NOTICE and HTTP_FORBIDDEN that are actually referenced elsewhere
(e.g., scan-result logging and forbidden responses) remain declared, otherwise
delete their declarations.
- Around line 212-229: The code buffers the entire request body into memory (the
block that reads ngx.req.get_body_data() and falls back to
ngx.req.get_body_file()) without size validation; add a configurable
CLAMAV_MAX_BODY_SIZE (default 50 * 1024 * 1024) and after the body is populated
but before calling parse_multipart() check the byte length (use `#body` or
equivalent) and return an error via self:ret(true, "Request body too large") and
log via self.logger:log(ERR, ...) when the body exceeds the limit; ensure the
new config key is read where the plugin config is loaded and referenced in this
function so deployments can override the default.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: d9160d39-9a31-486f-a2d7-402d5d82030a
📒 Files selected for processing (1)
clamav/clamav.lua
📜 Review details
🧰 Additional context used
📓 Path-based instructions (1)
**/*.lua
⚙️ CodeRabbit configuration file
**/*.lua: Lua code runs on OpenResty inside the BunkerWeb nginx container and sits on the request hot path:
- Every plugin subclasses
bunkerweb.pluginviamiddleclass:local <name> = class("<name>", plugin)andplugin.initialize(self, "<id>", ctx)ininitialize. Hook methods (init_worker,access,log,preread, ...) must returnself:ret(ok_bool, msg, [http_status]). To deny a request, returnself:ret(true, "reason", utils.get_deny_status()).- Gate expensive work at
init_workerwithutils.has_variable("USE_<PLUGIN>", "yes")and skip whenself.is_loadingis true, matching the pattern used acrossclamav.lua,coraza.lua, andvirustotal.lua.- Use local variables and local module tables; avoid globals. Cache
ngx.var.*andngx.req.*values in locals instead of re-reading them repeatedly.- Precompile regular expressions in module-level locals; never compile inside request loops. For
ngx.re.match/find/gmatch/sub, pass the option string"jo"(jenables PCRE JIT,ocompiles the pattern once and caches it), anchor patterns with^...$when a full match is intended, and cap input length before matching to prevent ReDoS.- Validate and sanitise all request-derived input. Never evaluate request-derived code via
load,loadstring, or similar mechanisms.- Use
ngx.socketfor raw TCP (see the ClamAV INSTREAM pattern inclamav.lua) andresty.httpfor HTTP upstreams (seevirustotal.luaandcoraza.lua). Preferresty.uploadfor streaming request bodies —clamav.luais the reference.- Cache upstream scan results by a strong digest of the body (SHA-512 in
virustotal.lua) rather than by filename or weaker hashes; mind cache key length.- Shared-dictionary read-modify-write sequences are race-prone; prefer atomic operations such as
incror explicit locking where correctness matters.- Never log request bodies, cookies, bearer tokens, webhook secrets, or API keys.
- Use
pcallor explicit error...
Files:
clamav/clamav.lua
🪛 Luacheck (1.2.0)
clamav/clamav.lua
[warning] 10-10: unused variable 'NOTICE'
(W211)
[warning] 12-12: unused variable 'HTTP_FORBIDDEN'
(W211)
[warning] 49-49: unused argument 'self'
(W212)
[warning] 79-79: value assigned to variable 'current_pos' is unused
(W311)
[warning] 116-116: unused loop variable 'i'
(W213)
[warning] 153-153: unused argument 'self'
(W212)
[warning] 175-175: unused variable 'err'
(W211)
[warning] 235-235: unused loop variable 'i'
(W213)
🔇 Additional comments (2)
clamav/clamav.lua (2)
33-46: LGTM!The
ping()function correctly handles socket lifecycle and error conditions.
279-286: Clarify deny status usage and verify return semantics.Line 281 calls
get_deny_status()to return the configured deny HTTP status (typically 403). Theself:ret(true, ...)with a status code and extra data looks correct per the plugin pattern. However, the fourth parameter isnilexplicitly—confirm this aligns with thebunkerweb.plugincontract.
| local NOTICE = ngx.NOTICE | ||
| local HTTP_INTERNAL_SERVER_ERROR = ngx.HTTP_INTERNAL_SERVER_ERROR | ||
| local HTTP_FORBIDDEN = ngx.HTTP_FORBIDDEN |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Remove unused variables NOTICE and HTTP_FORBIDDEN.
Static analysis correctly flags these as unused. If NOTICE was intended for logging scan results (see line 276), it should be used there instead of ERR.
🧹 Proposed fix
-local NOTICE = ngx.NOTICE
local HTTP_INTERNAL_SERVER_ERROR = ngx.HTTP_INTERNAL_SERVER_ERROR
-local HTTP_FORBIDDEN = ngx.HTTP_FORBIDDEN
local HTTP_OK = ngx.HTTP_OK📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| local NOTICE = ngx.NOTICE | |
| local HTTP_INTERNAL_SERVER_ERROR = ngx.HTTP_INTERNAL_SERVER_ERROR | |
| local HTTP_FORBIDDEN = ngx.HTTP_FORBIDDEN | |
| local HTTP_INTERNAL_SERVER_ERROR = ngx.HTTP_INTERNAL_SERVER_ERROR | |
| local HTTP_OK = ngx.HTTP_OK |
🧰 Tools
🪛 Luacheck (1.2.0)
[warning] 10-10: unused variable 'NOTICE'
(W211)
[warning] 12-12: unused variable 'HTTP_FORBIDDEN'
(W211)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@clamav/clamav.lua` around lines 10 - 12, Remove the unused local constants
NOTICE and HTTP_FORBIDDEN from the top of clamav.lua, or if they were intended
to be used, replace the current ngx.ERR logging used for scan result messages
(the scan-result logging site currently using ERR) with ngx.log(NOTICE, ...) and
replace any hard-coded status usages with the HTTP_FORBIDDEN constant where
appropriate; ensure only the symbols NOTICE and HTTP_FORBIDDEN that are actually
referenced elsewhere (e.g., scan-result logging and forbidden responses) remain
declared, otherwise delete their declarations.
| function clamav:socket() | ||
| local sock = ngx.socket.tcp() | ||
| sock:settimeout(tonumber(self.variables["CLAMAV_TIMEOUT"] or 5000)) | ||
| local ok, err = sock:connect(self.variables["CLAMAV_HOST"], tonumber(self.variables["CLAMAV_PORT"])) | ||
| if not ok then | ||
| return nil, err | ||
| end | ||
| return sock | ||
| end |
There was a problem hiding this comment.
Timeout fallback inconsistent with plugin.json default.
Line 25 falls back to 5000ms when CLAMAV_TIMEOUT is unset, but plugin.json specifies the default as 1000ms. Align these values.
🔧 Proposed fix
function clamav:socket()
local sock = ngx.socket.tcp()
- sock:settimeout(tonumber(self.variables["CLAMAV_TIMEOUT"] or 5000))
+ sock:settimeout(tonumber(self.variables["CLAMAV_TIMEOUT"]) or 1000)
local ok, err = sock:connect(self.variables["CLAMAV_HOST"], tonumber(self.variables["CLAMAV_PORT"]))Note: tonumber() should wrap only the variable, not the fallback, so that tonumber(nil) returns nil and triggers the or 1000 fallback.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| function clamav:socket() | |
| local sock = ngx.socket.tcp() | |
| sock:settimeout(tonumber(self.variables["CLAMAV_TIMEOUT"] or 5000)) | |
| local ok, err = sock:connect(self.variables["CLAMAV_HOST"], tonumber(self.variables["CLAMAV_PORT"])) | |
| if not ok then | |
| return nil, err | |
| end | |
| return sock | |
| end | |
| function clamav:socket() | |
| local sock = ngx.socket.tcp() | |
| sock:settimeout(tonumber(self.variables["CLAMAV_TIMEOUT"]) or 1000) | |
| local ok, err = sock:connect(self.variables["CLAMAV_HOST"], tonumber(self.variables["CLAMAV_PORT"])) | |
| if not ok then | |
| return nil, err | |
| end | |
| return sock | |
| end |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@clamav/clamav.lua` around lines 23 - 31, In clamav:socket(), the
CLAMAV_TIMEOUT fallback (used in sock:settimeout) is 5000ms but plugin.json
default is 1000ms; change the call to apply tonumber only to
self.variables["CLAMAV_TIMEOUT"] so the fallback of 1000 is used (e.g., use
tonumber(self.variables["CLAMAV_TIMEOUT"]) or 1000) and pass that to
sock:settimeout to align the default timeout; update the CLAMAV_TIMEOUT
reference accordingly in clamav:socket().
| -- Improved boundary extraction function with better quote handling | ||
| function clamav:extract_boundary(content_type) | ||
| if not content_type then | ||
| return nil | ||
| end | ||
|
|
||
| -- Handle both boundary=value and boundary="value" formats | ||
| local boundary = content_type:match('boundary=([^;%s]+)') | ||
| if not boundary then | ||
| boundary = content_type:match('boundary="([^"]+)"') | ||
| end | ||
|
|
||
| if boundary then | ||
| -- Remove quotes if present | ||
| boundary = boundary:gsub('^"', ''):gsub('"$', '') | ||
| return "--" .. boundary | ||
| end | ||
|
|
||
| return nil | ||
| end |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Simplify boundary extraction; suppress unused self warning.
The fallback pattern on line 57 is unreachable for standard quoted boundaries because line 55's pattern ([^;%s]+) captures "value" including quotes (whitespace/semicolon terminates, not quotes). Line 62 then strips them.
Consider a single, clearer pattern. For the static analysis warning, use the conventional _ prefix or --luacheck: ignore 212/self.
♻️ Proposed simplification
+-- luacheck: push ignore 212/self
function clamav:extract_boundary(content_type)
if not content_type then
return nil
end
-
- -- Handle both boundary=value and boundary="value" formats
- local boundary = content_type:match('boundary=([^;%s]+)')
- if not boundary then
- boundary = content_type:match('boundary="([^"]+)"')
- end
-
- if boundary then
- -- Remove quotes if present
- boundary = boundary:gsub('^"', ''):gsub('"$', '')
+
+ -- Capture boundary value, with or without quotes
+ local boundary = content_type:match('boundary="([^"]+)"')
+ or content_type:match('boundary=([^;%s]+)')
+ if boundary then
return "--" .. boundary
end
-
return nil
end
+-- luacheck: pop🧰 Tools
🪛 Luacheck (1.2.0)
[warning] 49-49: unused argument 'self'
(W212)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@clamav/clamav.lua` around lines 48 - 67, The extract_boundary function has an
unreachable fallback pattern and triggers an unused `self` linter warning;
replace the two-step match with a single pattern that handles both quoted and
unquoted values (e.g. match 'boundary=%s*"?([^";]+)"?' to capture the boundary
without quotes) and then return "--" .. boundary when found, and suppress the
unused `self` warning by either renaming the first parameter to _ (function
clamav:extract_boundary(_) ...) or adding a luacheck directive (--luacheck:
ignore 212/self) at the top of the function.
| -- Enhanced multipart parsing function with proper HTTP/2 support | ||
| function clamav:parse_multipart(body, boundary) | ||
| if not body or not boundary then | ||
| return {} | ||
| end | ||
|
|
||
| local parts = {} | ||
|
|
||
| -- Split body by boundary markers | ||
| local sections = {} | ||
| local current_pos = 1 | ||
|
|
||
| -- Find first boundary | ||
| local first_boundary_pos = body:find(boundary, 1, true) | ||
| if not first_boundary_pos then | ||
| return {} | ||
| end | ||
|
|
||
| current_pos = first_boundary_pos + #boundary | ||
|
|
||
| while true do | ||
| -- Find next boundary | ||
| local next_boundary_pos = body:find(boundary, current_pos, true) | ||
| if not next_boundary_pos then | ||
| -- Last section | ||
| local section = body:sub(current_pos) | ||
| if section and #section > 10 then -- Minimum length check | ||
| table.insert(sections, section) | ||
| end | ||
| break | ||
| end | ||
|
|
||
| -- Extract section from current position to next boundary | ||
| local section = body:sub(current_pos, next_boundary_pos - 1) | ||
| if section and #section > 10 then -- Minimum length check | ||
| table.insert(sections, section) | ||
| end | ||
|
|
||
| current_pos = next_boundary_pos + #boundary | ||
|
|
||
| -- Check for end boundary (ends with --) | ||
| if body:sub(current_pos, current_pos + 1) == "--" then | ||
| break | ||
| end | ||
| end |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Address static analysis warnings and document magic constant.
- Line 79:
current_pos = 1is immediately overwritten at line 87. Remove the initial assignment. - Lines 95, 103: The magic number
10(minimum section length) should be a named constant or documented.
🔧 Proposed fix
+local MIN_SECTION_LEN = 10 -- Minimum bytes for a valid multipart section (headers + CRLF)
+
function clamav:parse_multipart(body, boundary)
if not body or not boundary then
return {}
end
local parts = {}
-
- -- Split body by boundary markers
local sections = {}
- local current_pos = 1
-- Find first boundary
local first_boundary_pos = body:find(boundary, 1, true)
if not first_boundary_pos then
return {}
end
- current_pos = first_boundary_pos + `#boundary`
+ local current_pos = first_boundary_pos + `#boundary`
while true do
local next_boundary_pos = body:find(boundary, current_pos, true)
if not next_boundary_pos then
local section = body:sub(current_pos)
- if section and `#section` > 10 then
+ if section and `#section` > MIN_SECTION_LEN then
table.insert(sections, section)
end
break
end
local section = body:sub(current_pos, next_boundary_pos - 1)
- if section and `#section` > 10 then
+ if section and `#section` > MIN_SECTION_LEN then
table.insert(sections, section)
end🧰 Tools
🪛 Luacheck (1.2.0)
[warning] 79-79: value assigned to variable 'current_pos' is unused
(W311)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@clamav/clamav.lua` around lines 69 - 113, In function parse_multipart: remove
the redundant initial assignment current_pos = 1 (it's immediately overwritten
after finding first_boundary_pos) and introduce a named constant for the magic
number used to filter tiny sections (replace the literal 10 with e.g. local
MIN_SECTION_LEN = 10 declared near the top of the function and use
MIN_SECTION_LEN in the two checks), optionally add a short comment documenting
why that minimum length exists; reference parse_multipart, current_pos,
first_boundary_pos, and the two section length checks.
| for i, section in ipairs(sections) do | ||
| if section and #section > 0 then | ||
| -- Remove leading whitespace and newlines | ||
| section = section:gsub("^%s*\r?\n", "") | ||
|
|
||
| -- Separate headers and body (\r\n\r\n or \n\n) | ||
| local header_end = section:find("\r\n\r\n", 1, true) | ||
| local separator_len = 4 | ||
| if not header_end then | ||
| header_end = section:find("\n\n", 1, true) | ||
| separator_len = 2 | ||
| end | ||
|
|
||
| if header_end then | ||
| local headers = section:sub(1, header_end - 1) | ||
| local data = section:sub(header_end + separator_len) | ||
|
|
||
| -- Remove trailing \r\n | ||
| data = data:gsub("\r\n$", ""):gsub("\n$", "") | ||
|
|
||
| -- Extract filename from Content-Disposition header | ||
| local filename = self:extract_filename(headers) | ||
| if filename and #data > 0 then | ||
| table.insert(parts, { | ||
| headers = headers, | ||
| data = data, | ||
| filename = filename | ||
| }) | ||
| end | ||
| end | ||
| end | ||
| end |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Use _ for unused loop variable.
Static analysis flags the unused i on line 116.
🧹 Proposed fix
- for i, section in ipairs(sections) do
+ for _, section in ipairs(sections) do📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| for i, section in ipairs(sections) do | |
| if section and #section > 0 then | |
| -- Remove leading whitespace and newlines | |
| section = section:gsub("^%s*\r?\n", "") | |
| -- Separate headers and body (\r\n\r\n or \n\n) | |
| local header_end = section:find("\r\n\r\n", 1, true) | |
| local separator_len = 4 | |
| if not header_end then | |
| header_end = section:find("\n\n", 1, true) | |
| separator_len = 2 | |
| end | |
| if header_end then | |
| local headers = section:sub(1, header_end - 1) | |
| local data = section:sub(header_end + separator_len) | |
| -- Remove trailing \r\n | |
| data = data:gsub("\r\n$", ""):gsub("\n$", "") | |
| -- Extract filename from Content-Disposition header | |
| local filename = self:extract_filename(headers) | |
| if filename and #data > 0 then | |
| table.insert(parts, { | |
| headers = headers, | |
| data = data, | |
| filename = filename | |
| }) | |
| end | |
| end | |
| end | |
| end | |
| for _, section in ipairs(sections) do | |
| if section and `#section` > 0 then | |
| -- Remove leading whitespace and newlines | |
| section = section:gsub("^%s*\r?\n", "") | |
| -- Separate headers and body (\r\n\r\n or \n\n) | |
| local header_end = section:find("\r\n\r\n", 1, true) | |
| local separator_len = 4 | |
| if not header_end then | |
| header_end = section:find("\n\n", 1, true) | |
| separator_len = 2 | |
| end | |
| if header_end then | |
| local headers = section:sub(1, header_end - 1) | |
| local data = section:sub(header_end + separator_len) | |
| -- Remove trailing \r\n | |
| data = data:gsub("\r\n$", ""):gsub("\n$", "") | |
| -- Extract filename from Content-Disposition header | |
| local filename = self:extract_filename(headers) | |
| if filename and `#data` > 0 then | |
| table.insert(parts, { | |
| headers = headers, | |
| data = data, | |
| filename = filename | |
| }) | |
| end | |
| end | |
| end | |
| end |
🧰 Tools
🪛 Luacheck (1.2.0)
[warning] 116-116: unused loop variable 'i'
(W213)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@clamav/clamav.lua` around lines 116 - 147, The loop declares an unused
variable `i` in `for i, section in ipairs(sections) do`; change it to use the
conventional unused-variable name by replacing `i` with `_` (i.e., `for _,
section in ipairs(sections) do`) so static analysis no longer flags the unused
binding, leaving the rest of the loop body (checks, header parsing,
extract_filename call, and table.insert into `parts`) unchanged.
| function clamav:api() | ||
| if self.ctx.bw.uri == "/clamav/ping" and self.ctx.bw.request_method == "POST" then | ||
| local enabled, err = has_variable("USE_CLAMAV", "yes") | ||
| if not enabled then | ||
| return self:ret(true, "ClamAV plugin not enabled") | ||
| end | ||
| local ok, res = self:ping() | ||
| if not ok then | ||
| return self:ret(true, "ClamAV ping failed: " .. tostring(res), HTTP_INTERNAL_SERVER_ERROR) | ||
| end | ||
| return self:ret(true, "ClamAV ping successful", HTTP_OK) | ||
| end | ||
| return self:ret(false, "success") | ||
| end |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Address unused variable from has_variable call.
Line 175 captures err but never uses it. Either use _ or log the error.
🧹 Proposed fix
- local enabled, err = has_variable("USE_CLAMAV", "yes")
+ local enabled = has_variable("USE_CLAMAV", "yes")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| function clamav:api() | |
| if self.ctx.bw.uri == "/clamav/ping" and self.ctx.bw.request_method == "POST" then | |
| local enabled, err = has_variable("USE_CLAMAV", "yes") | |
| if not enabled then | |
| return self:ret(true, "ClamAV plugin not enabled") | |
| end | |
| local ok, res = self:ping() | |
| if not ok then | |
| return self:ret(true, "ClamAV ping failed: " .. tostring(res), HTTP_INTERNAL_SERVER_ERROR) | |
| end | |
| return self:ret(true, "ClamAV ping successful", HTTP_OK) | |
| end | |
| return self:ret(false, "success") | |
| end | |
| function clamav:api() | |
| if self.ctx.bw.uri == "/clamav/ping" and self.ctx.bw.request_method == "POST" then | |
| local enabled = has_variable("USE_CLAMAV", "yes") | |
| if not enabled then | |
| return self:ret(true, "ClamAV plugin not enabled") | |
| end | |
| local ok, res = self:ping() | |
| if not ok then | |
| return self:ret(true, "ClamAV ping failed: " .. tostring(res), HTTP_INTERNAL_SERVER_ERROR) | |
| end | |
| return self:ret(true, "ClamAV ping successful", HTTP_OK) | |
| end | |
| return self:ret(false, "success") | |
| end |
🧰 Tools
🪛 Luacheck (1.2.0)
[warning] 175-175: unused variable 'err'
(W211)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@clamav/clamav.lua` around lines 173 - 186, In clamav:api replace the unused
variable from the has_variable call by either discarding it or using it: change
the call site where you currently do "local enabled, err = has_variable(...)" to
"local enabled, _ = has_variable(...)" to avoid the unused-variable warning, or
alternatively capture and log the error (e.g. use process or self logger to log
err) before returning; target the has_variable call inside the clamav:api
function to implement the change.
| -- Read request body | ||
| ngx.req.read_body() | ||
| local body = ngx.req.get_body_data() | ||
| if not body then | ||
| local file_path = ngx.req.get_body_file() | ||
| if file_path then | ||
| local f = io.open(file_path, "rb") | ||
| if f then | ||
| body = f:read("*a") | ||
| f:close() | ||
| end | ||
| end | ||
| end | ||
|
|
||
| function clamav:add_to_cache(checksum, value) | ||
| local ok, err = self.cachestore:set("plugin_clamav_" .. checksum, value, 86400) | ||
| if not ok then | ||
| return false, err | ||
| end | ||
| return true | ||
| end | ||
| if not body then | ||
| self.logger:log(ERR, "[clamav] Failed to read request body") | ||
| return self:ret(true, "Empty request body") | ||
| end |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if nginx client_max_body_size or similar limits are configured in the test setup
rg -n 'client_max_body_size|CLAMAV.*SIZE|MAX.*BODY' --type-add 'nginx:*.conf' --type=nginx --type=lua --type=jsonRepository: bunkerity/bunkerweb-plugins
Length of output: 53
🏁 Script executed:
# Read the full access() function context around the flagged lines
cd clamav || exit 1
wc -l clamav.luaRepository: bunkerity/bunkerweb-plugins
Length of output: 85
🏁 Script executed:
# View the actual implementation at lines 212-229 with surrounding context
cd clamav || exit 1
sed -n '200,240p' clamav.lua | cat -nRepository: bunkerity/bunkerweb-plugins
Length of output: 1677
🏁 Script executed:
# Search for any size limit configurations or environment variables
cd clamav || exit 1
rg -n 'MAX.*SIZE|SIZE.*LIMIT|body.*size|size.*check' clamav.luaRepository: bunkerity/bunkerweb-plugins
Length of output: 53
🏁 Script executed:
# Check if environment variables define body limits elsewhere in the plugin
cd clamav || exit 1
rg -n 'ngx\.var\.|CLAMAV' clamav.lua | head -30Repository: bunkerity/bunkerweb-plugins
Length of output: 453
🏁 Script executed:
# Check parse_multipart implementation to see if it has size protections
cd clamav || exit 1
rg -A 30 'function.*parse_multipart|local.*parse_multipart' clamav.luaRepository: bunkerity/bunkerweb-plugins
Length of output: 1855
🏁 Script executed:
# Search entire bunkerweb-plugins directory for body size limits
rg -n 'client_max_body_size|MAX_BODY|DEFAULT.*BODY.*SIZE' --type=lua --type=conf --type=json | head -20Repository: bunkerity/bunkerweb-plugins
Length of output: 99
🏁 Script executed:
# Check the full access() function signature and any variable initialization
cd clamav || exit 1
sed -n '180,220p' clamav.lua | cat -nRepository: bunkerity/bunkerweb-plugins
Length of output: 1687
🏁 Script executed:
# Look for any nginx config templates or defaults in the plugin directory
cd clamav || exit 1
find . -name '*.conf' -o -name '*.template' -o -name 'plugin.json' -o -name 'plugin.yaml' | xargs ls -la 2>/dev/nullRepository: bunkerity/bunkerweb-plugins
Length of output: 133
🏁 Script executed:
# Check plugin.json for body size configuration schema
cd clamav || exit 1
cat plugin.jsonRepository: bunkerity/bunkerweb-plugins
Length of output: 1216
🏁 Script executed:
# Look at surrounding plugins to see how they handle body limits
find .. -name 'clamav' -o -name 'coraza' -o -name 'virustotal' | head -5Repository: bunkerity/bunkerweb-plugins
Length of output: 165
🏁 Script executed:
# Check virustotal plugin as the coding guidelines mention it—see how it handles uploads
cd ../virustotal 2>/dev/null && sed -n '1,50p' virustotal.lua | cat -nRepository: bunkerity/bunkerweb-plugins
Length of output: 53
🏁 Script executed:
# Check if there's an example of body size validation in other clamav or similar plugins
rg -n 'ngx.req.get_body_file\|io.open' --type=lua -B 5 -A 5 | head -50Repository: bunkerity/bunkerweb-plugins
Length of output: 53
🏁 Script executed:
# Verify Lua io.read() error handling semantics by checking error case
cd clamav || exit 1
cat > /tmp/test_io_read.lua << 'EOF'
-- Test Lua io behavior
local f = io.open("/nonexistent", "rb")
if f then
local data = f:read("*a")
print("Data:", data)
f:close()
else
print("File not found - f is nil")
end
EOF
lua /tmp/test_io_read.lua 2>&1 || echo "Lua test completed"Repository: bunkerity/bunkerweb-plugins
Length of output: 96
🏁 Script executed:
# Check if nginx config in BunkerWeb sets client_max_body_size defaults
find .. -name '*.conf' -o -name '*.template' | xargs grep -l 'client_max_body_size' 2>/dev/null | head -5Repository: bunkerity/bunkerweb-plugins
Length of output: 53
🏁 Script executed:
# Verify the actual behaviour: does code distinguish between empty body and read error?
cd clamav || exit 1
sed -n '213,230p' clamav.lua | cat -nRepository: bunkerity/bunkerweb-plugins
Length of output: 611
Add explicit body size validation before buffering entire request.
Reading the entire request body into memory without size checks (lines 214–222) risks memory exhaustion on large uploads. Whilst nginx client_max_body_size may provide implicit protection, this plugin should validate explicitly to be resilient across deployments.
Add a configurable CLAMAV_MAX_BODY_SIZE setting (e.g. 50 MiB default) and reject requests exceeding it after line 222, before passing the body to parse_multipart().
Note: The file handle at line 219 is closed correctly even if f:read() fails; no leak risk here.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@clamav/clamav.lua` around lines 212 - 229, The code buffers the entire
request body into memory (the block that reads ngx.req.get_body_data() and falls
back to ngx.req.get_body_file()) without size validation; add a configurable
CLAMAV_MAX_BODY_SIZE (default 50 * 1024 * 1024) and after the body is populated
but before calling parse_multipart() check the byte length (use `#body` or
equivalent) and return an error via self:ret(true, "Request body too large") and
log via self.logger:log(ERR, ...) when the body exceeds the limit; ensure the
new config key is read where the plugin config is loaded and referenced in this
function so deployments can override the default.
| local parts = self:parse_multipart(body, boundary) | ||
|
|
||
| -- Scan each file part with ClamAV | ||
| for i, part in ipairs(parts) do |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Use _ for unused loop variable.
Static analysis flags i as unused.
🧹 Proposed fix
- for i, part in ipairs(parts) do
+ for _, part in ipairs(parts) do📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| for i, part in ipairs(parts) do | |
| for _, part in ipairs(parts) do |
🧰 Tools
🪛 Luacheck (1.2.0)
[warning] 235-235: unused loop variable 'i'
(W213)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@clamav/clamav.lua` at line 235, The loop uses an unused index variable `i` in
the `for i, part in ipairs(parts) do` iteration, which triggers static-analysis
warnings; replace it with the conventional `_` placeholder (i.e., `for _, part
in ipairs(parts) do`) inside the same scope (in the file clamav/clamav.lua where
that loop appears) and confirm there are no references to `i` within the loop
body—if any exist, either restore a meaningful name or use the actual name where
needed instead of `_`.
| -- Send stream termination marker | ||
| sock:send(string.char(0, 0, 0, 0)) | ||
| local result = sock:receive("*l") | ||
| sock:close() |
There was a problem hiding this comment.
Check return value of termination marker send; handle potential nil result.
Line 268 ignores the return value of sock:send() for the stream termination marker. If this fails, the subsequent receive() may hang or return unexpected data.
🔧 Proposed fix
-- Send stream termination marker
- sock:send(string.char(0, 0, 0, 0))
+ local term_ok, term_err = sock:send(string.char(0, 0, 0, 0))
+ if not term_ok then
+ self.logger:log(ERR, "[clamav] Failed to send termination marker: " .. tostring(term_err))
+ sock:close()
+ return self:ret(false, "Failed to send termination marker")
+ end
local result = sock:receive("*l")
sock:close()📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| -- Send stream termination marker | |
| sock:send(string.char(0, 0, 0, 0)) | |
| local result = sock:receive("*l") | |
| sock:close() | |
| -- Send stream termination marker | |
| local term_ok, term_err = sock:send(string.char(0, 0, 0, 0)) | |
| if not term_ok then | |
| self.logger:log(ERR, "[clamav] Failed to send termination marker: " .. tostring(term_err)) | |
| sock:close() | |
| return self:ret(false, "Failed to send termination marker") | |
| end | |
| local result = sock:receive("*l") | |
| sock:close() |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@clamav/clamav.lua` around lines 267 - 270, Check the return value of
sock:send when sending the stream termination marker in the function that calls
sock:send(string.char(0,0,0,0)); if send returns nil (failure, err) handle it by
closing the socket (sock:close()), propagate or return an error instead of
proceeding to sock:receive, and only call sock:receive("*l") when send
succeeded; update any callers that rely on the local result variable to handle
the error path accordingly.
| else | ||
| self.logger:log(ERR, "[clamav] ClamAV result for " .. part.filename .. ": " .. tostring(result)) | ||
| end |
There was a problem hiding this comment.
Use appropriate log level for successful scan results.
Line 276 logs successful ClamAV results at ERR level. This pollutes error logs during normal operation. Use NOTICE (which was declared but unused) or ngx.INFO.
🔧 Proposed fix
If retaining NOTICE:
- self.logger:log(ERR, "[clamav] ClamAV result for " .. part.filename .. ": " .. tostring(result))
+ self.logger:log(NOTICE, "[clamav] ClamAV result for " .. part.filename .. ": " .. tostring(result))Alternatively, log at ERR only when malware is found, and use NOTICE or omit logging for clean files.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| else | |
| self.logger:log(ERR, "[clamav] ClamAV result for " .. part.filename .. ": " .. tostring(result)) | |
| end | |
| else | |
| self.logger:log(NOTICE, "[clamav] ClamAV result for " .. part.filename .. ": " .. tostring(result)) | |
| end |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@clamav/clamav.lua` around lines 275 - 277, The log at self.logger:log(ERR,
"[clamav] ClamAV result for " .. part.filename .. ": " .. tostring(result)) is
using the ERR level for normal/clean scan results; change it to use the NOTICE
level (which is declared but unused) or ngx.INFO for successful scans, and
reserve ERR only for malware detections or explicit errors (i.e., update the
logging call in the ClamAV result handling to use NOTICE/ngx.INFO for clean
results and log at ERR only when result indicates malware or an error).
Description
This PR improves the ClamAV plugin's multipart form-data parsing to properly support HTTP/2 requests. The original implementation had issues parsing multipart data in HTTP/2 environments, causing file uploads to bypass ClamAV scanning.
Changes Made
Technical Details
I modified the code to support HTTP/2 because the original multipart parsing wasn't working properly. Since I'm not very familiar with Lua code, I got help from ChatGPT and Claude AI to implement these improvements. The updated code has been thoroughly tested and works excellently.
Testing
Impact
This fix ensures that file uploads via HTTP/2 are properly scanned by ClamAV, closing a potential security gap where malicious files could bypass antivirus scanning in HTTP/2 environments.