feat(ingest): capture request headers into event context #1343

Open
vklimontovich wants to merge 1 commit into newjitsu from feat/event-context-headers

@vklimontovich
Contributor

What

Adds context.headers to ingested events so destinations can see the raw HTTP request headers (accept, content-type, sec-fetch-*, sec-ch-ua*, …) and distinguish real browser traffic from bots/agents. Today only context.userAgent is available.

Behavior

  • Browser endpoint: context.headers is derived only from the actual request; the body can't redefine them (a browser can't read its own request headers anyway, and shouldn't be able to spoof them).
  • S2S endpoint: captures the forwarding request's headers, but lets the caller override allow-listed headers via the event body, so a server-side SDK can forward the original device's headers. Allow-list: accept, accept-language, accept-encoding, content-type, user-agent, referer, dnt, sec-fetch-*, sec-ch-ua*.
  • cookie / authorization are stripped and the write key is masked before headers reach context (which is forwarded to destinations). Keys are lower-cased. The internal IngestMessage.HttpHeaders (full set) is unchanged.
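
The rules above can be sketched in a few lines of Go. This is a minimal illustration, not the code in router.go: the function name contextHeaders, the "x-write-key" header name, and the "***" mask are assumptions made for the example, and it works on a plain http.Header instead of a gin context.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// dropped lists the sensitive headers that never reach context.headers.
var dropped = map[string]bool{"cookie": true, "authorization": true}

// overridable reports whether a lower-cased header name is on the S2S
// allow-list given in the description.
func overridable(name string) bool {
	switch name {
	case "accept", "accept-language", "accept-encoding", "content-type",
		"user-agent", "referer", "dnt":
		return true
	}
	return strings.HasPrefix(name, "sec-fetch-") || strings.HasPrefix(name, "sec-ch-ua")
}

// contextHeaders lower-cases header names, drops sensitive headers, masks the
// write key, and (for S2S only) lets allow-listed body headers win.
// ASSUMPTION: the write key travels in "x-write-key" and is masked as "***".
func contextHeaders(req http.Header, body map[string]string, s2s bool) map[string]string {
	const writeKeyHeader = "x-write-key"
	out := make(map[string]string, len(req))
	for name, values := range req {
		key := strings.ToLower(name)
		if dropped[key] || len(values) == 0 {
			continue
		}
		val := strings.Join(values, ", ")
		if key == writeKeyHeader {
			val = "***"
		}
		out[key] = val
	}
	if s2s {
		for name, val := range body {
			if key := strings.ToLower(name); overridable(key) {
				out[key] = val
			}
		}
	}
	return out
}

func main() {
	req := http.Header{}
	req.Set("User-Agent", "node-sdk/1.0")
	req.Set("Cookie", "session=abc")
	req.Set("X-Write-Key", "key:secret")
	body := map[string]string{"User-Agent": "Mozilla/5.0", "Cookie": "spoofed=1"}

	fmt.Println(contextHeaders(req, body, true))  // S2S: body user-agent wins
	fmt.Println(contextHeaders(req, body, false)) // browser: body is ignored
}
```

In both calls the cookie is absent and the write key is masked; only the S2S call takes user-agent from the body, and the body's cookie is ignored because it is not on the allow-list.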

Types

  • AnalyticsContext.headers?: Record<string, string> (Jitsu extension — Segment's spec has no raw-headers field).
  • Optional RuntimeFacade.headers() so a Node integration can supply the original device's headers; @jitsu/js wires it into the built context (no-op in the browser).

Notes for bot detection

The sec-fetch-* / sec-ch-ua* set is the strongest tell: raw HTTP clients (curl, python-requests, most non-browser agents) don't send these headers by default, while real and headless browsers do.
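
As a rough sketch of how a destination might use this signal, the check below looks for any sec-fetch-* or sec-ch-ua* key in context.headers. The function name hasBrowserHints is hypothetical, and the check is a heuristic only: a client can add these headers by hand, so their presence does not prove a real browser.

```go
package main

import (
	"fmt"
	"strings"
)

// hasBrowserHints reports whether the (already lower-cased) header map
// contains at least one sec-fetch-* or sec-ch-ua* key.
func hasBrowserHints(headers map[string]string) bool {
	for name := range headers {
		if strings.HasPrefix(name, "sec-fetch-") || strings.HasPrefix(name, "sec-ch-ua") {
			return true
		}
	}
	return false
}

func main() {
	curl := map[string]string{"user-agent": "curl/8.5.0", "accept": "*/*"}
	browser := map[string]string{
		"user-agent":     "Mozilla/5.0",
		"sec-fetch-mode": "cors",
		"sec-ch-ua":      `"Chromium";v="124"`,
	}
	fmt.Println(hasBrowserHints(curl), hasBrowserHints(browser)) // false true
}
```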

🤖 Generated with Claude Code

Add context.headers to the event so destinations can see the raw HTTP
headers (accept, content-type, sec-fetch-*, sec-ch-ua*, ...) and tell
real browser traffic from bots/agents.

- Browser endpoint derives context.headers from the request only; the
  body can't redefine them (a browser can't read its own headers anyway).
- S2S endpoint captures the forwarding request's headers but lets the
  caller override allow-listed headers via the body to forward the
  original device's headers.
- cookie/authorization are stripped and the write key is masked, so
  secrets don't leak to destinations.

Types: add AnalyticsContext.headers and an optional RuntimeFacade.headers()
so a Node integration can supply the original headers; jitsu-js wires it
into the built context (no-op in the browser).
@vklimontovich vklimontovich requested a review from absorbb June 3, 2026 21:07

@jitsu-code-review jitsu-code-review Bot left a comment

Reviewed the changes in bulker/ingest/router.go, libs/jitsu-js/src/analytics-plugin.ts, and types/protocols/analytics.d.ts.

The overall direction makes sense (capturing request headers into context.headers and masking sensitive values), but I found one correctness/security edge case in the Go implementation and left an inline comment with details.

Comment thread bulker/ingest/router.go
// event.context.headers. Internal (x-jitsu-*, x-vercel*) and sensitive (cookie,
// authorization) headers are dropped, and the write key is masked. Allow-listed headers
// already present in the event body (bodyHeaders) win over the request headers.
func buildContextHeaders(c *gin.Context, bodyHeaders any) map[string]string {

buildContextHeaders returns a plain map[string]string, and that gets written into event.context.headers. In the browser path we call types.FilterEvent(ev) after this, but FilterEvent only recurses through types.Json/[]any, so it won’t sanitize keys inside this map. That means a crafted header like __sql_type_* can survive and become a SQL type hint downstream after JSON reparse. Can we return types.Json here (or otherwise run equivalent filtering) so header keys go through the same sanitization path?
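
The concern can be reproduced in a stand-alone sketch. Here filterReserved is a hypothetical stand-in for types.FilterEvent (not the real implementation) that only descends into map[string]any and []any, and the "__sql_type" prefix is taken from the comment above. A nested map[string]string falls through the type switch, so its reserved key survives; toGeneric shows one possible way to make the keys visible to such a walker.

```go
package main

import (
	"fmt"
	"strings"
)

// reservedPrefix comes from the review comment; the exact downstream rule is an assumption.
const reservedPrefix = "__sql_type"

// filterReserved is a hypothetical recursive filter. It only visits
// map[string]any and []any, mirroring the limitation described above.
func filterReserved(v any) {
	switch t := v.(type) {
	case map[string]any:
		for k, child := range t {
			if strings.HasPrefix(k, reservedPrefix) {
				delete(t, k) // deleting during range is safe in Go
				continue
			}
			filterReserved(child)
		}
	case []any:
		for _, child := range t {
			filterReserved(child)
		}
	}
}

// toGeneric copies a string map into map[string]any so the walker sees its keys.
func toGeneric(h map[string]string) map[string]any {
	out := make(map[string]any, len(h))
	for k, v := range h {
		out[k] = v
	}
	return out
}

func main() {
	headers := map[string]string{"accept": "*/*", "__sql_type_accept": "text"}

	plain := map[string]any{"context": map[string]any{"headers": headers}}
	filterReserved(plain)
	fmt.Println(len(headers)) // 2: the map[string]string was never visited

	generic := map[string]any{"context": map[string]any{"headers": toGeneric(headers)}}
	filterReserved(generic)
	fmt.Println(generic["context"].(map[string]any)["headers"]) // map[accept:*/*]
}
```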
