Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
51 changes: 46 additions & 5 deletions crates/common/src/cookies.rs
Original file line number Diff line number Diff line change
Expand Up @@ -61,13 +61,38 @@ pub fn handle_request_cookies(

/// Creates a synthetic ID cookie string.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 P3 — Missing # Examples doc section

Per CLAUDE.md line 198: "Add # Examples sections for public API functions." The doc comment is already excellent — thorough security rationale, # Panics section — but lacks # Examples. Minor, and other functions in this file don't have them either, so this is a nit rather than a blocker.

///
/// Generates a properly formatted cookie with security attributes
/// for storing the synthetic ID.
/// Generates a `Set-Cookie` header value with the following security attributes:
/// - `Secure`: transmitted over HTTPS only.
/// - `HttpOnly`: inaccessible to JavaScript (`document.cookie`), blocking XSS exfiltration.
/// Safe to set because integrations receive the synthetic ID via the `x-synthetic-id`
/// response header instead of reading it from the cookie directly.
/// - `SameSite=Lax`: sent on same-site requests and top-level cross-site navigations.
/// `Strict` is intentionally avoided — it would suppress the cookie on the first
/// request when a user arrives from an external page, breaking first-visit attribution.
/// - `Max-Age`: 1 year retention.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 — Missing # Panics doc section

Per project conventions, functions that can panic should document it. The debug_assert! macros below will panic in debug/test builds. Consider adding:

/// # Panics
///
/// Panics in debug builds if `synthetic_id` or `cookie_domain` contains
/// cookie metacharacters (`;`, `=`, `\n`, `\r`).

///
/// # Panics
///
/// Panics if `cookie_domain` in settings contains cookie metacharacters (`;`, `\n`, `\r`).
/// This indicates a configuration error and is enforced in all build profiles.
#[must_use]
pub fn create_synthetic_cookie(settings: &Settings, synthetic_id: &str) -> String {
// Sanitize synthetic_id at runtime: strip cookie metacharacters to prevent
// header injection when the ID originates from untrusted input (e.g., the
// x-synthetic-id request header or an inbound cookie).
let safe_id: String = synthetic_id
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 P1 — Silent sanitization creates header/cookie mismatch

In publisher.rs:337-338, the raw synthetic_id is set as the x-synthetic-id response header, then the sanitized value goes into the cookie:

response.set_header(HEADER_X_SYNTHETIC_ID, synthetic_id.as_str()); // raw
set_synthetic_cookie(settings, &mut response, synthetic_id.as_str()); // sanitized inside

If sanitization ever strips characters, the response header and cookie will contain different IDs — silently breaking any downstream system that correlates them.

Minimal fix for this PR: Add a log::warn! when sanitization actually modifies the value, so operators can detect the mismatch:

if safe_id.len() != synthetic_id.len() {
    log::warn!(
        "Stripped cookie metacharacters from synthetic_id (len {} -> {})",
        synthetic_id.len(),
        safe_id.len(),
    );
}

Follow-up: Move sanitization upstream to get_or_generate_synthetic_id() in synthetic.rs so both the header and cookie use the same cleaned value.

.chars()
.filter(|c| !matches!(c, ';' | '=' | '\n' | '\r'))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 P2 — Allowlist is safer than denylist for untrusted input

The synthetic_id can arrive from the untrusted x-synthetic-id request header (synthetic.rs:130-136). The current denylist strips ;, =, \n, \r but misses NUL bytes, spaces, tabs, and other control characters that could cause issues in HTTP header serialization.

Since synthetic IDs have a known format ({64-char-hex}.{6-char-alphanumeric} per synthetic.rs:110), an allowlist is strictly safer:

let safe_id: String = synthetic_id
    .chars()
    .filter(|c| c.is_ascii_alphanumeric() || matches!(c, '.' | '-' | '_'))
    .collect();

This is a defense-in-depth improvement — the denylist works for known attack vectors, but the allowlist eliminates entire categories of future risk.

.collect();
// `=` is excluded from the domain check: it only has special meaning in the
// name=value pair, not within an attribute like Domain.
assert!(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 P1 — assert! should be a #[validate] on Publisher, not a per-request panic

This is the only assert! in production (non-test) code in the entire codebase, and it runs on every request. Since cookie_domain is baked into the binary at compile time via include_bytes!(), the value is immutable at runtime — if it's bad, every single request triggers a WASM trap (ungraceful 500, no structured logging, no to_error_response() fallback).

The project already has an established pattern for config validation: Publisher derives Validate, and origin_url uses #[validate(custom(function = validate_no_trailing_slash))]. The cookie_domain field has no validation at all — not even a length check.

Suggestion: Add a custom validator on Publisher::cookie_domain and remove the assert!:

// In settings.rs, on the Publisher struct:
#[validate(length(min = 1), custom(function = validate_cookie_domain))]
pub cookie_domain: String,

// Validator:
fn validate_cookie_domain(domain: &str) -> Result<(), validator::ValidationError> {
    if domain.contains([';', '\n', '\r']) {
        return Err(validator::ValidationError::new("cookie_domain contains cookie metacharacters"));
    }
    Ok(())
}

This way bad config fails the build (from_toml_and_env() calls settings.validate()) and fails at startup (get_settings() calls settings.validate()) — never at request time.

!settings.publisher.cookie_domain.contains([';', '\n', '\r']),
"cookie_domain should not contain cookie metacharacters"
);
format!(
"{}={}; Domain={}; Path=/; Secure; SameSite=Lax; Max-Age={}",
COOKIE_SYNTHETIC_ID, synthetic_id, settings.publisher.cookie_domain, COOKIE_MAX_AGE,
"{}={}; Domain={}; Path=/; Secure; HttpOnly; SameSite=Lax; Max-Age={}",
COOKIE_SYNTHETIC_ID, safe_id, settings.publisher.cookie_domain, COOKIE_MAX_AGE,
)
}

Expand Down Expand Up @@ -174,12 +199,28 @@ mod tests {
assert_eq!(
result,
format!(
"{}=12345; Domain={}; Path=/; Secure; SameSite=Lax; Max-Age={}",
"{}=12345; Domain={}; Path=/; Secure; HttpOnly; SameSite=Lax; Max-Age={}",
COOKIE_SYNTHETIC_ID, settings.publisher.cookie_domain, COOKIE_MAX_AGE,
)
);
}

#[test]
fn test_create_synthetic_cookie_sanitizes_metacharacters_in_id() {
let settings = create_test_settings();
let result = create_synthetic_cookie(&settings, "evil;injected\r\nfoo=bar");
// Extract the value portion anchored to the cookie name constant to
// avoid false positives from metacharacters in cookie attributes.
let value = result
.strip_prefix(&format!("{}=", COOKIE_SYNTHETIC_ID))
.and_then(|s| s.split_once(';').map(|(v, _)| v))
.expect("should have cookie value portion");
assert_eq!(
value, "evilinjectedfoobar",
"should strip metacharacters and preserve safe chars"
);
}

#[test]
fn test_set_synthetic_cookie() {
let settings = create_test_settings();
Expand Down
Loading