Skip to content

feat(logs): Implement container version 2#5887

Open
Dav1dde wants to merge 7 commits intomasterfrom
dav1d/dav1d/eap-ingest-settings-logs
Open

feat(logs): Implement container version 2#5887
Dav1dde wants to merge 7 commits intomasterfrom
dav1d/dav1d/eap-ingest-settings-logs

Conversation

@Dav1dde
Copy link
Copy Markdown
Member

@Dav1dde Dav1dde commented Apr 24, 2026

Like #5881 but for logs. Refs: #5869

Integrations will no longer infer the user agent, for log drains this never made sense for OTLP in most cases I would argue it doesn't make sense, as the source often is a drain (e.g. cloudflare) or a collector.

Changes the signature from the integration expansion from a vec of items to a single item, this now leaves potentially a better way to deal with error cases (using actual errors), I will explore this in a follow-up/cleanup which also will address the span integrations (INGEST-883).

@Dav1dde Dav1dde requested a review from a team as a code owner April 24, 2026 09:39
@Dav1dde Dav1dde force-pushed the dav1d/dav1d/eap-ingest-settings-logs branch from c6f6cd7 to 462e585 Compare April 24, 2026 09:40
@Dav1dde Dav1dde self-assigned this Apr 24, 2026
Comment thread relay-server/src/processing/logs/integrations/mod.rs Outdated
@Dav1dde Dav1dde force-pushed the dav1d/dav1d/eap-ingest-settings-logs branch 2 times, most recently from 58781d3 to 43e6944 Compare April 24, 2026 10:17
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 43e6944. Configure here.

Comment thread relay-event-schema/src/protocol/ourlog/container.rs
@Dav1dde Dav1dde force-pushed the dav1d/dav1d/eap-ingest-settings-logs branch from 43e6944 to 6e057b0 Compare April 24, 2026 10:32
Comment thread relay-server/src/processing/logs/mod.rs Outdated
@Dav1dde Dav1dde force-pushed the dav1d/dav1d/eap-ingest-settings-logs branch from 6e057b0 to 5a6087d Compare April 24, 2026 10:38
@Dav1dde Dav1dde requested a review from k-fish April 24, 2026 11:49
Comment thread relay-event-schema/src/protocol/ourlog/container.rs Outdated
/// | Version | client ip | user agent |
/// |---------|--------------------|------------|
/// | none | {{auto}} | always |
/// | 2 | {{auto}}, settings | settings |
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way via settings to turn off the auto behaviour?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, is there a reason this would be necessary? {{auto}} is an explicit opt-in from the client. That being said, happy to drop support for {{auto}} altogether and force the use of settings.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know if any SDK even ever sent {{auto}}?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked with @chargome, SDKs do send it. We could drop support with version: 2. Gonna take a look, right now code wise, it's easier to always do it, then to conditionally turn it off.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@k-fish are you referring to the behavior when the value is missing, or to the behavior when the value is {{auto}}? The former can be disabled by settings infer_ip: never, the latter cannot be disabled other than not sending it (what @Dav1dde said).

///
/// For example in version 2, the client ip is inferred if either the client address attribute is
/// set to `auto` or the ingest settings specify `infer_ip` to `autp`.
/// The user agent is only considered if ingest settings specify `infer_user_agent` as `auto`.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do sdks default to auto? I think this will affect behaviour just want to check the default path is still extracting browser name

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Old SDKs will not send a version, which defaults to never. New SDKs (starting with an upcoming version) imo should default to version 2 and auto if it's a "frontend" SDK

},
"sentry.origin": {"stringValue": "auto.otlp.logs"},
"sentry.payload_size_bytes": {"intValue": "378"},
"sentry.payload_size_bytes": {"intValue": mock.ANY},
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mock.ANY? shouldn't this be a value to ensure we don't drift?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That mirrors the vercel tests and I realized it is quite annoying to change the value in a lot of places when we have dedicated tests already making sure the counts are correct. Overall not a strong opinion, happy to bring it back if you/we think there is value in the assertion.

Comment on lines +35 to 38
// Log byte counts will change here as we go from an estimated count based on the body, to
// accurately counted bytes.
records.lenient(DataCategory::LogByte);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a feeling if this will cause a large deviation to peoples byte counts?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a feeling if this will cause a large deviation to peoples byte counts?

Right now, it's probably somewhat close and a relatively constant ratio, but in the future this may change drastically and that was the intention. E.g. if we come up with a format where we allow SDKs to inherit common attributes into each individual log (aka some kind of compression).

@Dav1dde Dav1dde force-pushed the dav1d/dav1d/eap-ingest-settings-logs branch from 5a6087d to 895c13f Compare April 28, 2026 11:16
@Dav1dde Dav1dde force-pushed the dav1d/dav1d/eap-ingest-settings-logs branch from 895c13f to 0921b72 Compare April 28, 2026 11:17
Comment thread relay-server/src/processing/logs/process.rs
Copy link
Copy Markdown
Contributor

@loewenheim loewenheim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM aside from a stylistic nit.

Comment thread relay-server/src/processing/logs/integrations/mod.rs Outdated
Comment thread relay-server/src/processing/logs/process.rs
Comment thread relay-event-schema/src/protocol/ourlog/container.rs Outdated
skip_serializing_if = "Option::is_none",
deserialize_with = "utils::none_on_error"
)]
pub infer_user_agent: Option<InferUserAgent>,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do these have to be Option? Could they fall back to an enum variant instead?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Option handles missing and error decently well for serialization and de-serialization.

Without the option, we either need to default to a value, in which case different protocol version no longer can have different behaviours (v1 missing is auto, v2 is never).

I considered an error variant, but it is significantly more annoying to deal with, requires a custom serialize impl (and skip_serializing_if function), with no upside (at this time).

/// Specifically frontend SDKs may want to have the user agent and browser information automatically inferred
/// from the HTTP header provided user agent.
#[derive(Debug, Copy, Clone, Deserialize, Serialize)]
pub enum InferUserAgent {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason InferIp and InferUserAgent are separate types? Could they be united into something like InferBehavior?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if these are good enough reasons tbh:

  • Allow easy independent changes to the protocol (similar to duplicating the container.rs)
  • Better doc strings
  • Slightly stricter typing.

Happy to change, really on the edge on this.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two types are fine with me 👍

records.modify_by(DataCategory::LogItem, 1);
records.modify_by(DataCategory::LogByte, byte_size as isize);
let mut logs = Vec::new();
let produce = |log: OurLog| {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is produce basically the merge_into that @elramen is writing (just less generic)?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kinda, but I am not sure if we can use it here, this code is already in a try_modify. Gonna look into it in the follow-up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants