Conversation
c6f6cd7 to
462e585
Compare
58781d3 to
43e6944
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 43e6944. Configure here.
43e6944 to
6e057b0
Compare
6e057b0 to
5a6087d
Compare
| /// | Version | client ip | user agent | | ||
| /// |---------|--------------------|------------| | ||
| /// | none | {{auto}} | always | | ||
| /// | 2 | {{auto}}, settings | settings | |
There was a problem hiding this comment.
Is there a way via settings to turn off the auto behaviour?
There was a problem hiding this comment.
No, is there a reason this would be necessary? {{auto}} is an explicit opt-in from the client. That being said, happy to drop support for {{auto}} altogether and force the use of settings.
There was a problem hiding this comment.
Do you know if any SDK even ever sent {{auto}}?
There was a problem hiding this comment.
Checked with @chargome, SDKs do send it. We could drop support with version: 2. Gonna take a look, right now code wise, it's easier to always do it, then to conditionally turn it off.
| /// | ||
| /// For example in version 2, the client ip is inferred if either the client address attribute is | ||
| /// set to `auto` or the ingest settings specify `infer_ip` to `autp`. | ||
| /// The user agent is only considered if ingest settings specify `infer_user_agent` as `auto`. |
There was a problem hiding this comment.
Do sdks default to auto? I think this will affect behaviour just want to check the default path is still extracting browser name
There was a problem hiding this comment.
Old SDKs will not send a version, which defaults to never. New SDKs (starting with an upcoming version) imo should default to version 2 and auto if it's a "frontend" SDK
| }, | ||
| "sentry.origin": {"stringValue": "auto.otlp.logs"}, | ||
| "sentry.payload_size_bytes": {"intValue": "378"}, | ||
| "sentry.payload_size_bytes": {"intValue": mock.ANY}, |
There was a problem hiding this comment.
mock.ANY? shouldn't this be a value to ensure we don't drift?
There was a problem hiding this comment.
That mirrors the vercel tests and I realized it is quite annoying to change the value in a lot of places when we have dedicated tests already making sure the counts are correct. Overall not a strong opinion, happy to bring it back if you/we think there is value in the assertion.
| // Log byte counts will change here as we go from an estimated count based on the body, to | ||
| // accurately counted bytes. | ||
| records.lenient(DataCategory::LogByte); | ||
|
|
There was a problem hiding this comment.
Do you have a feeling if this will cause a large deviation to peoples byte counts?
There was a problem hiding this comment.
Do you have a feeling if this will cause a large deviation to peoples byte counts?
Right now, it's probably somewhat close and a relatively constant ratio, but in the future this may change drastically and that was the intention. E.g. if we come up with a format where we allow SDKs to inherit common attributes into each individual log (aka some kind of compression).
5a6087d to
895c13f
Compare
895c13f to
0921b72
Compare
loewenheim
left a comment
There was a problem hiding this comment.
LGTM aside from a stylistic nit.
| skip_serializing_if = "Option::is_none", | ||
| deserialize_with = "utils::none_on_error" | ||
| )] | ||
| pub infer_user_agent: Option<InferUserAgent>, |
There was a problem hiding this comment.
Why do these have to be Option? Could they fall back to an enum variant instead?
There was a problem hiding this comment.
Option handles missing and error decently well for serialization and de-serialization.
Without the option, we either need to default to a value, in which case different protocol version no longer can have different behaviours (v1 missing is auto, v2 is never).
I considered an error variant, but it is significantly more annoying to deal with, requires a custom serialize impl (and skip_serializing_if function), with no upside (at this time).
| /// Specifically frontend SDKs may want to have the user agent and browser information automatically inferred | ||
| /// from the HTTP header provided user agent. | ||
| #[derive(Debug, Copy, Clone, Deserialize, Serialize)] | ||
| pub enum InferUserAgent { |
There was a problem hiding this comment.
Any reason InferIp and InferUserAgent are separate types? Could they be united into something like InferBehavior?
There was a problem hiding this comment.
I am not sure if these are good enough reasons tbh:
- Allow easy independent changes to the protocol (similar to duplicating the
container.rs) - Better doc strings
- Slightly stricter typing.
Happy to change, really on the edge on this.
| records.modify_by(DataCategory::LogItem, 1); | ||
| records.modify_by(DataCategory::LogByte, byte_size as isize); | ||
| let mut logs = Vec::new(); | ||
| let produce = |log: OurLog| { |
There was a problem hiding this comment.
Is produce basically the merge_into that @elramen is writing (just less generic)?
There was a problem hiding this comment.
Kinda, but I am not sure if we can use it here, this code is already in a try_modify. Gonna look into it in the follow-up.

Like #5881 but for logs. Refs: #5869
Integrations will no longer infer the user agent, for log drains this never made sense for OTLP in most cases I would argue it doesn't make sense, as the source often is a drain (e.g. cloudflare) or a collector.
Changes the signature from the integration expansion from a vec of items to a single item, this now leaves potentially a better way to deal with error cases (using actual errors), I will explore this in a follow-up/cleanup which also will address the span integrations (INGEST-883).