Skip to content

Add Prometheus translation strategy support#8346

Draft
zeitlinger wants to merge 10 commits intoopen-telemetry:mainfrom
zeitlinger:codex/prometheus-translation-strategy
Draft

Add Prometheus translation strategy support#8346
zeitlinger wants to merge 10 commits intoopen-telemetry:mainfrom
zeitlinger:codex/prometheus-translation-strategy

Conversation

@zeitlinger
Copy link
Copy Markdown
Member

@zeitlinger zeitlinger commented Apr 29, 2026

Blocked by prometheus/client_java#2100

Summary

  • add TranslationStrategy support to the Prometheus exporter builder and declarative config
  • upgrade Prometheus client_java to 1.6.1, which provides the released naming support this needs
  • preserve existing content negotiation behavior, while enabling non-default strategies to use OM2 name-preservation semantics
  • align underscore-escaping label-name normalization with the OTel Prometheus compatibility rules / prometheus/otlptranslator behavior for invalid characters, repeated underscores, and digit-leading labels
  • keep OTLP-to-Prometheus naming translation owned by the OTel exporter instead of relying on Prometheus Java naming helpers

Notes

  • For non-default translation strategies, this intentionally uses only OM2 name-preservation semantics.
  • No other OM2 features are enabled.
  • Content negotiation remains unchanged, so OpenMetrics responses still use the legacy OM1 content type by design.
  • The behavior change from the new Prometheus release is that metric names can now be passed through as-is for the relevant strategies.
  • prometheus/otlptranslator preserves labels normalized to __...__; Prometheus Java rejects user labels starting with __, so those labels are collapsed to a valid single-underscore form instead.
  • Prometheus Java is still used for the metrics model and exposition, but it only validates and serializes names here; it does not own the OTLP naming mangling.

References

Test plan

  • ./gradlew :exporters:prometheus:test --tests io.opentelemetry.exporter.prometheus.PrometheusHttpServerTest.fetchOpenMetrics --tests io.opentelemetry.exporter.prometheus.PrometheusHttpServerTest.fetchOpenMetrics_translationStrategyEnablesOm2 --tests io.opentelemetry.exporter.prometheus.PrometheusMetricReaderTest --tests io.opentelemetry.exporter.prometheus.Otel2PrometheusConverterTest --tests io.opentelemetry.exporter.prometheus.internal.PrometheusMetricReaderProviderTest :sdk-extensions:declarative-config:test --tests io.opentelemetry.sdk.autoconfigure.declarativeconfig.MetricReaderFactoryTest.create_PullPrometheusConfigured

Resolves #8195

Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 29, 2026

Codecov Report

❌ Patch coverage is 83.42857% with 29 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.82%. Comparing base (df8063b) to head (957e2e7).
⚠️ Report is 24 commits behind head on main.

Files with missing lines Patch % Lines
...try/exporter/prometheus/PrometheusUnitsHelper.java 66.66% 6 Missing and 7 partials ⚠️
.../exporter/prometheus/Otel2PrometheusConverter.java 88.67% 5 Missing and 7 partials ⚠️
...ometheus/internal/PrometheusComponentProvider.java 55.55% 3 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #8346      +/-   ##
============================================
+ Coverage     90.31%   90.82%   +0.50%     
- Complexity     7725     8047     +322     
============================================
  Files           850      899      +49     
  Lines         23259    24204     +945     
  Branches       2364     2429      +65     
============================================
+ Hits          21007    21983     +976     
+ Misses         1528     1470      -58     
- Partials        724      751      +27     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
@zeitlinger zeitlinger marked this pull request as ready for review April 29, 2026 12:36
@zeitlinger zeitlinger requested a review from a team as a code owner April 29, 2026 12:36
Copy link
Copy Markdown
Contributor

@psx95 psx95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also might wanna look at this relevant spec issue: open-telemetry/opentelemetry-specification#5062

new LinkedBlockingQueue<>(),
new DaemonThreadFactory("prometheus-http-server"));
}
HTTPServer.Builder httpServerBuilder = HTTPServer.builder();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Based on the "Interaction with Translation Strategy" spec, I was expecting to see some content negotiation, basically reading accept headers somewhere.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's park this until open-telemetry/opentelemetry-specification#5062 is resolved.


assertThat(response.status()).isEqualTo(HttpStatus.OK);
assertThat(response.headers().get(HttpHeaderNames.CONTENT_TYPE))
.isEqualTo("application/openmetrics-text; version=1.0.0; charset=utf-8");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the accept header here also have the escaping scheme like shown here?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's park this until open-telemetry/opentelemetry-specification#5062 is resolved.

@@ -239,6 +239,50 @@ void fetchOpenMetrics() {
+ "# EOF\n");
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there should be more tests that test behavior with other possible escaping schemes?

The spec only mentions two in the examples, but I imagine its because its meant to be non-exhaustive list.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's park this until open-telemetry/opentelemetry-specification#5062 is resolved.

Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
Copy link
Copy Markdown
Member

@jack-berg jack-berg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some small comments and ideas. Thanks!

throw new IllegalStateException("Unknown strategy: " + translationStrategy);
}

private static MetricMetadata convertMetadataEscapedWithSuffixes(MetricData metricData) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find these convert methods in the weeds and hard to follow / verify. Going to have to trust that you've done the research.

Could help improve the readability by:

  • Computing the MetricMetadata constructor params in a consistent order the same as MetricMetadata accepts (i.e. name, expositionName, originalName, help, unit)
  • Using variable names which are the same as the param names of MetricMetadata
  • Always using the same constructor overload so its explicit which things we're setting to null

For example, applied to this method it might look like:

  private static MetricMetadata convertMetadataEscapedWithSuffixes(MetricData metricData) {
    String name = convertLegacyMetricName(metricData.getName());
    name = stripReservedMetricSuffixes(name);
    String expositionBaseName = name;
    String originalName = name;
    String help = metricData.getDescription();
    Unit unit = PrometheusUnitsHelper.convertUnit(metricData.getUnit());
    
    if (unit != null && !name.endsWith(unit.toString())) {
      name = name + "_" + unit;
    }
    validateNormalizedMetricName(metricData.getName(), name);
    
    return new MetricMetadata(name, expositionBaseName, originalName, help, unit);
  }

Something to think about to make it easier for humans to reason about / maintain.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied to convertMetadataEscapedWithSuffixes (consistent ordering, single 5-arg MetricMetadata ctor) per your example. Left the other three strategy methods as-is for now since they read OK to me; happy to apply the same shape across all four if you'd prefer consistency. Commit 957e2e7.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... let me think about this more.

@zeitlinger zeitlinger marked this pull request as draft May 8, 2026 06:54
- Replace IllegalArgumentException control flow in PrometheusUnitsHelper.sanitizeUnitName with @nullable return; drop the try/catch in unitOrNull.
- Extract doConvert helper so convert is just the IAE try/catch boundary.
- Inline the getMergeKey ternary at the putOrMerge call site.
- Reorder convertMetadataEscapedWithSuffixes for readability and use the explicit 5-arg MetricMetadata constructor.

Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support Prometheus translation_strategy configuration

3 participants