Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid conflicts in Micrometer description mapping #5452

Merged
merged 3 commits into from
Mar 8, 2022

Conversation

fstab
Copy link
Member

@fstab fstab commented Feb 25, 2022

Background: How PrometheusMeterRegistry Maps Descriptions

The following shows an example of metrics in Prometheus format as produced by Micrometer's PrometheusMeterRegistry:

# HELP logback_events_total Number of error level events that made it to the logs
# TYPE logback_events_total counter
logback_events_total{level="warn"} 2.0
logback_events_total{level="debug"} 0.0
logback_events_total{level="error"} 2.0
logback_events_total{level="trace"} 0.0
logback_events_total{level="info"} 6.0

If you look closely at the HELP text, you will see a slight mismatch: The description is specific to level="error" and not to other attribute values.

The reason is that in Micrometer you can define descriptions per set of attributes. It is not very common to have different descriptions per set of attributes, but there are a few real-world examples like Micrometer's built-in LogbackMetrics

https://github.com/micrometer-metrics/micrometer/blob/fc9c4eed843f02f00cc6f3c9621aed5d4797e29d/micrometer-core/src/main/java/io/micrometer/core/instrument/binder/logging/LogbackMetrics.java#L139-L169

As Prometheus supports only one description per metric, Micrometer's PrometheusMeterRegistry maintains a collectorMap that caches the description per metric name, so metrics with the same name will get the same description. As a result, all metrics named logback_events_total will get the first description that was registered, which happens to be the description for level="error" in the example above.

https://github.com/fstab/micrometer/blob/31488cc4ca1232f6599f9479ddb6b8445ece3d06/implementations/micrometer-registry-prometheus/src/main/java/io/micrometer/prometheus/PrometheusMeterRegistry.java#L482-L502

This PR: Make OpenTelemetryMeterRegistry use The Same Strategy as PrometheusMeterRegistry

The OpenTelemetryMeterRegistry has the same mismatch with Micrometer, as OTel metrics also don't support different descriptions for different attribute values. However, the current implementation does not handle this, and eventually the SDK throws a DuplicateMetricStorageException when Micrometer's LogbackMetrics register metrics with the same name but different descriptions:

https://github.com/open-telemetry/opentelemetry-java/blob/b19157ed98ffc81d5083752631b04e4074ea71f6/sdk/metrics/src/main/java/io/opentelemetry/sdk/metrics/internal/state/MetricStorageRegistry.java#L61-L67

This will happen whenever you use the OpenTelemetryMeterRegistry in a Spring application and Spring's Micrometer auto-detects Logback.

This PR introduces a descriptionsCache to mimic the behavior of the Micrometer -> Prometheus mapping. We cache the first description seen for each metric name and re-use it for all attributes.

This will result in descriptions that don't fit to the attributes (as in the example above). However, in practice this slight mismatch in the description has never been reported as an issue by Prometheus users (to my knowledge), and I think it is better than throwing a DuplicateMetricStorageException when Spring detects Logback.

@fstab fstab requested a review from a team as a code owner February 25, 2022 16:46
Signed-off-by: Fabian Stäber <fabian@fstab.de>
@fstab fstab force-pushed the micrometer-description-mapping branch from 7725216 to 1c29dfa Compare February 25, 2022 17:30
@anuraaga
Copy link
Contributor

Hi @fstab - a recent change to the OTel spec will stop throwing exceptions when registering metrics with the same name but different description. Do you know if that will be enough to address the issue without changes in this bridge?

Copy link
Member

@mateuszrzeszutek mateuszrzeszutek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with merging this fix before the spec/SDK officially supports that use case - left one comment about adding a TODO for the future.
Thanks! 👍

…a/io/opentelemetry/instrumentation/micrometer/v1_5/Bridging.java

Co-authored-by: Mateusz Rzeszutek <mrzeszutek@splunk.com>
Copy link
Member

@trask trask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might read easier with conventionName -> name everywhere.

(my brain kept reading it as the name of a convention, not the name of a meter)

@trask trask added this to the v1.12.0 milestone Mar 8, 2022
@trask trask merged commit b6a5f28 into open-telemetry:main Mar 8, 2022
@trask
Copy link
Member

trask commented Mar 8, 2022

thanks @fstab!

RashmiRam pushed a commit to RashmiRam/opentelemetry-auto-instr-java that referenced this pull request May 23, 2022
* Avoid conflicts in Micrometer description mapping

Signed-off-by: Fabian Stäber <fabian@fstab.de>

* fix formatting

* Update instrumentation/micrometer/micrometer-1.5/library/src/main/java/io/opentelemetry/instrumentation/micrometer/v1_5/Bridging.java

Co-authored-by: Mateusz Rzeszutek <mrzeszutek@splunk.com>

Co-authored-by: Mateusz Rzeszutek <mrzeszutek@splunk.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants