Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prototype Log4j2 Appender #4375

Merged
merged 12 commits into from
Nov 29, 2021
Merged

Conversation

jack-berg
Copy link
Member

@jack-berg jack-berg commented Oct 13, 2021

EDIT - keeping the description below around, but all the issues it mentions have been solved in the 1.9.0 release of the log SDK. This is no longer a prototype.

This is meant to prompt discussion. Proves out what an OpenTelemetryAppender might look like for log4j2. The idea behind this appender is that it delivers log records to a opentelemetry LogSink, where they can be batched and shipped out of process via an exporter, such as the recently added OTLP grpc and http log exporters.

One of the main problems I had to address was that log appenders are instantiated before the opentelemetry log sdk is available. To get around this, each OpenTelemetryAppender instance is statically registered after initialization. Once the otel log sdk is available, it is passed to each OpnTelemetryAppender instance via a static method, upon which log records start flowing.

Learnings so far:

  • LogRecord trace id and span id need to be nullable. And potentially other properties too. The otlp grpc and http exporters currently fail to serialize if span id and trace id are left as their default "" values. Solved here.
  • We need a InMemoryLoggingExporter for testing. Mocked it out in this code, and will contribute back to opentelemetry-java. Solved here.
  • We need a SimpleLogProcessor for testing. Mocked it out in this code, and will contribute back to opentelemetry-java. Solved here.
  • LogSinkSdkProvider.builder() is not public but needs to be for it to be useful. I'll make it so. Solved here.
  • LogSinkSdkProvider should be immutable after its created. Log processors should be registered with the builder, not with the instance. I'll make it so. Solved here.
  • Thought needs to be put into how resource and instrumentation library get attached to the log records. Right now and producer to LogSink is responsible for creating LogRecords with resource and instrumenation library attached. This means that components like the OpenTelemetryAppender need to maintain fields for these. Should LogSink accept LogRecords without resource and instrumenation library, and let the pipeline attach those?

def sdkSinkProvider = LogSinkSdkProvider.builder().build()
sdkSinkProvider.addLogProcessor(SimpleLogProcessor.create(logExporter))
def logSink = sdkSinkProvider.get(null, null)
OpenTelemetryLog4j.initialize(logSink, resource)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the best demonstration of the usage right now. When this is ready to open as a real PR, I'll add docs outlining usage to the readme.

A user would add the opentelemetry appender to their config:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN" packages="io.opentelemetry.instrumentation.log4j.v2_13_2">
  <Appenders>
    <Console name="Console" target="SYSTEM_OUT">
      <PatternLayout pattern="%d{HH:mm:ss.SSS} [%t] %-5level %logger{36} traceId: %X{trace_id} spanId: %X{span_id} flags: %X{trace_flags} - %msg%n" />
    </Console>
    <OpenTelemetry name="OpenTelemetryAppender" />
  </Appenders>
  <Loggers>
    <Root>
      <AppenderRef ref="OpenTelemetryAppender" leve="All" />
      <AppenderRef ref="Console" level="All" />
    </Root>
  </Loggers>
</Configuration>

Then initialize the appenders by calling OpenTelemetryLog4j.initialize(LogSink, Resource) as early in the application lifecycle as possible.

Any logs recorded before initializing are dropped.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a reasonable approach for now. Is it possible to build more complex configuration that would allow log4j to run the initialization itself?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that would be possible. Log4j appenders builders have annotation based scheme for configuration properties. However, if configuration for logging is complex like it is for metrics & tracing, it will be inconvenient to maintain. It would be almost like an autoconfiguration module just for log4j.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that makes sense. Maybe we provide a standard couple of options, but then fall back to the manual init if that doesn't work for people.

@jack-berg
Copy link
Member Author

Now that 1.9.0 has been released, I've updated this branch, refined the logic, and added more thorough tests.

Its now ready for review.

NOTE this still places an explicit dependency on the otel java 1.9.0 log SDK. I think the whole repo should be updated to use 1.9.0 before this is merged.

@@ -3,11 +3,15 @@ plugins {
}

dependencies {
api("io.opentelemetry:opentelemetry-sdk-logs:1.9.0-alpha")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will make all clients of this library instrumentation depend on SDK. I assume all users must depend only on API.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users need to call OpenTelemetryLog4j.initialize(SdkLogEmitterProvider) to wire opentelemetry into the appender so the log SDK is actually part of this module's API.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this reminds me that I need to update the readme to show usage!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iNikem Yeah - we have open-telemetry/opentelemetry-java#3807 to track fixing this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sleeping on this, I realized we can keep this code here by making this compileOnly - anyone adding the appender has to add the SDK to their classpath themselves.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though I still think it might be simpler to continue to prototype with the code in the SDK repo. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm definitely wary about adding an API API to support this. I think compileOnly and a reflective check for the presence of an SDK is a better way to go until we have a spec for a new logging API.

Copy link
Member Author

@jack-berg jack-berg Nov 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compileOnly dependency on the SDK sounds reasonable to me, and potentially a path forward for all appenders.

Though I still think it might be simpler to continue to prototype with the code in the SDK repo.

Prototyping in the SDK has the advantage that the SDK and the appender can develop in lockstep, rather than the appender waiting on SDK releases, which is nice. But I think a lot of the SDK flux has been flushed out (of course assuming no significant changes to the experimental log spec) and so I don't expect to see much value from having the appender in the SDK.

if (level != null) {
builder.setSeverity(
LEVEL_SEVERITY_MAP.getOrDefault(level, Severity.UNDEFINED_SEVERITY_NUMBER));
builder.setSeverityText(logEvent.getLevel().name());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
builder.setSeverityText(logEvent.getLevel().name());
builder.setSeverityText(level.name());

And shouldn't this text come from our severity and not logging framework level?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the spec, the severity text field "is the original string representation of the severity as it is known at the source".

But this does bring up the point that I have made a number of subjective decisions in mapping between the Log4j2 LogEvent data model and the otel log data model. Please take a close look at the attribute name's I've chosen for various things and let me know if you agree / disagree.

@GuardedBy("LOCK")
private static final List<OpenTelemetryAppender> APPENDERS = new ArrayList<>();

public static void initialize(SdkLogEmitterProvider sdkLogEmitterProvider) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this going to work as a javaagent instrumentation? In the agent we keep all the SDK classes hidden in the agent classloader, they're basically unavailable from the application classloader. The agent won't be able to register its emitterProvider in the application's OpenTelemetryLog4j class.
Was the idea about "log appender api" completely scrapped? What happened with that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a stab at extracting out the appender API here.

Copy link
Contributor

@anuraaga anuraaga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need a new artifact, maybe log4j-logssdk-prototype since we can't leak the SDK dependency into existing users of the log4j instrumentation.

If a temporary artifact in Maven Central is not palatable (it's not great TBH), I think we should hold off for the 1.9.0 release and prioritize open-telemetry/opentelemetry-java#3807. Sorry, didn't notice this ordering constraint until now.

We could still get a snapshot out sooner than later by

  • Adding log4j-logssdk-prototype after the instrumentation 1.9.0 release, and merging it in for 1.10, using the log appender API that must be ready by then
  • Adding a compileOnly dependency on log4j in the opentelemetry-sdk-logs artifact itself and put the code there.

The second option seems to allow the easiest iteration on snapshots.

@@ -5,6 +5,8 @@ plugins {
dependencies {
api(project(":testing-common"))

api("io.opentelemetry:opentelemetry-sdk-logs:1.9.0-alpha")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think we need version here since we apply BOM for io.opentelemetry. Similar for all usages

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm just waiting for this project to update its BOM dependency on 1.9.0 and then will drop the version here.

@@ -3,11 +3,15 @@ plugins {
}

dependencies {
api("io.opentelemetry:opentelemetry-sdk-logs:1.9.0-alpha")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iNikem Yeah - we have open-telemetry/opentelemetry-java#3807 to track fixing this.

final class LogEventMapper {

// Visible for testing
static final AttributeKey<String> ATTR_LOGGER_NAME = AttributeKey.stringKey("logger.name");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any semantic conventions for logs defined yet, can we leave them out until they're defined?

https://github.com/open-telemetry/opentelemetry-specification/tree/main/specification/logs/semantic_conventions

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If leaving any, I guess logger.name, throwable.name, throwable.message are the only ones I'd keep as they may be necessary for playing with a prototype, but others seem too detailed, or unclear (e.g., I'm not familiar with term ndc)

Copy link
Member Author

@jack-berg jack-berg Nov 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ndc = Nested Diagnostic Context

Nested diagnostic context is thread context that acts like a stack. You push items onto it, and pop them off. Mapped diagnostic context is thread context that acts like a map. Frankly, I haven't used Nested Diagnostic Context, and have only used Mapped Diagnostic Context in a limited way in applications I've written.

I took the stance of making sure that all data in the log4j2 data model somehow was mapped into the otel log data model. You're suggesting the opposite, where only a limited part of the log4j2 data model is mapped. I see pros and cons of each stance:

  • If we only do a partial mapping of fields well defined in semantic conventions, then we're less likely to make breaking changes, though we can't eliminate breaking changes since there are no stable semantic conventions. We're also going to disappoint folks using parts of the log4j2 data model that are unmapped.
  • If we do a fully mapping, we basically guarantee breaking changes, but at least all the data gets mapped. And by at least making an attempt at mapping each field we can provide better feedback for the semantic conventions.

Personally, I prefer the full mapping approach published in an "alpha" artifact. Definitely interested in other perspectives on this though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen it's easier to add an attribute and not really go anywhere in the spec with it than the opposite. By not including too much, we can hear about what users really need and then build the spec around it, rather than being ad-hoc. While I don't like to disappoint folks, actually hearing about that disappointment is very valuable feedback.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That resonates with me. Also, if we were to do a complete mapping and the required semantic conventions never materialized, we may have to pull support for that mapping, which would be an unpleasant breaking change for users.

Updated to a minimalist mapping in new commit.

SpanContext.create(traceId, spanId, traceFlags, TraceState.getDefault())));
builder.setContext(context);

contextMap.forEach(attributes::put);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should fill arbitrary attributes, I don't think we ever do anything similar for spans

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kind of related to the the conversation about doing a complete mapping or not.

Not filling arbitrary attributes from mapped diagnostic context is limiting the usefulness of the log4j2 api. It means if you want to use otel logs in a more "machine generated events" type of way, that you'd want to use the otel log sdk in your application instead of the log4j2 api.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'd generally use baggage for more arbitrary attributes? Though we haven't even specced it for spans. But I'd still suggest keeping things smaller for now to make sure the spec grows rather than going all in

@GuardedBy("LOCK")
private static final List<OpenTelemetryAppender> APPENDERS = new ArrayList<>();

public static void initialize(SdkLogEmitterProvider sdkLogEmitterProvider) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@trask trask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx @jack-berg!

@jack-berg
Copy link
Member Author

Can this PR be merged?

@trask
Copy link
Member

trask commented Nov 29, 2021

🎉

@trask trask merged commit 5c36ed3 into open-telemetry:main Nov 29, 2021
RashmiRam pushed a commit to RashmiRam/opentelemetry-auto-instr-java that referenced this pull request May 23, 2022
* WIP

* Use Log SDK snapshot

* Update to 1.9.0 release, refine

* Remove appenders from common log config

* Respond to PR feedback

* Update readme

* Switch to compileOnly log sdk dependency, use logger name as instrumentation name

* Switch to minimalist LogEvent mapping

* PR feedback

* PR feedback
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants