Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Kotlin context element in agent instrumentation #1618

Merged
merged 2 commits into from
Nov 25, 2020

Conversation

anuraaga
Copy link
Contributor

@anuraaga anuraaga commented Nov 11, 2020

Follow-on to #1617

@johnbley @iNikem mentioned he would chat with you about the approach of using context element, including the notion that makeCurrent in kotlin code is illegal and shouldn't really be supported by us. Figured having this in front of your eyes will help the discussion :)

Fixes #1411

delay(10)
a2(iter)
scope.close()
withContext(span.asContextElement()) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes are the key to the proposal. My theory is span.makeCurrent is illegal code - the only reason it works here is because the agent is present, but if the agent was not present, resumed coroutines could see completely unrelated contexts, which is a disaster. The idea is that our agent is here to enable tracing, not to allow illegal code. How to "codify" that it's illegal is a tricky point though but something we will want to think about. Current idea is that https://github.com/open-telemetry/opentelemetry-java/blob/master/sdk/testing/src/main/java/io/opentelemetry/sdk/testing/context/StrictContextStorage.java#L99 can also check the stacktrace to see if it is being called from a coroutine, and not from the context element. Dunno if it'll work but if it does, I think we end up in a nice place :)

Copy link
Contributor Author

@anuraaga anuraaga Nov 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also kudos to @adriancole for giving the idea of stack-walking as a potential mechanism for checking I was stuck on that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this require application authors who use auto-instrumentation and simply use (e.g.) kotlin suspend functions in their logic (without any manual instrumentation) to explicitly use (kotlin) withContext to pass our (otel) context? That is, if app logic is written in kotlin and taking advantage of auto-instrumentation of inbound and outbound connections, will this still work without the app author having to take extra steps?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think so - the agent with this code is making sure to automatically propagate the parent at the time the coroutine is launched into its context, so that parent is always available. If all they're doing is using libraries which are auto-instrumented by the agent, it should work fine without calling withContext. It's only when doing manual instrumentation within Kotlin that they need to use withContext instead of makeCurrent. How does that sound?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see the confusion of this particular test, since it's a javaagent test, but also uses the library instrumentation directly.

do you think we need the nested spans to validate launchConcurrentSuspendFunctions? if not, we could remove a2 and b2 and not create any nested scopes at all, and not need to use the library instrumentation in the test.

if you think the nested scopes is a good test showing interop, maybe we can move it out into a separate test class ...InteropTest.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The nesting spans are crucial to the test - without any form of instrumentation, we were previously reporting nonsensical traces like "a causes b causes a2" and "b2 causes a".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

. It's only when doing manual instrumentation within Kotlin that they need to use withContext instead of makeCurrent. How does that sound?

That sounds great; I have no problem making kotlin instrumentation authors work a little to preserve (otel) context. I actually originally tried something like this (using the kotlin coroutine context system more "naturally) but was unable to make it work. I'm glad you were able to figure it out! l'm going to try to find some time to manually validate with a "real" kotlin app and see how this change works out in practice.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johnbley #1620 links to a real, though simple kotlin app that exhibits problems with span propagation

Copy link
Member

@trask trask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice library instrumentation breakout 👍 and great point about the problem of makeCurrent under kotlin coroutines 😬

return context;
}

@NotNull
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is @NotNull useful here (and below)?

* and restores the previous [Context] on suspension.
*/
fun Span.asContextElement(): CoroutineContext {
return ContextElement(Java8BytecodeBridge.currentContext().with(this))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just because this is weird/surprising

Suggested change
return ContextElement(Java8BytecodeBridge.currentContext().with(this))
// Java8BytecodeBridge is needed in order to support Kotlin which generally targets Java 6 bytecode
return ContextElement(Java8BytecodeBridge.currentContext().with(this))

delay(10)
a2(iter)
scope.close()
withContext(span.asContextElement()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see the confusion of this particular test, since it's a javaagent test, but also uses the library instrumentation directly.

do you think we need the nested spans to validate launchConcurrentSuspendFunctions? if not, we could remove a2 and b2 and not create any nested scopes at all, and not need to use the library instrumentation in the test.

if you think the nested scopes is a good test showing interop, maybe we can move it out into a separate test class ...InteropTest.

* Returns a [CoroutineContext] which will make this [Context] current when resuming a coroutine
* and restores the previous [Context] on suspension.
*/
fun Context.asContextElement(): CoroutineContext {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, I'm wondering whether these should be in the SDK instead of instrumentation. Mainly since I was imagining adding to the javadoc for makeCurrent to not use it in Kotlin coroutines and use this instead. But that doesn't make as much sense when crossing repos.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

language-specific extensions to the API, e.g. opentelemetry-extension-kotlin? I like it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sent open-telemetry/opentelemetry-java#2066 for the library aspect of this PR

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree - this is a great approach.

@holyjak
Copy link

holyjak commented Nov 12, 2020

Great work!
I would like to use OT in a Kotlin app now(ish). Is there a workaround I can apply or is it best to wait a few days for this to be merged and then build the agent from master? Thank you!

@iNikem
Copy link
Contributor

iNikem commented Nov 17, 2020

@anuraaga What are your plans for making this real PR?

@anuraaga
Copy link
Contributor Author

@iNikem Going to do it after updating to snapshots :)

@anuraaga
Copy link
Contributor Author

Updated to use the SDK's kotlin extension

@anuraaga anuraaga merged commit 822be11 into open-telemetry:master Nov 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

KotlinProbeInstrumentation does precisely what gRPC Javadoc says not to
6 participants