Do not propagate gRPC deadline when propagating OTel context via javaagent. #5543

Merged (2 commits) on Mar 11, 2022

@anuraaga (Contributor) commented Mar 11, 2022

Fixes #4169

Some background follows, but this PR works around the problem.

I was able to add a test that reproduces the problem in the issue, but I still can't figure out a fix. One problem is that, by my reasoning, a cancellation error actually makes sense: the gRPC server cancels the context when the server request completes (onCompleted()), as expected.

https://github.com/grpc/grpc-java/blob/master/core/src/main/java/io/grpc/internal/ServerCallImpl.java#L365

For this early-return pattern, where the server request has ended but business logic continues, it seems correct for that business logic to be cancelled by default. Imagine it were not an early return but an async callback sequence continuing after a 10s request deadline; cancellation propagation exists mostly to handle such a case. Explicitly opting in to the pattern by calling Context.fork() in the business logic seems expected. So I'm not sure why there is no error without instrumentation.
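
To make the cascade-versus-fork distinction concrete, here is a minimal sketch using only the Java standard library. ToyContext, child() and EarlyReturnDemo are hypothetical names invented for illustration; this is a stand-in model, not io.grpc.Context and not the code changed by this PR. It only assumes the behavior described above: cancelling a request context cancels contexts derived from it, while a forked context does not inherit the cancellation. The toy ignores context values, which the real fork() carries over.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a cancellable context; NOT io.grpc.Context itself.
final class ToyContext {
    private final List<ToyContext> children = new ArrayList<>();
    private boolean cancelled;

    // Derived context: cancelling this context also cancels the child.
    ToyContext child() {
        ToyContext c = new ToyContext();
        c.cancelled = cancelled; // a child of a cancelled context starts cancelled
        children.add(c);
        return c;
    }

    // Forked context: deliberately not registered as a child,
    // so a later cancel() on this context does not reach it.
    ToyContext fork() {
        return new ToyContext();
    }

    // Cancel this context and cascade to every derived child.
    void cancel() {
        if (cancelled) {
            return;
        }
        cancelled = true;
        for (ToyContext c : children) {
            c.cancel();
        }
    }

    boolean isCancelled() {
        return cancelled;
    }
}

final class EarlyReturnDemo {
    public static void main(String[] args) {
        ToyContext request = new ToyContext();
        ToyContext inherited = request.child(); // business logic that kept the request context
        ToyContext forked = request.fork();     // business logic that opted out explicitly

        request.cancel(); // models the server cancelling the context once the call completes

        System.out.println("inherited cancelled: " + inherited.isCancelled()); // true
        System.out.println("forked cancelled: " + forked.isCancelled());       // false
    }
}
```

In this model, work that continues after the early return sees a cancelled context unless it forked first, which matches the reasoning in the paragraph above.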

I'm not convinced that the instrumentation is bug-free, but in my head the instrumented behavior makes sense and the non-instrumented behavior doesn't, so I'm stuck on how to dig deeper. @amitgud-doordash @ryandens, do you happen to have any clues on this?

@anuraaga (Contributor, Author) commented

I came up with a possible conclusion and am adding it to the issue.

@anuraaga changed the title from "[WIP] Add test for early return in gRPC pattern." to "Do not propagate gRPC deadline when propagating OTel context via javaagent." Mar 11, 2022
@anuraaga marked this pull request as ready for review March 11, 2022 08:31
@anuraaga requested a review from a team as a code owner March 11, 2022 08:31
@ryandens (Contributor) commented

Thanks for the detailed analysis! I tried out a local build with this change on my reproducer project ryandens/otel-grpc-context and confirmed the error no longer shows up.

@trask (Member) left a comment

@trask merged commit 1d9c23b into open-telemetry:main Mar 11, 2022
RashmiRam pushed a commit to RashmiRam/opentelemetry-auto-instr-java that referenced this pull request May 23, 2022
…agent. (open-telemetry#5543)

* Add test for early return in gRPC pattern.

* Do not propagate gRPC deadline when propagating OTel context via javaagent.
Development

Successfully merging this pull request may close these issues.

io.grpc.Context getting cancelled
3 participants