Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add GraphQL DataFetcher Instrumentation #10998

Merged

Conversation

hannahchan
Copy link
Contributor

@hannahchan hannahchan commented Apr 1, 2024

This PR refactors the GraphQL 20.0 instrumentation to introduce the ability to create spans for DataFetchers.

At a high-level;

  • Added GraphQLInstrumeterFactory
  • Updated GraphQLTelemetryBuilder, GraphQLTelemetry with the new options; createSpansForDataFetchers and createSpanForTrivialDataFetchers
  • Updated OpenTelemetryInstrumentation with logic to create the actual spans for DataFetchers
  • Updated OpenTelemetryInstrumentationState to correctly track the OpenTelemetry Context for DataFetchers

I was unable to make these changes compatible with previous versions of graphql-java.

@hannahchan hannahchan force-pushed the graphql-datafetching-instrumentation branch 2 times, most recently from 3bf6c8e to 1db723b Compare April 14, 2024 08:00
@hannahchan hannahchan force-pushed the graphql-datafetching-instrumentation branch 2 times, most recently from e0a705b to e8e263b Compare April 21, 2024 13:33
@hannahchan hannahchan marked this pull request as ready for review April 21, 2024 13:44
@hannahchan hannahchan requested a review from a team as a code owner April 21, 2024 13:44
@hannahchan hannahchan force-pushed the graphql-datafetching-instrumentation branch from e8e263b to 5c496d2 Compare April 24, 2024 13:18
Comment on lines 131 to 135
Node<?> node = operationDefinition;
if (sanitizeQuery) {
node = sanitize(node);
}
state.setQuery(AstPrinter.printAst(node));
Copy link
Contributor Author

@hannahchan hannahchan May 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still yet to verify this but suspect this block is the source of performance regressions in our GraphQL endpoint in our production environment. I'm unable to turn this off as this needs to be sanitised.

We are seeing regressions of around 10-100+ milliseconds added to our GraphQL operations when the OpenTelemetry GraphQL instrumentation is turned on. I suspect traversing the AST is causing the regression with larger GraphQL documents resulting in larger regressions.

I think this block needs to be surrounded with a check on span.isRecording() to at least not impact performance if the span is not sampled.

WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively we could find a way to off load this onto another thread to do this async as to not block the execution of the GraphQL operation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you can then please do verify. If we can reproduce this then we can figure out a better strategy to handle it. If the query is really large then perhaps we can only handle the start of the query and truncate most of it.

@hannahchan hannahchan requested a review from laurit May 5, 2024 12:37
@hannahchan
Copy link
Contributor Author

@laurit

What else do you think I need to do in order to get this over the line? Are there any other concerns that need to be addressed?

@hannahchan hannahchan force-pushed the graphql-datafetching-instrumentation branch from 293d9ed to 1431f52 Compare May 5, 2024 12:39
@laurit
Copy link
Contributor

laurit commented May 7, 2024

@laurit

What else do you think I need to do in order to get this over the line? Are there any other concerns that need to be addressed?

@hannahchan I have create a PR hannahchan#1 for your PR that should get it closer to acceptable

@hannahchan hannahchan force-pushed the graphql-datafetching-instrumentation branch from b45ff21 to a52e219 Compare May 10, 2024 12:38
@hannahchan
Copy link
Contributor Author

hannahchan commented May 11, 2024

@laurit
What else do you think I need to do in order to get this over the line? Are there any other concerns that need to be addressed?

@hannahchan I have create a PR hannahchan#1 for your PR that should get it closer to acceptable

Ah I missed this. I'll take a look and incorporate the changes as soon as I can.

Thanks.

@hannahchan hannahchan force-pushed the graphql-datafetching-instrumentation branch from ce57745 to 6e80358 Compare May 12, 2024 12:35
@github-actions github-actions bot requested a review from theletterf May 12, 2024 12:35
@hannahchan hannahchan force-pushed the graphql-datafetching-instrumentation branch from 6e80358 to f6f44bf Compare May 12, 2024 12:41
@hannahchan
Copy link
Contributor Author

@laurit
What else do you think I need to do in order to get this over the line? Are there any other concerns that need to be addressed?

@hannahchan I have create a PR hannahchan#1 for your PR that should get it closer to acceptable

Ah I missed this. I'll take a look and incorporate the changes as soon as I can.

Thanks.

I've merged the above PR into my branch via the command line and updated the unit tests to make sure the data fetcher spans and structure is what we expect it to be under various conditions. I think this is ready to be merged unless there are other things that need to be addressed.

@laurit laurit added this to the v2.4.0 milestone May 13, 2024
@trask trask merged commit 692739a into open-telemetry:main May 16, 2024
53 checks passed
@hannahchan hannahchan deleted the graphql-datafetching-instrumentation branch May 16, 2024 23:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants