Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-top level server span created by lambda instrumentation #7808

Closed
tylerbenson opened this issue Feb 13, 2023 · 15 comments · Fixed by #10736
Closed

Non-top level server span created by lambda instrumentation #7808

tylerbenson opened this issue Feb 13, 2023 · 15 comments · Fixed by #10736
Labels
bug Something isn't working contribution welcome Request makes sense, maintainers probably won't have time, contribution would be welcome

Comments

@tylerbenson
Copy link
Member

Describe the bug
When using javaagent with AWS Lambda application, instrumentation produces two server spans.
Both of these are applying:
io.opentelemetry.javaagent.instrumentation.awslambdacore.v1_0.AwsLambdaRequestHandlerInstrumentation
io.opentelemetry.javaagent.instrumentation.awslambdaevents.v2_2.AwsLambdaRequestHandlerInstrumentation

Steps to reproduce
Deploy and invoke an sqs lambda handler with the javaagent.

Here's my example:

public class ExampleSQSHandler implements RequestHandler<SQSEvent, String> {
    private static final String TOPIC_ARN = System.getenv("TOPIC_ARN");

    private final SnsClient client = SnsClient.create();

    @Override
    public String handleRequest(SQSEvent event, Context context) {
        PublishRequest.Builder publishBuilder = PublishRequest.builder().topicArn(TOPIC_ARN);
        List<SQSEvent.SQSMessage> records = event.getRecords();
        for (SQSEvent.SQSMessage record : records) {
            String body = record.getBody();
            client.publish(publishBuilder
                    .message("{\"example-success\":" + body + "}")
                    .build());
        }
        // log execution details
        Util.logEnvironment(event, context);
        return "Success!";
    }
}

What did you expect to see?
Only one server span.

What did you see instead?
Two server spans were created.

What version are you using?
1.22.1

Environment
Lambda java11 runtime

@tylerbenson tylerbenson added the bug Something isn't working label Feb 13, 2023
@breedx-splk breedx-splk added the contribution welcome Request makes sense, maintainers probably won't have time, contribution would be welcome label Jul 10, 2023
@cleverchuk
Copy link
Contributor

Try these environment variables combination for a workaround

OTEL_INSTRUMENTATION_AWS_LAMBDA_EVENTS_ENABLED=false
OTEL_INSTRUMENTATION_AWS_LAMBDA_CORE_ENABLED=true

Ensure that OTEL_INSTRUMENTATION_AWS_LAMBDA_ENABLED is not set.

@serkan-ozal
Copy link
Contributor

Hi @tylerbenson @cleverchuk @breedx-splk,

Seems that this has been already fixed in the PR #9654: https://github.com/open-telemetry/opentelemetry-java-instrumentation/pull/9654/files#diff-47940a3adcec75b6ac500a403a33175706220b56017a6ca3034609a1b9bf5ebdR24-R29

Either one of the AwsLambdaRequestHandlerInstrumentation and AwsLambdaRequestHandlerInstrumentation is activated depending on com.amazonaws.services.lambda.runtime.events.SQSEvent is in the classpath or not.

I have also verified no double span is created with Java agent in AWS Lambda since v1.31.0 version

@cleverchuk
Copy link
Contributor

@serkan-ozal can you provide your repro setup? I'm using this sample and still see the issue with 1.32.0

@serkan-ozal
Copy link
Contributor

@cleverchuk

Then when I trigger the function, it produces single AWS Lambda span from io.opentelemetry.aws-lambda-events-2.2 instrumentation library as expected. Because com.amazonaws.services.lambda.runtime.events.SQSEvent is in the classpath and so io.opentelemetry.aws-lambda-core-1.0 is deactivated.

Here it my simple Lambda function I have used for testing this issue:

public class TestHandler implements RequestHandler<Map, Void> {

    @Override
    public Void handleRequest(Map event, Context context) {
        System.out.println("Handling event: " + event);
        return null;
    }

}

@cleverchuk
Copy link
Contributor

cleverchuk commented Feb 14, 2024

@serkan-ozal Thanks. I'm not seeing the same thing. Please, can you try using the sample aws-sdk app as your lambda function?

  • Runtime: Java 17
  • Architecture: arm64
  • otel-version: 1.32.0

@cleverchuk
Copy link
Contributor

Try these environment variables combination for a workaround

OTEL_INSTRUMENTATION_AWS_LAMBDA_EVENTS_ENABLED=false
OTEL_INSTRUMENTATION_AWS_LAMBDA_CORE_ENABLED=true

Ensure that OTEL_INSTRUMENTATION_AWS_LAMBDA_ENABLED is not set.

This workaround is flawed as the two instrumentation provide different capabilities which might be necessary for your deployment. So take it with a grain salt.

@serkan-ozal
Copy link
Contributor

serkan-ozal commented Feb 19, 2024

Hi @cleverchuk,

I was able to reproduce the problem both with the examples you provided and my own example. The double span is produced (and their trace ids are different too) when handler method is specified.

  • It is OK (single AWS Lambda span) when the handler is specified as io.opentelemetry.lambda.sampleapps.awssdk.AwsSdkRequestHandler without handler method
  • But problematic (double AWS Lambda span) when the handler is specified as io.opentelemetry.lambda.sampleapps.awssdk.AwsSdkRequestHandler::handleRequest with handler method

At the moment, not fully sure about the reason of the problem yet (need to verify that), but my theory is that the very first AWS Lambda span (io.opentelemetry.aws-lambda-core-1.0) is created by the instrumentation of AWS Lambda Runtime API's internal class PojoMethodRequestHandler (https://github.com/aws/aws-lambda-java-libs/blob/3d8dfb66f3a852bf69618bd5f66222692f1f5a49/aws-lambda-java-runtime-interface-client/src/main/java/com/amazonaws/services/lambda/runtime/api/client/EventHandlerLoader.java#L223).

In the case here,

I will look into further to verify this behavior. The simplest (not sure it is the correct one yet) solution might be ignoring classes starts with com.amazonaws.services.lambda.runtime.api.client in the typeMatcher of the io.opentelemetry.javaagent.instrumentation.awslambdacore.v1_0.AwsLambdaRequestHandlerInstrumentation.

@cleverchuk What are your thoughts on the problem and solution I have suggested?

@serkan-ozal
Copy link
Contributor

serkan-ozal commented Feb 19, 2024

@cleverchuk Or we can let the trace start with runtime level and enrich it (don't start new span, but reuse the existing one) based on event in the io.opentelemetry.aws-lambda-events-2.2.

@cleverchuk
Copy link
Contributor

@serkan-ozal Thanks for the continued research into this. I have been banging head at this for sometime and I can't seem to understand what's is going on. Your theory seems very sound to me because whenever I see the double spans I always see the instrumentation getting applied twice, first on the system loader then second on com.amazonaws.services.lambda.runtime.api.client.CustomerClassLoader
Screenshot 2024-02-23 at 9 19 38 AM

@cleverchuk
Copy link
Contributor

@serkan-ozal Made a custom build with your suggested modification as below and it seems like it is working okay.

  @Override
  public ElementMatcher<TypeDescription> typeMatcher() {
    return implementsInterface(named("com.amazonaws.services.lambda.runtime.RequestHandler"))
        .and(not(nameStartsWith("com.amazonaws.services.lambda.runtime.api.client")));
  }

@serkan-ozal
Copy link
Contributor

@cleverchuk Perfect! Couldn't find time to validate the theory. Thanks for looking into this

@cleverchuk
Copy link
Contributor

@serkan-ozal How about a PR when you have time?

@serkan-ozal
Copy link
Contributor

@cleverchuk Sure. I would love to. This will be my first contribution! Looking forward to it.

@tylerbenson
Copy link
Member Author

@serkan-ozal thanks for looking into this. The change mentioned in #9654 does indeed look like it should solve my original problem. I've been out for a bit and don't have time to validate though.

@serkan-ozal
Copy link
Contributor

@cleverchuk @tylerbenson Here it is: #10736

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working contribution welcome Request makes sense, maintainers probably won't have time, contribution would be welcome
Projects
None yet
4 participants