-
Notifications
You must be signed in to change notification settings - Fork 16
Description
Hy there,
we're seeing some strange effects with the latest extension versions that we can't really explain.
The function execution takes significantly longer than expected - apparently due to overhead related to tracing/metrics so that the total time of execution hits our 3s function timeout for our typical 0.5-1.5s workload.
Hitting the timeout results in lambda instance shutdown and subsequent cold starts.
There are a couple of ways to turn on/off this "behavior" with v90 and v91, but I've boiled down my function code to a minimal handler that sleeps for 1s before returning a positive result (to the AWS LoadBalancer that triggered it). It uses DD extension layer v91 .
Expected would be a cold start (e.g. 500ms) and then execution time of slightly above 1000ms, but xray shows times closer to 4000ms!
I can "fix" this minimal function by:
- increasing RAM (e.g. to 192 MB) or
- increasing the timeout to 5s or
- disabling DD_ENHANCED_METRICS and DD_TRACE_ENABLED (one OR the other is not sufficient)
All of which is not really desired and should not be necessary.
My lambda function is
- written in Golang,
- uses runtime provided.al2023
- is happy with 128 MB ram (logs say Max Memory Used: 78 MB)
- running in a VPC (with internet connectivity)
- tiggered by ALB requests
- in region eu-central-1
The env variables set are (among others)
- DD_API_KEY_SECRET_ARN | arn:aws:secretsmanager:….
- DD_ENHANCED_METRICS | true
- DD_RUNTIME_METRICS_ENABLED | true
- DD_SERVERLESS_LOGS_ENABLED | false
- DD_SITE | datadoghq.com
- DD_TRACE_ENABLED | true
The minimal handler function that I use to reproduce this is:
import (
"context"
"time"
ddlambda "github.com/DataDog/dd-trace-go/contrib/aws/datadog-lambda-go/v2"
"github.com/aws/aws-lambda-go/events"
"github.com/aws/aws-lambda-go/lambda"
)
func main() {
handler := func(ctx context.Context, req events.ALBTargetGroupRequest) (events.ALBTargetGroupResponse, error) {
time.Sleep(1000 * time.Millisecond)
return events.ALBTargetGroupResponse{
StatusCode: 200,
StatusDescription: "200 OK",
Body: "{\"message\": \"Hello world\"}",
MultiValueHeaders: map[string][]string{
"Content-Type": {"application/json"},
},
IsBase64Encoded: false,
}, nil
}
lambda.Start(ddlambda.WrapFunction(handler, nil))
}
My actual function is still on extension v90 and I can see a similar pattern if the function code initializes very quickly (e.g. without any network I/O. Adding a 100ms sleep during initialization (not in the handler) "fixed" this there, but this trick does not work with v91 anymore.
Thanks for any hints!
Jochen
Jochen Mehlhorn [email protected], Mercedes-Benz Tech Innovation GmbH
Provider Information