Why AWS Lambda Throttles Functions (and how to fix it!)

By Mark Siebert on October 4, 2018

Have you noticed recently that your AWS Lambda invocation requests are getting throttled? If so, your Lambda functions are probably not running as designed. Let’s examine the possible causes and solutions to poor Lambda performance.

Lambda Concurrency Limit


Each region in your AWS account has a Lambda concurrency limit. The concurrency limit determines how many function invocations can run simultaneously in one region. The limit applies to all functions in the same region and is set to 1000 by default.


If you exceed a concurrency limit, Lambda starts throttling the offending functions by rejecting requests. Depending on the invocation type, you’ll run into the following situations:

Synchronous invocations

Invocation sources: API Gateway, CloudFront, on-demand invocations

AWS returns a 429 status code. The request is not retried.
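Because a throttled synchronous request is not retried for you, any retry has to happen on the calling side. The sketch below shows one common way to do that, exponential backoff with jitter. It assumes a hypothetical `invoke` callable that performs one synchronous request and returns its HTTP status code; the function name, attempt count, and delays are illustrative choices, not AWS recommendations.

```python
import random
import time
from typing import Callable

THROTTLED = 429  # status code returned when the invocation is throttled


def invoke_with_backoff(
    invoke: Callable[[], int],
    max_attempts: int = 5,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `invoke` until it returns something other than 429.

    `invoke` is a hypothetical zero-argument callable that performs one
    synchronous request and returns its HTTP status code.
    """
    status = THROTTLED
    for attempt in range(max_attempts):
        status = invoke()
        if status != THROTTLED:
            return status
        if attempt < max_attempts - 1:
            # Exponential backoff with full jitter: wait between 0 and
            # base_delay * 2**attempt seconds before the next attempt.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
    return status  # still throttled after max_attempts tries
```

The `sleep` parameter exists only so the delay can be swapped out when testing; in normal use you would leave it at its default.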

Asynchronous invocations

Invocation sources: S3, SNS, SES

AWS retries the invocation twice. If both retries fail, the request is dropped, unless you have configured a dead letter queue, in which case the request is sent there.

You can configure a dead letter queue in the console for the function.
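The same setting can also be applied programmatically. Here is a minimal sketch, assuming `lambda_client` behaves like a boto3 Lambda client (`boto3.client("lambda")`) and that the target is an SQS queue or SNS topic you have already created. The helper name and the ARN check are illustrative, not part of any AWS library.

```python
def set_dead_letter_queue(lambda_client, function_name: str, target_arn: str) -> dict:
    """Point a function's dead letter queue at an SQS queue or SNS topic.

    `lambda_client` is assumed to expose update_function_configuration,
    as a boto3 Lambda client does.
    """
    parts = target_arn.split(":")
    # ARNs look like arn:partition:service:region:account:resource.
    if len(parts) < 6 or parts[0] != "arn" or parts[2] not in ("sqs", "sns"):
        raise ValueError("dead letter target must be an SQS or SNS ARN")
    return lambda_client.update_function_configuration(
        FunctionName=function_name,
        DeadLetterConfig={"TargetArn": target_arn},
    )
```

The function's execution role also needs permission to send to the queue or publish to the topic, which this sketch does not set up.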

Polling, stream-based invocations

Invocation sources: DynamoDB, Kinesis

AWS retries the invocation until the retention period expires.

Polling, non-stream-based invocations

Invocation sources: SQS

AWS Lambda retries the invocation until the queue’s conditions for message deletion are met.


Whatever the invocation source, repeated throttling by AWS Lambda can result in your requests never actually being run, which shows up in production as a bug.

Lambda Concurrency Limit Configuration Options

How you should change the default concurrency limit depends on how your Lambda functions are configured. There are two cases to cover. If you allocate your concurrency limit by function (function-level concurrency limit configuration), read on. If your functions share the unreserved pool, jump down to troubleshooting the unreserved concurrency limit.

What is a Function-Level Concurrency Limit Configuration?

A function-level concurrency limit is exactly what it sounds like: a reserved number of concurrent executions set aside for a specified function. It’s a configuration option that AWS Lambda makes available to you through the console.

Here’s an example to illustrate how function-level allocations work: If your account limit is 1000 and you reserved 200 concurrent executions for a specific function and 100 concurrent executions for another, the rest of the functions in that region will share the remaining 700 executions.
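The arithmetic in that example is simple enough to write down directly. A small sketch, with made-up function names standing in for the two reserved functions:

```python
from typing import Dict


def unreserved_concurrency(account_limit: int, reservations: Dict[str, int]) -> int:
    """Concurrency left for functions that have no reservation of their own."""
    return account_limit - sum(reservations.values())


# The example above: a 1000 limit with 200 and 100 reserved.
# "checkout" and "thumbnails" are hypothetical function names.
remaining = unreserved_concurrency(1000, {"checkout": 200, "thumbnails": 100})
print(remaining)  # 700
```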


If you haven’t configured limits for any Lambda functions, jump to unreserved allocation troubleshooting.


Important note: AWS Lambda limits the total amount of concurrency you can reserve across all functions in one region. Lambda requires at least 100 unreserved concurrent executions per account. If you are bumping up against your account limit, see Increasing Your Account Level Lambda Concurrency Limit below.
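To see how much room that rule leaves you, subtract both the existing reservations and the 100-execution floor from the account limit. A rough sketch, where the helper name and the sample function names are illustrative:

```python
from typing import Dict

MIN_UNRESERVED = 100  # Lambda keeps at least this much concurrency unreserved


def max_additional_reservation(account_limit: int, reservations: Dict[str, int]) -> int:
    """How much more concurrency can still be reserved in this region."""
    already_reserved = sum(reservations.values())
    return max(0, account_limit - MIN_UNRESERVED - already_reserved)


# With a 1000 limit and 300 already reserved, 600 more can be reserved.
headroom = max_additional_reservation(1000, {"checkout": 200, "thumbnails": 100})
print(headroom)  # 600
```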

Troubleshooting a Function with Function-Level Allocation

If you reserve concurrent executions for a specific function, AWS Lambda assumes that you know how many to reserve to avoid performance issues. Functions with allocated concurrency can’t access unreserved concurrency. If your function is being throttled, you can raise the concurrency allocated to the function.
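If you prefer to script the change, the sketch below raises a reservation only when doing so leaves at least 100 unreserved executions. It assumes `lambda_client` behaves like a boto3 Lambda client, using its `get_account_settings` and `put_function_concurrency` calls, and that you pass in the function's current reservation yourself. The helper name is hypothetical.

```python
MIN_UNRESERVED = 100  # Lambda keeps at least this much concurrency unreserved


def raise_reserved_concurrency(lambda_client, function_name: str,
                               current: int, new: int) -> int:
    """Raise a function's reserved concurrency from `current` to `new`.

    `lambda_client` is assumed to behave like boto3.client("lambda").
    Returns the unreserved concurrency that remains after the change.
    """
    if new <= current:
        raise ValueError("new reservation must be larger than the current one")
    limits = lambda_client.get_account_settings()["AccountLimit"]
    # Every extra reserved execution comes out of the unreserved pool.
    unreserved_after = limits["UnreservedConcurrentExecutions"] - (new - current)
    if unreserved_after < MIN_UNRESERVED:
        raise ValueError("change would leave fewer than 100 unreserved executions")
    lambda_client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=new,
    )
    return unreserved_after
```

Checking the account settings first means the sketch fails with a clear message rather than relying on the API call to reject the request.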


When you plan to increase the concurrency limit for a function, check whether the functions sharing the unreserved concurrency allocation are getting close to that limit, because increasing reserved concurrency lowers the amount of available unreserved concurrency. You perform the check by accessing CloudWatch’s UnreservedConcurrentExecutions metric in the AWS/Lambda namespace.
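The same check can be run from a script instead of the console. This is a minimal sketch, assuming `cloudwatch_client` behaves like a boto3 CloudWatch client (`boto3.client("cloudwatch")`) and using its `get_metric_statistics` call. The helper name, the 24-hour window, and the 5-minute period are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta, timezone


def peak_unreserved_executions(cloudwatch_client, hours: int = 24,
                               period_seconds: int = 300) -> float:
    """Highest UnreservedConcurrentExecutions value seen in the last `hours`.

    `cloudwatch_client` is assumed to behave like boto3.client("cloudwatch").
    Returns 0 when CloudWatch has no datapoints for the window.
    """
    end = datetime.now(timezone.utc)
    response = cloudwatch_client.get_metric_statistics(
        Namespace="AWS/Lambda",
        MetricName="UnreservedConcurrentExecutions",
        StartTime=end - timedelta(hours=hours),
        EndTime=end,
        Period=period_seconds,
        Statistics=["Maximum"],
    )
    return max((point["Maximum"] for point in response["Datapoints"]), default=0)
```

Compare the returned peak with the size of your unreserved pool: if the two are close, raising a reservation is likely to cause throttling elsewhere.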


If you don’t have unreserved concurrency to spare, compare the reserved amount for each of your other functions with that function’s values for the Invocations metric in CloudWatch. If a function’s invocations never approach its reserved concurrency, it’s overallocated, and you can lower its reservation. This gives you room to increase the allocation for the function you’re troubleshooting.
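That comparison can be sketched as a small helper. It assumes you have already collected, for each function, its reservation and the peak value you observed in CloudWatch; the function names, the numbers, and the 20% safety margin are all made up for illustration.

```python
from typing import Dict


def overallocated(reserved: Dict[str, int], observed_peak: Dict[str, int],
                  headroom_pct: int = 20) -> Dict[str, int]:
    """Return how much concurrency each overallocated function could give back.

    A function counts as overallocated when its observed peak, plus a
    safety margin of `headroom_pct` percent, is still below its reservation.
    """
    reclaimable = {}
    for name, limit in reserved.items():
        peak = observed_peak.get(name, 0)
        # Integer ceiling of peak * (100 + headroom_pct) / 100.
        needed = -(-peak * (100 + headroom_pct) // 100)
        if needed < limit:
            reclaimable[name] = limit - needed
    return reclaimable


spare = overallocated({"checkout": 200, "thumbnails": 100},
                      {"checkout": 50, "thumbnails": 95})
print(spare)  # {'checkout': 140}
```

Here `checkout` peaks at 50, so even with the margin it needs only 60 of its 200 reserved executions, while `thumbnails` is already using nearly all of its reservation.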

Troubleshooting a Function Using Unreserved Concurrency

If your function is drawing from the pool of unreserved concurrent executions, determine what is using up your unreserved concurrency allocation.


To do so, access the UnreservedConcurrentExecutions metric in CloudWatch to determine if you are consistently (or at regular intervals during bursty workloads) bumping up against your unreserved concurrency limit. If you are nearing that limit, you can increase unreserved concurrency by reducing function-level limits, or request a higher account concurrency limit.

The unreserved concurrency pool is being exhausted consistently.

Occasional bursts in invocations suggest one of your functions has a bursty workload.


Otherwise, look for a bursty function that is periodically spiking and using up the unreserved concurrency pool. Check the Invocations metric for each function in CloudWatch and find your bursty function.

In this graph, the function graphed in blue is bursting occasionally and using up the unreserved concurrency pool.
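If you have many functions, eyeballing graphs gets tedious. One rough way to flag candidates is to compare each function's peak with its typical value. The sketch below assumes you have already exported the Invocations values for each function as a list of numbers; the peak-to-median ratio and the threshold of 5 are arbitrary heuristics, not anything AWS defines.

```python
from statistics import median
from typing import Dict, List, Sequence


def burst_ratio(series: Sequence[float]) -> float:
    """Peak-to-median ratio of a series; higher means spikier."""
    if not series or max(series) == 0:
        return 0.0
    baseline = median(series)
    # A zero median with a non-zero peak means the function is idle
    # most of the time and then spikes, which is as bursty as it gets.
    return max(series) / baseline if baseline > 0 else float("inf")


def find_bursty(invocations: Dict[str, Sequence[float]],
                threshold: float = 5.0) -> List[str]:
    """Names of functions whose peak is at least `threshold` times the median."""
    return sorted(name for name, series in invocations.items()
                  if burst_ratio(series) >= threshold)


# Hypothetical per-period Invocations values for two functions.
sample = {"steady": [10, 11, 9, 10, 12], "spiky": [5, 4, 6, 90, 5]}
print(find_bursty(sample))  # ['spiky']
```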


Once you identify a bursty function, you can start by looking at the upstream event source; if that search is unfruitful, you’ll want to do some function-level allocation housekeeping. You have a couple of options:

  • Identify the functions that absolutely cannot be throttled, and reserve concurrency for them.
  • Reserve a portion of your concurrency limit for the bursty function. The reservation caps that function, so your other functions can still use the remaining unreserved capacity without being throttled. The bursty function itself will be throttled when it would otherwise spike.

Increasing Your Account Level Lambda Concurrency Limit

Open a support ticket with AWS to request an increase in your account-level concurrency limit.

  1. Create a new support case.
  2. Set the Regarding value to Service Limit Increase.
  3. Choose Lambda as the Limit Type.
  4. Fill out the body of the form.
  5. Wait for AWS to respond to your request.

AWS can then increase your limit so that you are able to run more Lambda function invocations concurrently.


Blue Matador provides cloud monitoring services that seamlessly integrate with 13 AWS services.

