Function concurrency limits and throttling in AWS Lambda

By Mark Siebert on October 14, 2020

Have you noticed recently that your AWS Lambda invocation requests are getting throttled? If so, your Lambda functions are probably not running as designed. Let’s examine the possible causes of and solutions to poor Lambda performance.

AWS Lambda limits: why you're being throttled

AWS Lambda concurrency limits

In AWS Lambda, a concurrency limit determines how many function invocations can run simultaneously in one region. Each region in your AWS account has a Lambda concurrency limit. The limit applies to all functions in the same region and is set to 1000 by default.

If you exceed a concurrency limit, Lambda starts throttling the offending functions by rejecting requests. Depending on the invocation type, you’ll run into the following situations:

Synchronous invocations

Invocation sources: API Gateway, CloudFront, on-demand invocations

AWS returns a 429 status code. The request is not retried.
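Because Lambda does not retry a throttled synchronous request for you, the caller has to decide whether to try again. The following is a minimal sketch of a client-side retry with exponential backoff and jitter. `ThrottledError` and `invoke_with_backoff` are hypothetical names used for illustration; in real code you would catch whatever exception your client raises for a 429 response.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for the error your client raises on a 429."""


def invoke_with_backoff(invoke, max_attempts=5, base_delay=0.2, sleep=time.sleep):
    """Call invoke(); on ThrottledError wait and retry, up to max_attempts calls."""
    for attempt in range(1, max_attempts + 1):
        try:
            return invoke()
        except ThrottledError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the throttle to the caller
            # Full jitter: wait a random time in [0, base_delay * 2^(attempt - 1)]
            sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))
```

The jitter spreads retries out so that many throttled callers do not all retry at the same instant and get throttled again.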

Asynchronous invocations

Invocation sources: S3, SNS, SES

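When the function itself returns an error, AWS retries the invocation twice. When the invocation is throttled, Lambda returns the event to its internal queue and keeps retrying, with delays between attempts, for up to six hours by default. An event that is still unprocessed after that is discarded, unless you have configured a dead letter queue, in which case it is sent there.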

You can configure a dead letter queue in the console for the function.
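Outside the console, the same setting can be applied through the Lambda API. The sketch below assumes a boto3-style Lambda client is passed in; `set_dead_letter_queue` is a hypothetical helper name, and the target ARN (an SQS queue or SNS topic) is whatever you have created for this purpose.

```python
def set_dead_letter_queue(lambda_client, function_name, target_arn):
    """Point a function's dead letter queue at an SQS queue or SNS topic."""
    return lambda_client.update_function_configuration(
        FunctionName=function_name,
        DeadLetterConfig={"TargetArn": target_arn},
    )
```

With boto3 installed, `lambda_client` would typically be `boto3.client("lambda")`. The function's execution role also needs permission to send to the target.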

Polling, stream-based invocations

Invocation sources: DynamoDB, Kinesis

AWS retries the invocation until it succeeds or the records’ retention period in the stream expires.

Polling, non-stream-based invocations

Invocation sources: SQS

AWS Lambda keeps retrying the message until it is processed and deleted, the queue’s message retention period expires, or the queue’s redrive policy moves it to a dead letter queue.

Whatever the invocation source, repeated throttling by AWS Lambda can mean your requests never actually run, which shows up in production as a bug.

How to configure AWS Lambda limits

How you should change the default concurrency limit depends on how your Lambda functions are configured. We’ll go over both cases: functions that have concurrency reserved for them (function-level concurrency limit configuration) and functions that draw on the unreserved concurrency limit.

What is a function-level concurrency limit configuration?

A function-level concurrency limit is exactly what it sounds like: a number of concurrent executions reserved for a specific function. It’s a configuration option that AWS Lambda makes available to you through the console.

Here’s an example to illustrate how function-level allocations work: If your account limit is 1000 and you reserved 200 concurrent executions for a specific function and 100 concurrent executions for another, the rest of the functions in that region will share the remaining 700 executions.

If you haven’t configured limits for any Lambda functions, jump to unreserved allocation troubleshooting.

Important note: AWS Lambda limits the total amount of concurrency you can reserve across all functions in one region. Lambda requires at least 100 unreserved concurrent executions per account. If you are bumping up against your account limit, see the section on how to increase your AWS Lambda concurrency limits.
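The arithmetic from the example, together with the 100-execution floor, can be sketched as a small check. The function name `unreserved_pool` and the function names in the dictionary are made up for illustration.

```python
MIN_UNRESERVED = 100  # executions that must stay unreserved, per the note above


def unreserved_pool(account_limit, reservations):
    """Return the concurrency left over for functions without a reservation."""
    remaining = account_limit - sum(reservations.values())
    if remaining < MIN_UNRESERVED:
        raise ValueError(
            f"only {remaining} unreserved executions would remain; "
            f"at least {MIN_UNRESERVED} are required"
        )
    return remaining


# The example above: 200 + 100 reserved out of an account limit of 1000.
print(unreserved_pool(1000, {"function-a": 200, "function-b": 100}))  # 700
```

Running a check like this before changing reservations shows how much the shared pool shrinks and whether the change would be rejected.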

Troubleshooting a function with function-level allocation

If you reserve concurrent executions for a specific function, AWS Lambda assumes that you know how many to reserve to avoid performance issues. Functions with allocated concurrency can’t access unreserved concurrency. If your function is being throttled, you can raise the concurrency allocated to the function.
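Raising the allocation can also be done through the Lambda API rather than the console. This is a minimal sketch that assumes a boto3-style Lambda client is passed in; `set_reserved_concurrency` is a hypothetical wrapper name, while `put_function_concurrency` is the underlying API operation.

```python
def set_reserved_concurrency(lambda_client, function_name, limit):
    """Reserve `limit` concurrent executions for one function."""
    if limit < 0:
        raise ValueError("reserved concurrency cannot be negative")
    return lambda_client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=limit,
    )
```

With boto3 installed, a call might look like `set_reserved_concurrency(boto3.client("lambda"), "my-function", 250)`, where the function name and the value 250 are placeholders.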

When you plan to increase the concurrency limit for a function, first check whether the functions sharing the unreserved concurrency allocation are getting close to that limit, because increasing reserved concurrency lowers the amount of unreserved concurrency available. To check, look at CloudWatch’s UnreservedConcurrentExecutions metric in the AWS/Lambda namespace.
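If you would rather script this check than read the graph, one possible approach is to pull the metric's maximum over a recent window. The sketch assumes a boto3-style CloudWatch client is passed in; the helper name, the 24-hour window, and the 5-minute period are illustrative choices, not recommendations from AWS.

```python
from datetime import datetime, timedelta, timezone


def peak_unreserved_concurrency(cloudwatch, hours=24, period=300):
    """Return the highest UnreservedConcurrentExecutions datapoint, or None."""
    end = datetime.now(timezone.utc)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/Lambda",
        MetricName="UnreservedConcurrentExecutions",
        StartTime=end - timedelta(hours=hours),
        EndTime=end,
        Period=period,  # seconds covered by each datapoint
        Statistics=["Maximum"],
    )
    peaks = [point["Maximum"] for point in response["Datapoints"]]
    return max(peaks) if peaks else None
```

With boto3 installed, `cloudwatch` would typically be `boto3.client("cloudwatch")`. Compare the result with your unreserved limit (account limit minus all reservations) to see how much room is left.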

If you don’t have unreserved concurrency to spare, compare the reserved amount for each of your other functions with that function’s Invocations metric in CloudWatch. If a function’s invocations never approach its reserved concurrency, it’s overallocated, and you can lower its reservation. This gives you room to increase the allocation for the function you’re troubleshooting.
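As a rough sketch of that comparison, suppose you have already collected each function's reservation and its observed peak from CloudWatch. The function names, the numbers, and the 20% safety margin below are all assumptions made for illustration.

```python
import math


def overallocated(reserved, peaks, headroom_pct=20):
    """Return {function: spare} where the peak plus headroom is below the reservation."""
    spare = {}
    for name, limit in reserved.items():
        needed = math.ceil(peaks.get(name, 0) * (100 + headroom_pct) / 100)
        if needed < limit:
            spare[name] = limit - needed
    return spare


reserved = {"orders": 200, "thumbnails": 100}  # hypothetical reservations
peaks = {"orders": 40, "thumbnails": 95}       # hypothetical observed peaks
print(overallocated(reserved, peaks))  # {'orders': 152}
```

Here `orders` could give up as many as 152 reserved executions, while `thumbnails` is already close to its reservation and should be left alone.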

Troubleshooting a function using unreserved concurrency

If your function is drawing from the pool of unreserved concurrent executions, determine what is using up your unreserved concurrency allocation.

To do so, access the UnreservedConcurrentExecutions metric in CloudWatch to determine if you are consistently (or at regular intervals during bursty workloads) bumping up against your unreserved concurrency limit. If you are nearing that limit, you can increase unreserved concurrency by reducing function-level limits, or request a higher account concurrency limit.

The unreserved concurrency pool is being exhausted consistently.

Occasional bursts in invocations suggest one of your functions has a bursty workload.

Otherwise, look for a bursty function that is periodically spiking and using up the unreserved concurrency pool. Check the Invocations metric for each function in CloudWatch and find your bursty function.
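If you have many functions, eyeballing every graph gets tedious. One simple heuristic, offered here only as a sketch, is to rank functions by the ratio of their largest Invocations datapoint to their median datapoint. The function names and sample series below are invented for illustration.

```python
from statistics import median


def burstiness(invocations):
    """Ratio of the largest datapoint to the median datapoint (0.0 if idle)."""
    if not invocations or max(invocations) == 0:
        return 0.0
    baseline = median(invocations)
    return max(invocations) / baseline if baseline else float("inf")


def most_bursty(series_by_function):
    """Return the name of the function with the highest burstiness ratio."""
    return max(series_by_function, key=lambda name: burstiness(series_by_function[name]))


series = {
    "steady-api": [50, 52, 49, 51, 50],   # hypothetical Invocations datapoints
    "nightly-report": [2, 1, 400, 3, 2],
}
print(most_bursty(series))  # nightly-report
```

A high ratio only flags a candidate; confirm on the graph that the spikes line up with the times the unreserved pool was exhausted.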

In this graph, the function graphed in blue is bursting occasionally and using up the unreserved concurrency pool.

Once you identify a bursty function, start by looking at the upstream event source; if that search is unfruitful, you’ll want to do some function-level allocation housekeeping. You have a couple of options:

  • Identify the functions that absolutely cannot be throttled, and reserve concurrency for them.
  • Reserve a portion of your concurrency limit for the bursty function. Because a reservation also caps the function, it will be throttled when it would otherwise spike, and your other functions can keep using the remaining unreserved capacity without being throttled.

How to increase your AWS Lambda concurrency limits

Open a support ticket with AWS to request an increase in your account-level concurrency limit.

  1. Create a new support case.
  2. Set the Regarding value to Service Limit Increase.
  3. Choose Lambda as the Limit Type.
  4. Fill out the body of the form.
  5. Wait for AWS to respond to your request.

AWS can then increase your limit so that you can run more Lambda function invocations concurrently.

Be the first to know of throttling, errors, invocations, timeouts, etc., with your Lambda functions. Blue Matador is the fastest and easiest way to monitor your AWS environment. It's like having a DevOps engineer that doesn't sleep.
