Our comprehensive, automated AWS monitoring provides peace of mind 

Below are the services, metrics and events currently monitored by Blue Matador. New integrations and monitorable events are being developed every day, so please check back often. If you're looking for a specific monitored service or integration that is not listed, please get in touch and we will prioritize it.



Certificate Expiring
Upcoming SSL Certificate Expiration

AWS Autoscaling


AutoScaling Capacity
Number of launched servers is less than number needed 



AWS EC2

AWS Event
Upcoming scheduled maintenance events

EC2 Credit Balance
CPU credit usage and credits running low 

EC2 Instance Limit
Approaching limit of instances per region 

EC2 Status Check
Automatic health checks on instance configuration and underlying hardware 

Disk IO
Unexpected changes in disk IOPS 

Disk Latency
Deviations in disk latency 

Network IO
Rates of bytes and packets on the network 


AWS Elastic Beanstalk

Beanstalk Events
Negative events due to environment changes

Beanstalk Health
Automatic health status of your Beanstalk environments 

Beanstalk Latency
Increase in application latency 

Beanstalk Environment Pending
Bootstrapping process stuck for a significant amount of time 

Beanstalk Requests
Inconsistent spike or drop in the ApplicationRequestsTotal metric


AWS Cloudfront

Cloudfront 4xx
Unhealthy percentage of requests that result in a 4xx response 

Cloudfront 5xx
Unhealthy percentage of responses with 5xx response codes 

Cloudfront Request Count
Anomalous spike in request counts 

Cloudfront Data Transfer
Unexpected increase in data transferred to CloudFront 


Linux / Windows

CPU Iowait
I/O wait time – CPU idle time with an outstanding disk I/O operation requested 

CPU Steal
Time spent waiting for a real CPU while the hypervisor services another virtual processor

CPU System
Unhealthy level of system CPU (i.e. processes switched to kernel space) 

Disk Inodes
Running out of disk inodes 

Disk Space
Approaching disk space limitations

High Load
Prolonged or abnormally high normalized load 

Dropped Packets
Unhealthy number of dropped packets 

Network Errors
Network errors over a sustained period of time 

Open Files Ulimit
Unable to open files or network sockets 

Threads Ulimit
Unable to spawn new sysV threads

When the server runs out of RAM, the OS uses swap space as memory

Server Time Drift
Server’s time does not match an authoritative time source

Server Unresponsive
Heartbeat created between server and Blue Matador's agent 

Disk IO
Unexpected changes in disk IOPS 

Disk Latency
Deviations in disk latency 

Network IO
Rates of bytes and packets on the network 


AWS DynamoDB

DynamoDB Capacity
Risk of throttling due to insufficient provisioned capacity

DynamoDB Errors
User errors (HTTP 4xx status codes) or system errors (HTTP 5xx status codes)

DynamoDB Latency
Amount of time successful requests take 

DynamoDB Throttles
Unusual throttling due to partition capacity 



AWS EBS

EBS Burst Balance
Approaching burst balance capacity 

EBS IOPS Consumed
Nearing the volume's IOPS limit

EBS Queue Length
Anomalies on the average queue length of a volume 

EBS IOPS Throughput
Lower IOPS throughput than expected 

EBS Volume State
EBS volume in Error state 

EBS Volume Status
Volume state becomes Warning or Impaired



AWS ELB

ELB 400s
Detection of anomalous 400s 

ELB 500s
Detection of anomalous 500s 

ELB Backend Errors
Lack of connection between the load balancer and the host 

ELB Bytes Processed
Anomalies with number of bytes processed 

ELB Unhealthy Hosts
Fewer available targets that can receive traffic than expected 

ELB Latency
Increase in latency 

ELB No Registered Hosts
No registered instances in load balancer 

ELB Region Limits
Nearing per-region limits

ELB Request Count
Anomalous increase/decrease in request count 

ELB Surge Queue
Unusually high surge queue length


AWS Kinesis

Kinesis Incoming Records
Number of records put to a Kinesis stream 

Kinesis Throttling
Throttling due to exceeding limits


AWS Lambda

Lambda Dead Letter Errors
Errors sending the event payload to the dead letter queue 

Lambda Function Duration
Anomaly detection when function duration changes 

Lambda Errors
Lambda function results in an error 

Lambda Invocations
Fluctuations in function invocations 

Lambda Iterator Age
Time between when a record is written to the stream and when Lambda reads it 

Lambda Throttling
Throttling due to concurrency limit 



Kubernetes

Kubernetes API Health
Health checks of Kubernetes API 

Kubernetes Component Statuses
Health status of essential Kubernetes components 

Kubernetes DaemonSet Unhealthy
Unhealthy number of DaemonSets 

Kubernetes Deployment Unhealthy
Monitoring for pod scheduling and life cycles 

Kubernetes Job Failed
Jobs running in cluster fail 

Kubernetes Node Conditions
Health of several key node metrics 

Kubernetes Node Pod Capacity
Approaching pod limit for each node 

Kubernetes Node Resources
Capacity tracking for CPU and memory allocated to each pod 

Kubernetes Failed Volume Mount
Failed start due to inability of pod to mount all volumes 

Kubernetes OOM Containers
Container out of memory 

Kubernetes Container Restarts
Restarting containers 

Kubernetes Pod Terminating
Pod stuck in terminating state 

Kubernetes Pod Pending
Pod could not be scheduled 

Kubernetes Waiting Containers
Pod stuck in pending state

Kubernetes Service Without Endpoints
Service has no defined endpoints



AWS RDS

RDS Cluster Status
Health status of Aurora cluster 

RDS Commit Latency
Database query latency (commit) 

RDS Select Latency
Database query latency (select) 

RDS Commit Throughput
Anomalous number of commit operations database is handling 

RDS Select Throughput
Anomalous number of select operations database is handling 

RDS Connections
Nearing maximum connections to the database 

RDS CPU Utilization
Percent of instance's CPU being consumed 

RDS Deadlocks
Transactions hold locks that other transactions require 

Anomalous disk throughput 

RDS Free Memory
Amount of unused memory on a database instance 

RDS Instance Status
Health of RDS database instance 

RDS Network IO
Anomalous network traffic to and from each DB instance 

RDS Replica Lag
How far an Aurora replica’s data is behind the data in the primary instance 

RDS Restore Time
Latest point in time from which a copy of the database can be created falls behind

RDS Event
Upcoming scheduled maintenance events 


AWS Route53

Route53 Domain Expiring
Notifications when domains are 30 days from expiration 

Route53 Health Check
Health of web applications, CloudWatch alarms, or even other health checks 

Route53 Zone NS
Check for the configured NS Record for a hosted zone against the public DNS lookup for the domain 



AWS S3

S3 Cors Changed
Change in CORS configuration 

S3 Policy Changed
Changes in the bucket policy 

S3 Replication Bucket
Bucket is configured to replicate to a bucket that does not exist 

S3 Website Changed
Changes to static website configuration



AWS SES

SES Bounces
Bounce rate compared to acceptable values 

SES Complaints
Complaint rates exceeding acceptable values 

SES Domain
Verification status and DKIM settings for all of your verified SES domains 

SES Quota
Per-second and per-day message sending limits

SES Rejects
Rate of rejected messages 

SES Sends
Number of sends compared to number of deliveries  



AWS SNS

SNS Failed Notifications
Message failed to be sent to subscriber 

SNS Published Messages
Unexpected number of messages published



AWS SQS

SQS Nonempty Dead Letter Queue
Dead letter queue is nonempty, indicating failed messages

SQS Dead Letter Retention
Retention period of the dead letter queue relative to the retention period of its source queue

SQS Delay Retention
Configured message delay relative to the message retention period

SQS Inflight Messages Limit
Messages received by a consumer but not yet deleted 

SQS FIFO Operations Limit
FIFO queues are nearing the per second limit 

SQS Max Receives
Configuration of maximum receives 

SQS Message Size
Approaching documented limits 

SQS Messages Sent
Anomalies in the number of messages sent to an SQS Queue



AWS ECS

ECS Cluster Resource Utilization
Approaching 100% CPU utilization or memory utilization 

ECS Service Resource Utilization
Warning when approaching 100% utilization 

ECS Running Tasks
Fewer running tasks than expected

ECS Task Connectivity
Whether the task is connected to ECS

ECS Task Health
Health of running tasks

ECS Task Stopped
Task unexpectedly stops

ECS Task Pending
Tasks taking a significant amount of time to enter the running state

Blue Matador has multiple dashboards for proactive monitoring
