Blue Matador monitors the Percentage CPU metric on your Azure VMs to create events when CPU utilization is anomalous or consistently high.
The Percentage CPU metric is measured as a percentage of your VM's provisioned CPU capacity and may not match the CPU usage reported by your guest OS. For example, if you have a VM with 8 vCPUs provisioned and a process fully utilizing 6 of them, your Percentage CPU would be 75%, but Linux’s top command would show 600% CPU usage for that process.
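The conversion in the example above is a single division. A minimal sketch, assuming the guest OS reports usage summed across cores (as top does by default on Linux); the function name is hypothetical and not part of any Azure or Blue Matador API:

```python
def guest_to_azure_cpu_percent(guest_cpu_percent: float, vcpu_count: int) -> float:
    """Convert core-summed CPU usage (top-style) to a share of total VM capacity."""
    if vcpu_count <= 0:
        raise ValueError("vcpu_count must be positive")
    # 100% in the guest view means one fully used vCPU, so divide by the vCPU count.
    return guest_cpu_percent / vcpu_count


# The example from the text: 600% in top on an 8 vCPU VM.
print(guest_to_azure_cpu_percent(600, 8))  # 75.0
```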
Anomalous CPU usage can be the result of a change in traffic patterns going to the VM, a system update, or a change in the performance of a downstream API. Sustained CPU utilization near 100% can be acceptable for some applications, but will cause issues with others: your VM may exhibit longer response times or fail to accept connections.
To fix high CPU utilization, you must first determine what is using the CPU. The best way to do this is to connect to the VM via SSH or Remote Desktop and inspect the process list. If a rogue process is consuming CPU, you can terminate it; if OS updates are causing issues, you can try scheduling them for a time when they will have less impact on your system.
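On a Linux VM, inspecting the process list can be scripted. The sketch below is one possible approach, assuming the standard ps utility is available; the helper names are hypothetical. Note that ps reports CPU usage averaged over each process's lifetime, so an interactive tool such as top is usually better for catching short spikes.

```python
import subprocess


def read_process_list() -> str:
    """Return the raw output of `ps -eo pid,pcpu,comm` (Linux assumed)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def top_cpu_processes(ps_output: str, limit: int = 5):
    """Parse ps output and return (pid, cpu_percent, command) tuples, heaviest first."""
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)  # pid, %cpu, command name
        if len(parts) < 3:
            continue
        pid, pcpu, command = parts
        try:
            rows.append((int(pid), float(pcpu), command.strip()))
        except ValueError:
            continue  # ignore rows that do not parse as numbers
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]
```

On the VM itself, `top_cpu_processes(read_process_list())` would return the five processes with the highest reported CPU usage.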
If your application is the cause of the increased CPU utilization, it could take considerable effort to debug and find the issue. In the meantime, you can increase capacity temporarily while your developers investigate. If your VMs are receiving load-balanced traffic in an Availability Set, you can add more VMs to receive traffic, reducing the share of work each VM has to perform. Otherwise, you can resize your VMs to a size that has more vCPUs.
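To get a rough idea of how many VMs to add, you can estimate from the current average utilization. This is a back-of-the-envelope sketch that assumes traffic is spread evenly across VMs and that CPU usage scales linearly with traffic, which real workloads only approximate; the function name is hypothetical.

```python
import math


def vms_needed(current_vms: int, current_cpu_percent: float, target_cpu_percent: float) -> int:
    """Estimate the VM count that brings average CPU down to the target."""
    if current_vms <= 0:
        raise ValueError("current_vms must be positive")
    if not 0 < target_cpu_percent <= 100:
        raise ValueError("target_cpu_percent must be in (0, 100]")
    # Total work, expressed in "percent of one VM", stays constant as VMs are added.
    total_load = current_vms * current_cpu_percent
    return max(1, math.ceil(total_load / target_cpu_percent))


# 4 VMs averaging 90% CPU, aiming for 60% per VM.
print(vms_needed(4, 90, 60))  # 6
```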
For memory-intensive applications running on the JVM, high CPU usage can actually be a symptom of high memory usage: when the heap is nearly full, the garbage collector runs more often and consumes CPU. Check your JVM configuration, especially heap size settings such as HEAP_SIZE, and see whether the CPU usage is caused by JVM garbage collection.
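One way to check whether garbage collection is responsible is to sample the JDK's `jstat -gcutil <pid>` output twice and compare the cumulative GC time (the GCT column). The sketch below assumes jstat is installed alongside the JVM and that its output includes a GCT column; the helper names are hypothetical, and the result is only a rough indicator, since collectors differ in how much of their work is concurrent.

```python
def gc_time_seconds(jstat_output: str) -> float:
    """Return cumulative GC time (GCT column) from `jstat -gcutil <pid>` output."""
    lines = [line for line in jstat_output.splitlines() if line.strip()]
    header = lines[0].split()
    values = lines[-1].split()  # use the most recent sample row
    return float(values[header.index("GCT")])


def gc_overhead_percent(first_sample: str, second_sample: str, interval_seconds: float) -> float:
    """Estimate the share of wall-clock time spent in GC between two samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    gc_delta = gc_time_seconds(second_sample) - gc_time_seconds(first_sample)
    return 100.0 * gc_delta / interval_seconds
```

If the overhead stays high over several samples, the heap is likely too small for the workload, or the application is retaining more memory than expected.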