Most application processing happens in user space. Processes switch to kernel space to manage memory, make system calls, access drivers, perform I/O, and manage shared resources. The time spent in kernel space is reported as system CPU (Linux) or privileged CPU (Windows).

    To see kernel-space CPU time on Linux, run top and look at the sy value in the %Cpu(s) line (mpstat labels the same figure %sys). On Windows, open the Performance tab of Task Manager, then enable the menu item View > Show Kernel Times.
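
    On Linux, the same figure can be derived from the counters in /proc/stat. The following is a minimal sketch, assuming the standard field order of the aggregate cpu line; the function names parse_cpu_line, system_cpu_percent, and sample_system_cpu are illustrative and do not come from any existing tool.

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def system_cpu_percent(before, after):
    """Percentage of elapsed CPU time spent in kernel mode between two samples."""
    # Field order: user, nice, system, idle, iowait, irq, softirq, steal.
    # Later fields (guest, guest_nice) are already counted in user and nice.
    deltas = [new - old for old, new in zip(before[:8], after[:8])]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[2] / total


def sample_system_cpu(interval=1.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart (Linux only)."""
    with open(path) as f:
        before = parse_cpu_line(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_line(f.read())
    return system_cpu_percent(before, after)
```

    Calling sample_system_cpu() on a Linux host returns the system CPU share over the sampling interval. Because the counters are cumulative since boot, a single reading is not meaningful on its own; only the difference between two samples is.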

    Every application needs to run in kernel space at least part of the time, to manage memory if nothing else. However, each application and system has its own threshold beyond which the level of system CPU becomes unhealthy.



    When a system reaches this unhealthy threshold, it’s difficult to recover, and you’ll experience these symptoms:

    • Processes use more CPU yet respond more sluggishly than normal
    • Response times for queries and API calls increase
    • The server's load average rises even though traffic is at its usual level


    Quick Fix

    Restart the processes taking up the most CPU; they are the most likely culprits. Before you do, make sure you’re not killing processes that are single points of failure. If a restart isn’t an option, add more capacity.


    Thorough Fix

    Isolate the process that is using the most system CPU. Then use wt (a WinDbg command, on Windows) or strace (Linux) to identify which system calls it makes most often; for example, strace -c -p <pid> prints a per-call summary when you detach. Fix the code, the configuration, or both so that the process makes fewer calls. If that’s not possible, make each call do more work, for instance by batching many small reads or writes into fewer large ones.
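
    One common way to make fewer calls is to accumulate small writes in user space and hand them to the kernel in larger chunks. The following is a minimal sketch of that idea, not a fix for any particular application; the function names and the 64 KiB batch size are illustrative assumptions, and the records are dummy data.

```python
import os
import tempfile


def write_all(fd, data):
    """Write all of `data`, returning how many write() calls it took."""
    calls = 0
    while data:
        written = os.write(fd, data)  # each os.write is one system call
        data = data[written:]
        calls += 1
    return calls


def write_unbatched(fd, records):
    """Issue at least one write() system call per record."""
    return sum(write_all(fd, record) for record in records)


def write_batched(fd, records, batch_bytes=64 * 1024):
    """Buffer records in user space and flush them in large chunks."""
    calls = 0
    buffer = bytearray()
    for record in records:
        buffer += record
        if len(buffer) >= batch_bytes:
            calls += write_all(fd, bytes(buffer))
            buffer.clear()
    if buffer:
        # Flush whatever is left after the last full batch.
        calls += write_all(fd, bytes(buffer))
    return calls


# Dummy workload: 10,000 small records of 100 bytes each.
records = [b"x" * 100 for _ in range(10_000)]

with tempfile.TemporaryFile() as f:
    unbatched_calls = write_unbatched(f.fileno(), records)

with tempfile.TemporaryFile() as f:
    batched_calls = write_batched(f.fileno(), records)

print("write() calls without batching:", unbatched_calls)
print("write() calls with batching:", batched_calls)
```

    Both versions write the same bytes, but the batched version crosses into kernel space far less often. The trade-off is that buffered data is lost if the process dies before a flush, so the batch size should reflect how much unwritten data the application can afford to lose.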