How to Diagnose and Fix High CPU Utilization in Linux

By Jay


High CPU utilization in Linux systems can be a frustrating challenge, especially when it impacts the performance of critical applications or services. Whether you’re a system administrator or a curious tech enthusiast, understanding how to identify and address this issue is crucial for maintaining a stable and efficient system.

In this guide, we’ll walk you through the common causes of high CPU usage in Linux, the tools you can use to diagnose the problem, and practical steps to resolve it. By the end, you’ll have the knowledge not only to troubleshoot high CPU utilization in Linux but also to implement strategies to prevent it in the future. Let’s dive in!

High CPU utilization can have many causes, so let’s begin with some troubleshooting steps to find the reason behind it.

You will typically face one of two scenarios:

  1. The system is currently experiencing high CPU usage.
  2. You need to investigate the cause of high CPU usage that occurred at a specific time, such as during the past x days or y hours.

For now, let’s focus on diagnosing a system that is currently utilizing high CPU resources.

Start by running the top command in your terminal. This tool provides a real-time overview of system processes. To better analyze the situation, arrange the processes in descending order of CPU usage, allowing you to quickly identify the culprits consuming the most CPU power.

A complete list of top commands and options is available in its manual page (man top).

[root@TechArticles:~]# top
top - 23:10:40 up 19:45,  0 users,  load average: 3.88, 1.96, 0.77
Tasks:  49 total,   5 running,  44 sleeping,   0 stopped,   0 zombie
%Cpu(s): 49.9 us,  0.2 sy,  0.0 ni, 49.3 id,  0.0 wa,  0.0 hi,  0.6 si,  0.0 st
MiB Mem :  25177.0 total,  23979.4 free,    228.1 used,    969.5 buff/cache
MiB Swap:   7168.0 total,   7168.0 free,      0.0 used.  24230.5 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 237837 jay       20   0    6020   3208   1364 R 100.0   0.0   3:09.29 bash
 237838 jay       20   0    6020   3208   1364 R 100.0   0.0   3:09.29 bash
 237839 jay       20   0    6020   3208   1364 R 100.0   0.0   3:09.29 bash
 237840 jay       20   0    6020   3208   1364 R 100.0   0.0   3:09.29 bash
      1 root      20   0  167996  12156   9604 S   0.3   0.0   2:00.19 systemd
     21 root      20   0  439208 266272 264812 S   0.3   1.0   2:29.85 systemd-journal
    398 dbus      20   0    4856   2836   2548 S   0.3   0.0   0:21.79 dbus-broker
     30 root      20   0  173356  24428  18204 S   0.0   0.1   0:05.82 php-fpm

Press “P” (Shift + p) to sort the processes by CPU utilization, with the highest usage at the top.

In this example, the user jay is running four bash processes, each consuming 100% of a single CPU core. Because top reports each process’s %CPU relative to one core while the %Cpu(s) line is averaged across all cores, the overall user CPU utilization shows only 49.9% and the load average is 3.88.
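The relationship between the per-process and overall figures is simple arithmetic. As a minimal sketch, assuming top is in its default mode (each process's %CPU is relative to one core) and that the system has 8 logical CPUs, as nproc reports later in this guide, the helper below divides the summed per-process usage by the CPU count. The name overall_cpu_pct is hypothetical and used only for illustration.

```shell
# overall_cpu_pct SUM_OF_PROCESS_PCT CPU_COUNT
# Hypothetical helper: converts summed per-process %CPU (100 = one full core)
# into an approximate system-wide percentage.
overall_cpu_pct() {
    awk -v sum="$1" -v cpus="$2" 'BEGIN { printf "%.1f\n", sum / cpus }'
}

# Four bash processes at 100% each on an 8-CPU system:
overall_cpu_pct 400 8    # prints 50.0
```

The result is close to the 49.9 us value that top shows on its %Cpu(s) line.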

Having pinpointed this as a cause of high CPU utilization in Linux, you can conclude your investigation. Inform the customer that the elevated CPU usage is due to the user jay executing a resource-intensive bash command, which is causing the current CPU load on the server.
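If you need a snapshot to attach to a ticket rather than an interactive session, something like the following may help. It is a sketch that assumes the procps version of ps (for the --sort option); top_cpu is a hypothetical helper name. Note that the pcpu value from ps is averaged over each process's lifetime, so it can differ from the near-instantaneous figure that top displays.

```shell
# top_cpu [N]: print a header line plus the N processes with the
# highest %CPU (default 10), sorted in descending order.
top_cpu() {
    ps -eo pid,user,pcpu,pmem,etime,comm --sort=-pcpu | head -n "$(( ${1:-10} + 1 ))"
}

top_cpu 5
```

Redirecting the output to a file (for example, top_cpu 10 > cpu_snapshot.txt) gives you a record of the culprits at that moment.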

To troubleshoot further, follow the steps below:

  • Compare Load Average with CPU Core Count
    If the load average exceeds the total number of CPU cores available on the server and the CPU utilization is close to 100%, the server is clearly under significant load. This is a strong indicator of high CPU utilization in Linux that requires further investigation.
  • Check the Current Load Average
    Run the uptime or top command to check the system’s load average. The load average is the average number of processes that are running or waiting for CPU time (on Linux it also counts processes in uninterruptible sleep, such as those waiting on disk I/O) over the last 1, 5, and 15 minutes.
[root@TechArticles:~]# top
top - 23:30:09 up 20:04,  0 users,  load average: 8.39, 7.52, 4.91
Tasks:  66 total,   9 running,  57 sleeping,   0 stopped,   0 zombie
%Cpu(s): 99.5 us,  0.3 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.2 si,  0.0 st
MiB Mem :  25177.0 total,  23456.8 free,    632.6 used,   1087.6 buff/cache
MiB Swap:   7168.0 total,   7168.0 free,      0.0 used.  23797.8 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 237839 jay       20   0    6020   3208   1364 R  99.7   0.0  22:35.74 bash
 238864 jay       20   0    6020   1844      0 R  99.3   0.0   7:55.92 bash
 237837 jay       20   0    6020   3208   1364 R  99.0   0.0  22:35.11 bash
 237840 jay       20   0    6020   3208   1364 R  99.0   0.0  22:35.43 bash
 238862 jay       20   0    6020   1844      0 R  99.0   0.0   7:55.69 bash
 238863 jay       20   0    6020   1844      0 R  99.0   0.0   7:54.89 bash
 237838 jay       20   0    6020   3208   1364 R  98.3   0.0  22:35.12 bash
 238865 jay       20   0    6020   3208   1364 R  97.4   0.0   7:55.44 bash
 239293 mysql     20   0 2434840 413420  35944 S   1.0   1.6   0:04.67 mysqld
      1 root      20   0  168152  12240   9604 S   0.7   0.0   2:02.34 systemd
     21 root      20   0  447400 273640 272168 S   0.3   1.1   2:33.38 systemd-journal
    398 dbus      20   0    4980   2852   2548 S   0.3   0.0   0:22.06 dbus-broker
 239803 root      20   0    7872   3784   3184 R   0.3   0.0   0:00.03 top

According to the top command output:

  • The current load average is 8.39.
  • The %Cpu(s) value is 99.5, indicating that the CPU is almost fully utilized.

To determine if this load is significant, you need to check the number of CPU cores on the system.

[root@TechArticles:~]# nproc
8
[root@TechArticles:~]#

or

lscpu | grep '^CPU(s):'
grep -c '^processor' /proc/cpuinfo

About nproc: The nproc command is a Linux/Unix utility that is used to display the number of processing units available on the system. This can include physical CPUs, cores, and/or hyperthreads. The command simply prints the number of processing units to standard output and exits.
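Because nproc counts logical processing units, hyperthreads are included in its total. As a rough sketch, assuming lscpu from util-linux is installed, you can count unique core/socket pairs to estimate the number of physical cores. The helper name count_physical_cores is hypothetical.

```shell
# count_physical_cores: reads "lscpu -p=CORE,SOCKET" output on stdin and
# counts the unique core/socket pairs (comment lines start with '#').
count_physical_cores() {
    grep -v '^#' | sort -u | wc -l
}

if command -v lscpu >/dev/null 2>&1; then
    echo "logical CPUs:   $(nproc)"
    echo "physical cores: $(lscpu -p=CORE,SOCKET | count_physical_cores)"
fi
```

On a system with hyperthreading enabled, the first number is typically twice the second.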

Here the load average (8.39) exceeds the 8 processing units reported by nproc and the CPU utilization is near 100%, which confirms that the server is experiencing high CPU utilization in Linux and might require immediate attention.

Knowing the core count helps you assess whether the current load is normal for the server’s capacity or whether an underlying issue is causing the excessive CPU usage.
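The comparison between load average and CPU count can be scripted. The sketch below assumes a Linux system with /proc/loadavg and the nproc command; load_per_cpu is a hypothetical helper name.

```shell
# load_per_cpu LOAD CPUS: print the load average divided by the CPU count.
load_per_cpu() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

# Live values: the first field of /proc/loadavg is the 1-minute load average.
load1=$(cut -d ' ' -f 1 /proc/loadavg)
cpus=$(nproc)
ratio=$(load_per_cpu "$load1" "$cpus")
echo "1-min load $load1 on $cpus CPUs = $ratio per CPU"

# A ratio above 1.00 means more runnable tasks than CPUs.
if awk -v r="$ratio" 'BEGIN { exit !(r > 1) }'; then
    echo "CPU is oversubscribed"
fi
```

With the values from this example, load_per_cpu 8.39 8 gives 1.05, which is just above the point where every CPU is busy.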

Would you recommend upgrading the CPU at this point? Not so fast—it’s too early to suggest increasing CPU capacity just yet.

Before jumping to conclusions, let’s dig deeper into the issue. We’ll use the sar command to analyze historical CPU utilization and identify any patterns or anomalies over time.

For more about the sar command and its various uses, see its manual page (man sar).

By examining historical data, we can make a more informed assessment before advising any hardware upgrades.

[root@TechArticles:~]# sar -u -1
Linux 4.18.0-372.9.1.el8.x86_64 (TechArticles)  03/19/2023   _x86_64_        (8 CPU)
09:03:48        CPU     %user     %nice   %system   %iowait    %steal     %idle
09:10:00        all      0.16      0.00      0.18      0.02      0.00     99.64
09:20:03        all      0.09      0.00      0.12      0.06      0.00     99.73
09:30:02        all      0.11      0.00      0.14      0.15      0.00     99.60
09:40:03        all      0.07      0.00      0.10      0.02      0.00     99.80
09:50:00        all      0.08      0.10      0.14      0.02      0.00     99.67
10:00:03        all      0.08      0.00      0.10      0.01      0.00     99.80
10:10:04        all      0.08      0.00      0.11      0.01      0.00     99.80
10:20:02        all      0.09      0.85      2.37      0.01      0.00     96.68
10:30:04        all      0.09      0.03      0.17      0.01      0.00     99.71
10:40:01        all      0.08      0.00      0.11      0.01      0.00     99.80
10:50:04        all      0.09      0.00      0.12      0.01      0.00     99.79
11:00:01        all      0.10      0.00      0.12      0.01      0.00     99.77
11:10:03        all      0.10      0.00      0.13      0.01      0.00     99.77
11:20:00        all      0.10      0.00      0.12      0.01      0.00     99.77
11:30:03        all      0.12      0.00      0.15      0.01      0.00     99.72
[...]
20:30:03        CPU     %user     %nice   %system   %iowait    %steal     %idle
20:40:01        all      0.11      0.00      0.15      0.01      0.00     99.73
20:50:04        all      0.13      0.00      0.16      0.01      0.00     99.70
21:00:01        all      0.12      0.00      0.15      0.01      0.00     99.72
21:10:03        all      0.15      0.00      0.18      0.01      0.00     99.66
22:07:30        all      0.15      0.00      0.20      0.19      0.00     99.46
22:10:00        all      0.17      0.00      0.20      0.01      0.00     99.61
22:20:01        all      0.08      0.00      0.11      0.04      0.00     99.77
22:30:01        all      0.08      0.00      0.10      0.01      0.00     99.81
22:40:03        all      0.07      0.00      0.10      0.00      0.00     99.82
22:50:00        all      0.07      0.01      0.11      0.00      0.00     99.81
23:00:03        all      0.07      0.00      0.11      0.02      0.00     99.80
23:10:00        all     12.63      0.00      0.29      0.00      0.00     87.08
23:20:03        all     49.76      0.00      0.90      0.02      0.00     49.32
23:30:01        all     88.77      0.00      0.69      0.00      0.00     10.54
23:40:04        all     89.52      0.00      0.45      0.01      0.00     10.03
23:50:01        all     62.54      0.00      0.45      0.03      0.00     36.98
Average:        all      4.21      0.01      0.20      0.01      0.00     95.56

From the sar report, we can observe that idle CPU dropped to about 87% in the interval ending at 23:10 and fell to roughly 10% by 23:30, likely due to a user running a resource-intensive bash script during that time.

To gain deeper insights, let’s analyze additional historical data to determine if high CPU utilization in Linux consistently occurs around this time or if there are other time periods when the CPU usage spikes.

By identifying patterns in CPU usage, we can pinpoint the root cause and take appropriate actions to resolve the issue effectively.

[root@TechArticles:~]# sar -u -3
Linux 4.18.0-372.9.1.el8.x86_64 (TechArticles)  03/18/2023   _x86_64_        (8 CPU)
09:03:48        CPU     %user     %nice   %system   %iowait    %steal     %idle
09:10:00        all      0.16      0.00      0.18      0.02      0.00     99.64
09:20:03        all      0.09      0.00      0.12      0.06      0.00     99.73
09:30:02        all      0.11      0.00      0.14      0.15      0.00     99.60
09:40:03        all      0.07      0.00      0.10      0.02      0.00     99.80
09:50:00        all      0.08      0.10      0.14      0.02      0.00     99.67
10:00:03        all      0.08      0.00      0.10      0.01      0.00     99.80
10:10:04        all      0.08      0.00      0.11      0.01      0.00     99.80
10:20:02        all      0.09      0.85      2.37      0.01      0.00     96.68
10:30:04        all      0.09      0.03      0.17      0.01      0.00     99.71
[...]
13:40:00        CPU     %user     %nice   %system   %iowait    %steal     %idle
13:50:01        all      0.14      0.00      0.19      0.01      0.00     99.67
14:00:00        all      0.18      0.00      0.21      0.00      0.00     99.60
14:10:01        all      0.29      0.01      0.32      0.02      0.00     99.36
14:20:01        all      0.23      0.00      0.37      0.02      0.00     99.37
14:30:01        all      0.24      0.00      0.35      0.01      0.00     99.41
14:40:01        all      0.32      0.00      0.58      0.01      0.00     99.10
14:50:00        all      0.34      0.00      0.78      0.01      0.00     98.87
15:00:02        all      0.36      0.00      0.65      0.01      0.00     98.98
22:13:00        all      0.56      0.00      1.60      0.11      0.00     97.74
22:20:00        all      0.10      0.00      0.12      0.02      0.00     99.76
22:30:00        all      0.05      0.00      0.08      0.02      0.00     99.85
22:40:02        all      0.05      0.00      0.07      0.01      0.00     99.87
22:50:00        all      0.04      0.00      0.06      0.00      0.00     99.90
23:00:03        all      0.04      0.01      0.07      0.00      0.00     99.88
23:10:00        all      0.05      0.00      0.05      0.00      0.00     99.90
23:20:03        all      0.04      0.00      0.04      0.00      0.00     99.92
23:30:00        all      0.05      0.00      0.04      0.01      0.00     99.90
23:40:02        all      0.04      0.00      0.03      0.05      0.00     99.88
Average:        all      0.12      0.02      0.24      0.01      0.00     99.61
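Besides the -N shorthand for "N days ago", sar can read a specific daily data file with -f and limit the report to a time window with -s and -e. The sketch below assumes GNU date and the RHEL-style /var/log/sa/saDD file naming; other distributions and newer sysstat configurations may use a different directory or file name, so treat the path as an assumption. sa_file_for is a hypothetical helper name.

```shell
# sa_file_for DAYS_AGO: path of the sysstat daily data file for that day.
# Assumes GNU date and the RHEL-style /var/log/sa/saDD naming.
sa_file_for() {
    printf '/var/log/sa/sa%s\n' "$(date -d "$1 days ago" +%d)"
}

file=$(sa_file_for 2)
if command -v sar >/dev/null 2>&1 && [ -r "$file" ]; then
    # -f reads a specific data file; -s and -e limit the report to a time window.
    sar -u -f "$file" -s 23:00:00 -e 23:59:59
else
    echo "no readable sysstat data at $file"
fi
```

Narrowing the window this way makes it easier to compare the same hour across several days.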

Let’s adjust the command so that the report shows only the intervals in which idle CPU dropped below a specific threshold (95% here).

[root@TechArticles:~]# sar -u -1 | grep -v "Average" | awk 'NR==3||$8<95'

Linux 4.18.0-372.9.1.el8.x86_64 (TechArticles)  03/18/2023   _x86_64_        (8 CPU)

09:03:48        CPU     %user     %nice   %system   %iowait    %steal     %idle
23:10:00        all     12.63      0.00      0.29      0.00      0.00     87.08
23:20:03        all     49.76      0.00      0.90      0.02      0.00     49.32
23:30:01        all     88.77      0.00      0.69      0.00      0.00     10.54
23:40:04        all     89.52      0.00      0.45      0.01      0.00     10.03
23:50:01        all     62.54      0.00      0.45      0.03      0.00     36.98

I reviewed additional historical data but couldn’t find any other intervals where idle CPU dropped below the 95% threshold. Therefore, at this stage, we recommend that the customer check their script, as we didn’t find any instances of high CPU utilization other than today’s.
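Reviewing several days by hand is tedious, so the same filter can be run in a loop. This is a sketch under a few assumptions: sar is installed, sysstat keeps at least 7 days of history, and %idle is the last column of the sar -u output. low_idle is a hypothetical helper name.

```shell
# low_idle THRESHOLD: reads "sar -u" output on stdin and prints the intervals
# whose %idle (the last column) is below THRESHOLD, skipping the Average row.
low_idle() {
    awk -v t="$1" '!/^Average/ && / all / && $NF + 0 < t'
}

# Scan the last 7 days (sar -1 is yesterday, -2 the day before, and so on).
if command -v sar >/dev/null 2>&1; then
    for d in 1 2 3 4 5 6 7; do
        echo "== $d day(s) ago =="
        sar -u -"$d" 2>/dev/null | low_idle 95
    done
fi
```

Matching on the last column rather than a fixed field number keeps the filter working when sar prints times in 12-hour format, which adds an AM/PM column.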



What should we do if we find high CPU utilization in Linux from historical data?

If the historical data does show periods of high utilization, we can troubleshoot further to find the reason behind them.

For this tutorial, I am going to use the recap tool. Make sure recap is already installed and configured to capture resource utilization, since it can only report on periods it has recorded.

recap tool: recap is a system status reporting tool, a script that generates reports of various information about the server.

Installation in RHEL/CentOS

recap is available from the EPEL repository.

# yum install recap
# recap -V
2.1.0

If recap is installed and enabled to capture historical data, you can easily find the reason for high CPU utilization in Linux.

Let’s look at the types of data available from recap. By default, recap keeps its settings in the /etc/recap.conf file and its logs in the /var/log/recap directory. recap can be customised to meet your needs.

[root@TechArticles:/var/log/recap]# ls -ltr
total 100
drwxr-xr-x 2 root root 4096 Sep 21 20:22 snapshots
drwxr-xr-x 2 root root 4096 Sep 21 20:22 backups
-rw-r--r-- 1 root root 7262 Mar 20 00:54 ps_20230320-005439.log
-rw-r--r-- 1 root root 7094 Mar 20 00:54 resources_20230320-005439.log
-rw-r--r-- 1 root root 6034 Mar 20 00:54 netstat_20230320-005439.log
-rw-r--r-- 1 root root 8231 Mar 20 15:51 recap.log
[root@TechArticles:/var/log/recap]# 

As the listing shows, recap captures ps output, resource usage, and netstat output.

To identify the cause, look in the ps and resources logs for the dates and times when CPU usage was high. They contain several reports, including the “Top 10 cpu utilizing processes.”
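Since recap names its log files with a type prefix and a timestamp (for example, ps_20230320-005439.log in the listing above), you can select the files for a particular day with a glob. The sketch below relies on that naming pattern; recap_logs_for is a hypothetical helper, and the grep pattern assumes the section heading matches the report name quoted above, which may differ between recap versions.

```shell
# recap_logs_for TYPE YYYYMMDD [DIR]: list recap log files of one type
# (ps, resources, netstat) for a given day. DIR defaults to /var/log/recap.
recap_logs_for() {
    dir=${3:-/var/log/recap}
    for f in "$dir"/"$1"_"$2"-*.log; do
        if [ -e "$f" ]; then printf '%s\n' "$f"; fi
    done
}

# Print the assumed "Top 10 cpu" section from each matching log.
for f in $(recap_logs_for ps 20230320) $(recap_logs_for resources 20230320); do
    echo "== $f =="
    grep -i -A 11 'top 10 cpu' "$f" || true
done
```

Narrowing the date argument further (for example, 20230320-23 combined with an adjusted glob) would be one way to focus on a single hour.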

Based on the current and historical logs, you will be able to offer suggestions and solutions to resolve high CPU utilization in Linux.

Please note: There are many tools on the market, both free and paid, for capturing historical logs. The recap tool is licensed under the GNU General Public License, version 2.0, and is completely free.



There are a variety of causes for high CPU utilization in Linux. Let’s look at a few more scenarios and how to handle them.

(a) Heavy backup jobs are a frequent cause. Because backups usually run on weekends or outside business hours, that is when you typically see these spikes.

(b) Use top to determine which processes are using the most CPU time, then take a snapshot of those processes. Send the snapshot to the user and ask them to end any unnecessary processes.

(c) If those processes are backups, alert the backup team and ask them to reduce CPU usage by stopping some backups or lowering the backup priority.

(d) On occasion, CPU utilization peaks during business hours and returns to normal within seconds or a few minutes, but the monitoring team has already raised a ticket. In that case, capture a snapshot of the peak, attach it to the ticket, and then close the ticket.

(e) If heavy application processes (for example, business applications) run continuously and spare or lightly loaded CPUs are available, consider moving those processes to those CPUs.

(f) If no additional CPUs are available, request additional CPU capacity from the data centre staff or hardware vendor, with business approval, and move some processes to the new CPUs.
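For points (c) and (e), the standard renice and taskset utilities are one way to lower a process's priority or restrict it to particular CPUs. The sketch below demonstrates renice on a harmless stand-in process; lower_priority is a hypothetical helper name, and the CPU range in the taskset comment is only an example.

```shell
# lower_priority PID [NICE]: raise the nice value of a running process
# (default 10). Nice values range from -20 (highest priority) to 19 (lowest).
lower_priority() {
    renice -n "${2:-10}" -p "$1"
}

# Demonstration on a harmless stand-in process; substitute the real PID.
sleep 5 &
demo_pid=$!
lower_priority "$demo_pid"
kill "$demo_pid"

# For (e), taskset can restrict a process to particular CPUs, for example:
#   taskset -cp 4-7 <PID>    # pin an existing process to CPUs 4 through 7
```

Raising a nice value does not require root for your own processes, but lowering it again (or changing another user's process) generally does.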

In the real world, a wide variety of problems may arise. I hope this article helps you troubleshoot issues with high CPU utilization in Linux.

Was this article useful to you? If you spot outdated information, a problem, or a typo, post your thoughts or recommendations in the comments section to help improve this article.
