Priorities, Nice Numbers and How to Use Them

Managing Tasks

Managing the running tasks in a Linux system can be an important job for the system administrator. A typical problem is a task that uses up CPU time needed by other processes. I have seen this occur with various programs that somehow get stuck in some sort of internal loop and just suck up all the CPU cycles. In other cases, an application may be using its CPU cycles productively, but other tasks become sluggish and the system becomes unresponsive to end users. In either case, it may be necessary to adjust the priority of one or more running tasks.

Another option, of course, is to simply kill the offending process, but sometimes that is not possible. For example, a task that runs for many hours or even a few days may be run once a year or once a quarter to perform some critical internal or regulatory function. In this case the process must run, but it must also play nice with the other processes running on the system.

Note: A task is a single process or program that consumes CPU cycles.

Priorities

Linux schedules each task for CPU time using an algorithm that combines a few basic factors into a priority. For each process, these factors include the following:

  • Length of time waiting for CPU time
  • Amount of CPU time recently consumed
  • Nice number

The algorithm, which is a part of the kernel scheduler, determines the priority of each process running in the system. Programs or processes with higher priorities are more likely to be allocated CPU time. Priorities are very dynamic and can change rapidly based on the three factors listed above.

Linux process priorities run from 0 through 39 with 39 being the lowest priority and 0 the highest. This seems to be reversed from common logic, but you should consider that higher numbers mean a “nicer” priority.

There is also an RT, or RealTime, priority which is used by some processes that need to get CPU time immediately when some event occurs. This might be a process that handles hardware interrupts for the kernel. To ensure that data is not lost as it arrives from a disk drive or network interface, for example, a high-priority process is used to empty the data buffer when it becomes full and store the data in some specific memory location where it can be accessed as needed. Meanwhile, the empty input buffer can be used to store more incoming data from the device.

Nice Numbers

Nice numbers are the mechanism used by administrators to affect the priority of a process. It is not possible to change the priority of a process directly, but changing the nice number can modify the results of the kernel scheduler’s priority setting algorithm. Nice numbers run from -20 to +19, where higher numbers are nicer.

The default nice number is 0 and the default priority is 20. Setting the nice number higher than zero increases the priority number somewhat, thus making the process nicer and therefore less greedy for CPU cycles. Setting the nice number to a negative number results in a lower priority number, making the process less nice.
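The relationship between the two number ranges can be sketched in a few lines of Python. This is only an illustration: the function name top_priority is made up for this example, and the formula 20 + nice is an assumption that matches the figures in this article (nice 0 shows as priority 20, nice 19 as priority 39) for ordinary, non-realtime tasks.

```python
NICE_MIN, NICE_MAX = -20, 19  # nice number range described in the text


def top_priority(nice: int) -> int:
    """Return the PR value that top is assumed to show for a given nice number.

    Assumption: for normal (non-realtime) tasks, PR = 20 + NI, which maps
    the nice range -20..19 onto the priority range 0..39.
    """
    if not NICE_MIN <= nice <= NICE_MAX:
        raise ValueError(f"nice number must be between {NICE_MIN} and {NICE_MAX}")
    return 20 + nice


# Print the mapping for the least nice, default, and nicest values.
for nice in (NICE_MIN, 0, NICE_MAX):
    print(f"nice {nice:+3d} -> priority {top_priority(nice)}")
```

Lower results mean a higher priority, so a nice number of -20 gives the most favorable priority of 0.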

Finding the Process Taking all the CPU Cycles

The top program can be used to help you locate one or more processes that are taking so many CPU cycles that other processes suffer because they are starved for CPU time.

This sample output from top shows that the process with PID 6386 is taking over 98% of the CPU time; fortunately this server has multiple CPU cores so it is not really a problem, but it does make for a good example. The process in question is a BASH shell and I simply set up a BASH program to perform simple math operations in an infinite loop for this example.
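If you want to reproduce this kind of load on a test system, a rough Python analogue of such a loop might look like the sketch below. It is not the BASH program used for this article; the function name burn_cpu and the time limit are assumptions added so that the loop stops by itself instead of running forever.

```python
import time


def burn_cpu(seconds: float) -> int:
    """Keep one CPU core busy with simple integer math for about `seconds`.

    Returns the number of loop iterations completed. Unlike a true
    infinite loop, this one ends when the deadline passes.
    """
    deadline = time.monotonic() + seconds
    iterations = 0
    total = 0
    while time.monotonic() < deadline:
        total = (total + iterations * 3) % 1_000_003  # arbitrary busy work
        iterations += 1
    return iterations
```

Calling burn_cpu(60) from a Python prompt should put one process near the top of the top display for about a minute.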

The default sort field for the top program is %CPU, which is the percent of CPU time being used by the processes currently running on the system. In this case, the culprit is fairly obvious as it is the first line of top that lists the individual processes and their CPU usage; task 6386 is taking up 98.8% of CPU time. Note that this represents CPU usage of a single CPU. It is possible for a multi-threaded process to take more than 100% of CPU time in a multi-CPU computer.

top - 10:50:21 up 18:55,  5 users,  load average: 1.11, 0.97, 0.48
Tasks: 268 total,   2 running, 266 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  1.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  4.9%us,  1.0%sy,  0.0%ni, 94.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  1.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  : 86.3%us, 13.7%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   6121932k total,  3635788k used,  2486144k free,   384320k buffers
Swap:  8191996k total,        0k used,  8191996k free,  2582328k cached

 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                       
 6386 root      20   0  6456 1696 1380 R 98.8  0.0   0:15.57 bash                                                                                          
 6332 dboth     20   0  142m  26m  15m S 19.8  0.4   0:08.88 konsole                                                                                       
 6385 root      20   0  6912 1496  776 S 11.9  0.0   0:02.67 screen                                                                                        
   16 root      20   0     0    0    0 S  1.0  0.0   0:00.44 events/1                                                                                      
 2417 root      20   0 78796  33m 6080 S  1.0  0.6  10:28.25 Xorg                                                                                          
10109 root      20   0  2776 1168  824 R  1.0  0.0   0:00.08 top                                                                                           
    1 root      20   0  2932 1296 1084 S  0.0  0.0   0:01.08 init

The priority column (PR) for PID 6386 shows a priority of 20 and the nice number column (NI) shows 0. These are middle-of-the-road, neutral numbers for both. The bold highlights in these examples of top output are mine simply to draw your attention to the relevant information.
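Picking out the busiest process can also be done by a script. The sketch below parses process lines in the column layout shown above and returns the row with the highest %CPU. The names TopRow, parse_top_rows and busiest are made up for this example, and the parser assumes the twelve-column layout of this particular top output; other versions or configurations of top may arrange the columns differently.

```python
from typing import List, NamedTuple


class TopRow(NamedTuple):
    pid: int
    user: str
    priority: str       # kept as text because realtime tasks may not show a number
    nice: int
    cpu_percent: float
    command: str


def parse_top_rows(text: str) -> List[TopRow]:
    """Parse the per-process lines of top output in the layout shown above."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Process lines have at least 12 fields and start with a numeric PID;
        # summary lines and the column header are skipped.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        rows.append(TopRow(
            pid=int(fields[0]),
            user=fields[1],
            priority=fields[2],
            nice=int(fields[3]),
            cpu_percent=float(fields[8]),
            command=" ".join(fields[11:]),
        ))
    return rows


def busiest(rows: List[TopRow]) -> TopRow:
    """Return the row that is using the most CPU time."""
    return max(rows, key=lambda row: row.cpu_percent)


# A few lines copied from the sample output above.
SAMPLE = """\
 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 6386 root      20   0  6456 1696 1380 R 98.8  0.0   0:15.57 bash
 6332 dboth     20   0  142m  26m  15m S 19.8  0.4   0:08.88 konsole
 6385 root      20   0  6912 1496  776 S 11.9  0.0   0:02.67 screen
"""

hog = busiest(parse_top_rows(SAMPLE))
print(hog.pid, hog.cpu_percent)  # prints: 6386 98.8
```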

For more information on the top program, see its man page.

Using top to Renice a Process

Running processes can be reniced to enable you as the administrator to make them play nicer with other programs. Let’s renice the offending process, PID 6386.

Renice a process using top by pressing the “r” key. The top program then asks you which PID (Process ID) you want to renice. In this example the PID is 6386 so type that in and press Enter as shown in the example below.

top - 11:22:17 up 19:27,  5 users,  load average: 0.83, 0.45, 0.60
Tasks: 263 total,   2 running, 261 sleeping,   0 stopped,   0 zombie
Cpu0  :  3.9%us,  0.0%sy,  0.0%ni, 96.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  : 88.0%us, 12.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  : 54.4%us,  9.7%sy,  0.0%ni, 35.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us,  1.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   6121932k total,  3648560k used,  2473372k free,   385836k buffers
Swap:  8191996k total,        0k used,  8191996k free,  2582568k cached
PID to renice: 6386
 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                       
 6386 root      20   0  6456 1696 1380 R 100.0  0.0  22:54.62 bash                                                                                         
 6332 dboth     20   0  142m  26m  15m S 26.7  0.4   5:47.26 konsole                                                                                       
 6385 root      20   0  6912 1496  776 S 15.8  0.0   3:27.68 screen                                                                                        
 3220 dboth     20   0 35844 9432 7504 S  2.0  0.2  14:12.81 gkrellm  

Now top asks for the nice number to which you want to renice the process. In the example below, I have entered 20. After typing the desired nice number, press the Enter key again. Because the highest valid nice number is 19, a request for 20 is applied as 19.

top - 11:22:17 up 19:27,  5 users,  load average: 0.83, 0.45, 0.60
Tasks: 263 total,   2 running, 261 sleeping,   0 stopped,   0 zombie
Cpu0  :  3.9%us,  0.0%sy,  0.0%ni, 96.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  : 88.0%us, 12.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  : 54.4%us,  9.7%sy,  0.0%ni, 35.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us,  1.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   6121932k total,  3648560k used,  2473372k free,   385836k buffers
Swap:  8191996k total,        0k used,  8191996k free,  2582568k cached
Renice PID 6386 to value: 20
 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                       
 6386 root      20   0  6456 1696 1380 R 100.0  0.0  22:54.62 bash                                                                                         
 6332 dboth     20   0  142m  26m  15m S 26.7  0.4   5:47.26 konsole                                                                                       
 6385 root      20   0  6912 1496  776 S 15.8  0.0   3:27.68 screen                                                                                        
 3220 dboth     20   0 35844 9432 7504 S  2.0  0.2  14:12.81 gkrellm

After changing the nice number, the priority for PID 6386 is now shown to be 39 and its nice number 19. Notice that in this example the actual amount of CPU time does not change much; that is because there are still plenty of CPU resources available in this quad-core system, so there is no competition for the CPU. If there were significant competition for the CPU, then the amount of CPU time allocated to PID 6386 would be reduced because other processes would have a higher priority, i.e., a lower priority number.

You can see the altered nice and priority numbers in the sample output from top below:
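To get a feel for how much CPU time a niced process might give up when there is competition, here is a rough model in Python. It is a sketch under stated assumptions, not a description of every kernel: it assumes a scheduler that weights tasks the way the Linux CFS scheduler does, with a weight of 1024 at nice 0 and a factor of about 1.25 per nice step, and it assumes all of the tasks are CPU-bound and competing for a single core. The function names are made up for this example.

```python
from typing import List, Sequence


def approx_weight(nice: int) -> float:
    """Approximate scheduler weight for a nice number (assumed CFS-style).

    Assumption: weight is 1024 at nice 0 and changes by a factor of
    roughly 1.25 for each nice step.
    """
    return 1024 / (1.25 ** nice)


def cpu_shares(nice_values: Sequence[int]) -> List[float]:
    """Estimate the fraction of one CPU each competing CPU-bound task gets."""
    weights = [approx_weight(nice) for nice in nice_values]
    total = sum(weights)
    return [weight / total for weight in weights]


# One task at the default nice number competing with one task at nice 19.
for nice, share in zip((0, 19), cpu_shares([0, 19])):
    print(f"nice {nice:2d}: about {share:.1%} of the CPU")
```

In this model, two tasks with the same nice number split the CPU evenly, while a task at nice 19 gets only a small fraction when it competes with a task at nice 0.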

top - 11:29:19 up 19:34,  5 users,  load average: 1.13, 1.06, 0.85
Tasks: 263 total,   3 running, 260 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  9.8%sy, 90.2%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  : 28.2%us,  9.7%sy,  0.0%ni, 62.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  : 28.5%us,  0.7%sy,  0.0%ni, 70.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  4.9%us,  0.0%sy,  0.0%ni, 95.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   6121932k total,  3652752k used,  2469180k free,   386204k buffers
Swap:  8191996k total,        0k used,  8191996k free,  2582872k cached

 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                       
 6386 root      39  19  6456 1696 1380 R 99.7  0.0  29:54.74 bash                                                                                          
 6332 dboth     20   0  142m  26m  15m S 23.7  0.4   7:29.09 konsole                                                                                       
 6385 root      20   0  6912 1496  776 S 14.8  0.0   4:28.11 screen                                                                                        
 2417 root      20   0 78676  33m 6080 S  2.0  0.6  11:07.19 Xorg                                                                                          
 3220 dboth     20   0 36108 9668 7504 S  2.0  0.2  14:18.37 gkrellm                                                                                       
10964 root      20   0  2776 1164  824 R  2.0  0.0   0:00.40 top                                                                                           
   16 root      20   0     0    0    0 S  1.0  0.0   0:06.81 events/1
   18 root      20   0     0    0    0 S  1.0  0.0   0:04.79 events/3

The renice Command

The renice command can also be used to change the nice number of a process. But since you usually need to use the top command to determine which process is hogging all the CPU cycles, you might as well use top to make the changes to the nice number as well.
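The same change can also be made from a script. The following Python sketch uses os.setpriority and os.getpriority from the standard library, which wrap the system calls of the same names. The helper names clamp_nice and renice are made up for this example. Keep in mind that an ordinary user can normally only make their own processes nicer; making a process less nice, or changing another user's process, usually requires root.

```python
import os

NICE_MIN, NICE_MAX = -20, 19  # valid nice number range


def clamp_nice(value: int) -> int:
    """Limit a requested nice number to the valid range, so 20 becomes 19."""
    return max(NICE_MIN, min(NICE_MAX, value))


def renice(pid: int, nice: int) -> int:
    """Set the nice number of process `pid` and return the value now in effect.

    Raises PermissionError if the caller is not allowed to make the change.
    """
    os.setpriority(os.PRIO_PROCESS, pid, clamp_nice(nice))
    return os.getpriority(os.PRIO_PROCESS, pid)
```

For example, renice(6386, 20) run as root would ask for the same change that was made with top earlier in this article, and the value returned would be 19.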

Note that it is rare to have to change the nice number of a process these days because most systems have a huge amount of CPU power available. It can still happen, however, so it is really good to know how to determine which process is causing the problem and how to fix it.