Thursday 12 March 2020

Troubleshooting High CPU Utilization for Routers & Switches

Let's discuss how to troubleshoot high CPU utilization on routers and switches.


To troubleshoot the issue, we must first find which processes running on the device are consuming the CPU.

TROUBLESHOOTING HIGH CPU UTILIZATION 


A very good command to use is "show processes cpu sorted", which shows how busy the CPU has been over the last 5 seconds, 1 minute and 5 minutes.

The command also shows the CPU utilization each process has consumed over these periods of time.

R1#show processes cpu sorted
CPU utilization for five seconds: 1%/0%; one minute: 1%; five minutes: 1%
 PID Runtime(ms)   Invoked      uSecs   5Sec   1Min   5Min TTY Process
 103        6000      3322       1806  0.73%  0.62%  0.62%   0 encrypt proc
  38        1692       264       6409  0.16%  0.14%  0.16%   0 Compute load avg
   2           4       263         15  0.00%  0.00%  0.00%   0 Load Meter
   3          76        45       1688  0.00%  0.00%  0.00%   0 CEF Scanner
   4           0         1          0  0.00%  0.00%  0.00%   0 EDDRI_MAIN
 < output omitted>
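
If you collect this output from many devices, a small script can pick out the busy processes for you. The following is a minimal Python sketch, not an official tool: the function names parse_cpu_output and busiest and the 0.5% threshold are made up for illustration. It assumes the output layout shown above, where the first figure in "1%/0%" is the total CPU utilization for five seconds and the second figure is the part spent at interrupt level.

```python
import re

# Matches the summary line, e.g.
# "CPU utilization for five seconds: 1%/0%; one minute: 1%; five minutes: 1%"
HEADER_RE = re.compile(
    r"CPU utilization for five seconds: (\d+)%/(\d+)%; "
    r"one minute: (\d+)%; five minutes: (\d+)%"
)

# Matches one process row: PID, Runtime(ms), Invoked, uSecs, 5Sec, 1Min, 5Min, TTY, Process
ROW_RE = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"([\d.]+)%\s+([\d.]+)%\s+([\d.]+)%\s+(\d+)\s+(.+?)\s*$"
)


def parse_cpu_output(text):
    """Return (summary dict, list of process dicts) from the command output."""
    summary = None
    processes = []
    for line in text.splitlines():
        m = HEADER_RE.search(line)
        if m:
            total, interrupt, one_min, five_min = map(int, m.groups())
            summary = {
                "five_sec_total": total,
                "five_sec_interrupt": interrupt,
                # Process-level load is the total minus the interrupt-level part.
                "five_sec_process": total - interrupt,
                "one_min": one_min,
                "five_min": five_min,
            }
            continue
        m = ROW_RE.match(line)
        if m:
            processes.append({
                "pid": int(m.group(1)),
                "five_sec": float(m.group(5)),
                "one_min": float(m.group(6)),
                "five_min": float(m.group(7)),
                "name": m.group(9),
            })
    return summary, processes


def busiest(processes, threshold=0.5):
    """Names of processes at or above the (illustrative) 5-second threshold."""
    return [p["name"] for p in processes if p["five_sec"] >= threshold]


SAMPLE = """\
CPU utilization for five seconds: 1%/0%; one minute: 1%; five minutes: 1%
 PID Runtime(ms)   Invoked      uSecs   5Sec   1Min   5Min TTY Process
 103        6000      3322       1806  0.73%  0.62%  0.62%   0 encrypt proc
  38        1692       264       6409  0.16%  0.14%  0.16%   0 Compute load avg
   2           4       263         15  0.00%  0.00%  0.00%   0 Load Meter
"""

summary, processes = parse_cpu_output(SAMPLE)
print(summary)
print(busiest(processes))
```

A large gap between the total and the interrupt figure points at a process, while a high interrupt figure usually points at traffic being switched by the CPU.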

Another useful command is "show processes cpu history".

This command shows a graphical representation of CPU utilization over the last 60 seconds, 60 minutes and 72 hours, which helps us analyse when the CPU utilization went high.

R1#show processes cpu history

R1   10:00:03 AM Tuesday Jul 2 2019 UTC



    2     1111111111111111111111111     111111111111111     1111
100
 90
 80
 70
 60
 50
 40
 30
 20
 10
   0....5....1....1....2....2....3....3....4....4....5....5....6
             0    5    0    5    0    5    0    5    0    5    0
               CPU% per second (last 60 seconds)


                                    9
    23211121212222122212112114511 341
100
 90                                 *
 80                                 *
 70                                 *
 60                                 *
 50                                 *
 40                                 *
 30                                 *
 20                                 *
 10                           *     #
   0....5....1....1....2....2....3....3....4....4....5....5....6
             0    5    0    5    0    5    0    5    0    5    0
               CPU% per minute (last 60 minutes)
              * = maximum CPU%   # = average CPU%




100
 90
 80
 70
 60
 50
 40
 30
 20
 10
   0....5....1....1....2....2....3....3....4....4....5....5....6....6....7..
             0    5    0    5    0    5    0    5    0    5    0    5    0
                   CPU% per hour (last 72 hours)
                  * = maximum CPU%   # = average CPU%
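
The rows of digits printed above each graph are read vertically: the digits stacked in one column form the value for that time slot (a 9 sitting above a 1 means 91%). The following is a minimal Python sketch of that reading rule. The function name decode_columns and the sample rows are made up for illustration, and it assumes the column alignment of the original output has been preserved.

```python
def decode_columns(digit_rows):
    """Combine vertically stacked digit rows into one integer per column.

    digit_rows is the list of digit lines printed above a graph, top line
    first. Columns that contain no digit at all are skipped.
    """
    width = max(len(row) for row in digit_rows)
    # Pad every row to the same width so columns can be indexed safely.
    padded = [row.ljust(width) for row in digit_rows]
    values = []
    for col in range(width):
        digits = "".join(row[col] for row in padded if row[col].isdigit())
        if digits:
            values.append(int(digits))
    return values


# Hypothetical digit rows in the same layout as the per-minute graph above.
rows = [
    "       9",
    "    2341",
]

values = decode_columns(rows)
print(values)
print("peak:", max(values))
```

The values come out in left-to-right order, which follows the time axis printed under the graph.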


There are multiple reasons why CPU utilization can go high:

1) Debugging - Debugging left on can cause CPU utilization to go high.

undebug all

is the command to stop all debugging on routers and switches.

2) ARP Input Process - The router may be originating an excessive number of ARP requests.

3) Interface flapping - A continuously flapping interface causes continuous changes in the routing table, which can cause excessive CPU utilization.

4) TCP Process - A large number of TCP sessions established on the router can also be a reason for high CPU utilization.

5) BGP Scanner - High CPU due to the BGP Scanner process can be expected for short durations on a router carrying a large Internet routing table. Once a minute, BGP Scanner walks the BGP table and performs important maintenance tasks. These tasks include checking the next hop referenced in the router's BGP table and verifying that the next-hop devices can be reached. Thus, a large BGP table takes a correspondingly large amount of time to be walked and validated.

Because the BGP Scanner process runs through the entire BGP table, the duration of the high CPU condition varies with the number of neighbors and the number of routes learned per neighbor. Use the "show ip bgp summary" and "show ip route summary" commands to capture this information.

6) Exec & Virtual Exec Processes - The Exec process is responsible for communication on the tty lines (console, auxiliary, asynchronous) of the router.

The Virtual Exec process is responsible for the vty lines (telnet sessions).

R1#show process | i CPU|Exec
CPU utilization for five seconds: 1%/0%; one minute: 1%; five minutes: 1%
  31 M*         0         2272        651    3490 9728/12000  0 Exec
R1#


The CPU utilization for the Exec process increases if a lot of data is transferred through these sessions.

For the console (Exec), the router uses one interrupt per character.

The console interrupt can be seen in the show stacks command output:

R1#show stacks
Minimum process stacks:
 Free/Size   Name
 5556/6000   Clock Update Proc
 5636/6000   Inspect Init Msg
 2612/3000   allegro libretto init
 3348/12000  Init
59416/60000  script background loader
 5448/6000   RADIUS INITCONFIG
 2536/3000   Rom Random Update Process

Interrupt level stacks:
Level    Called Unused/Size  Name
  3           0   9000/9000  PA Management Int Handler
  4       57475   6912/9000  Network interfaces
  5           0   9000/9000  Timebase Reference Interrupt
  6        4355   8896/9000  16552 Con/Aux Interrupt ===>check for unused 
  7     1925809   8896/9000  MPC860T TIMER INTERRUPT
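
A single capture only shows call counts as they stand at that moment, so it helps to take the output twice, some seconds apart, and compare the "Called" column. The following is a minimal Python sketch of that comparison. The function names parse_interrupt_stacks and call_deltas are made up for illustration, and the second capture is invented sample data, not real router output.

```python
import re

# Matches one row of the interrupt table: Level, Called, Unused/Size, Name
INT_ROW_RE = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\d+)/(\d+)\s+(.+?)\s*$")


def parse_interrupt_stacks(text):
    """Return {interrupt name: call count} from the 'Interrupt level stacks' table."""
    calls = {}
    in_table = False
    for line in text.splitlines():
        if line.startswith("Interrupt level stacks"):
            in_table = True
            continue
        if in_table:
            m = INT_ROW_RE.match(line)
            if m:
                calls[m.group(5)] = int(m.group(2))
    return calls


def call_deltas(before, after):
    """How many times each interrupt was called between the two captures."""
    return {name: after[name] - before.get(name, 0) for name in after}


FIRST = """\
Interrupt level stacks:
Level    Called Unused/Size  Name
  4       57475   6912/9000  Network interfaces
  6        4355   8896/9000  16552 Con/Aux Interrupt
"""

# Hypothetical second capture, taken a little later (made-up numbers).
SECOND = """\
Interrupt level stacks:
Level    Called Unused/Size  Name
  4       57490   6912/9000  Network interfaces
  6        9355   8896/9000  16552 Con/Aux Interrupt
"""

deltas = call_deltas(parse_interrupt_stacks(FIRST), parse_interrupt_stacks(SECOND))
for name, delta in deltas.items():
    print(f"{name}: +{delta}")
```

A Con/Aux counter that climbs quickly between captures suggests that many characters are passing through the console.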


  • Disable console logging on the router with "no logging console".
  • Verify whether a long output is being printed on the console (for example, show tech-support or show memory).
  • Check whether the exec command is configured for asynchronous and auxiliary lines. If a line has only outgoing traffic, the Exec process should be disabled for this line, because if the device (for example, a modem) attached to this line sends unsolicited data, an Exec process starts on this line. If the router is used as a terminal server (for reverse telnet to other device consoles), it is recommended that you configure no exec on the lines that are connected to the consoles of the other devices. Data that comes back from the console might otherwise start an Exec process, which uses CPU resources.



For the vty lines (Virtual Exec), the telnet session has to build a TCP packet and send the character(s) to the telnet client.

If a huge amount of data is transferred through the vty ports, CPU utilization may go high.

The command to verify the amount of data transferred is

"show tcp vty 0"



Commands used to collect more information:

show processes cpu
show interfaces
show interfaces switching
show interfaces stat
show ip nat translations
show align
show version
show log
