crashinfo is a tool for gathering information directly on customer's host and to do some simple automated 1st-pass analysis. Especially is it useful for hangs or performance problems.
By default, when crashinfo is invoked without any options, it prints some vital statistics and runs a number of tests. As a result of running these tests, it can print WARNING messages. These messages are usually (but not always) followed by more details. For example:
+++WARNING+++ fragmentation: 32Kb == Zone:Normal == AREA SIZE FREE_AREA_STRUCT BLOCKS PAGES ---- ---- ------------------ ------ ----- 0 4k 0x00000000c03ae0d8 8071 8071 1 8k 0x00000000c03ae0e4 1584 3168 2 16k 0x00000000c03ae0f0 157 628 3 32k 0x00000000c03ae0fc 0 0 4 64k 0x00000000c03ae108 0 0 5 128k 0x00000000c03ae114 0 0 6 256k 0x00000000c03ae120 0 0 7 512k 0x00000000c03ae12c 0 0 8 1024k 0x00000000c03ae138 0 0 9 2048k 0x00000000c03ae144 0 0 10 4096k 0x00000000c03ae150 0 0 +++WARNING+++ there are outstanding blk_dev requests <struct request 0xf7fb9900>, rq_status=0x1, rq_dev=0x6802 major=104 minor=2, started 23.4 s ago
You can search for +++WARNING+++ pattern to be sure you did not miss any warning.
Some tests rely on builtin crash commands that can potentially run for a long time. For example, to check for fragmentation we do 'kmem -f'. On one of vmcores it took more than 20 minutes to complete, even though the host was fast (DL145). As in many cases we would like to obtain at least partial results fast, there is a timeout mechanism for such commands. By default it is 2 min/command. If we timeout, a warning message is printed:
+++WARNING+++ <foreach bt> failed to complete within the timeout period of 120s
In this case you can increase the timeout by using an option '--timeout=seconds'. If you are in a hurry, you can use '--fast' option. This sets timeout to 15s and disables some potentially slow tests.
The tests check many conditions that are important when system seems to be hung: load averages, memory fragmentation, outstanding block requests, spinlocks hold, activity on networking cards and so on. Some tests are heuristic - see advanced options for more details. New tests are added all the time, based on practical experience.
You can decrease verbosity by using '-q' (quiet) option, in this case only warnings are printed. Using '-v' (verbose) makes crashinfo print additonal information.
In additon to default mode, crashinfo supports a number of options that facilitate the troubleshooting. You can see them all by running
crash64> xportshow -h Usage: xportshow.py [options] Options: -h, --help show this help message and exit -a print all sockets -v verbose output -r Print routing table. Adding -v prints all routing tables and policies --program=PROGRAM print sockets for cmdname --pid=PID print sockets for PID --netfilter Print Netfilter Hooks --softnet Print Softnet Queues --summary Print A Summary -s, --statistics Print Statistics -i Print Interface Info --interface=IF1 Limit output to the specified interface only --decode=DECODE Decode iph/th/uh --port=PORT Limit output to the specified port (src or dst) -l, --listening Print LISTEN sockets only -t, --tcp Print TCP Info -u, --udp Print UDP Info -w, --raw Print RAW Info -x, --unix Print UNIX Info --sysctl Print sysctl info for net. --devpack Print dev_pack info --arp Print ARP & Neighbouring info --rtcache Print the routing cache --skbuffhead=SKBUFFHEAD Print sk_buff_head --version Print program version and exit --everything Run all functions available for regression testing Experimental Options: Caution: this is work in progress, not fully supported for all kernels yet. --sport=SPORT Limit output to the specified sport --dport=DPORT Limit output to the specified dport --ipsec Print IPSEC stuff --profile Run with profiler ** Execution took 2.18s (real) 1.93s (CPU)
Quite often we need to better understand what kind of load do we have: what are the name of commands, how many threads are running and what they are doing. A traditional approach is to use 'foreach bt' and then browse it or write AWK-scripts to reformat the output. Two options let us somewhat facilitate the process:
--stacksummary Print stacks (bt) categorized summary. --findstacks=FINDSTACKS Print stacks (bt) containing functions that match the provided pattern
In both cases the stacks are printed in a compact way, skipping some details.
'--stacksummary' prints stacks that look similar grouped together. When deciding whether two stacks look similar, we do not take into account the offsets, only the names and order of functions. The stacks are reverse-sorted by how many times they occur, so that those stacks that occur many times are output first. For example:
------- 74 stacks like that: ---------- #0 schedule #1 schedule_timeout #2 futex_wait #3 do_futex #4 sys_futex #5 system_call youngest=23s(pid=4482), oldest=82577s(pid=3539) ........................ hpasmd 3 times hpsmhd 25 times httpd 10 times java 34 times twagent 2 times ------- 41 stacks like that: ---------- #0 schedule #1 sys_rt_sigsuspend #2 system_call youngest=23s(pid=17937), oldest=82550s(pid=4267) ........................ java 24 times opmn 6 times rwserver 9 times vcagentd 2 times
Adding '-v' prints lists of PIDs for each stack
'--findstacks' searches for those stacks that contain a given pattern (patterns can be regular expressions). For example:
crash32> crashinfo --find failed PID=36 CMD=qla2300_dpc0 #0 context_switch+0xa4 #1 schedule+0x2ef #2 __down_interruptible+0x85 #3 __down_failed_interruptible+0x2 #4 .text.lock.KBUILD_BASENAME+0xa2f09 , (via qla2x00_do_dpc) #5 kernel_thread_helper+0x3 PID=37 CMD=qla2300_dpc1 #0 context_switch+0xa4 #1 schedule+0x2ef #2 __down_interruptible+0x85 #3 __down_failed_interruptible+0x2 #4 .text.lock.KBUILD_BASENAME+0xa2f09 , (via qla2x00_do_dpc) #5 kernel_thread_helper+0x3
As 'foreach bt' command is very expensive, we cache the results for non-live kernels, so that if you want to do '--find' with several different patterns, the execution will be slow only the first time.
These are described by 'struct request' which is quite different on different kernels. In addition, the lists of these structures are constructed in very different ways. As a result, instead of trying to traverse these lists, we use a different approach:
This approach seems to work very nice on all kernels from 2.4.21 to 2.6.24. The output is different for different kernels, for example, on a live 2.6.24 kernel:
crash64> crashinfo --blk ---- Outstanding blk_dev Requests ----- <struct request 0xffff8100787b5000>, disk_name=sdb major=8, in_flight=17, cmd_flags=0x10445, ref_count=1, started 0.5 s ago <struct request 0xffff8100787b5120>, disk_name=sdb major=8, in_flight=17, cmd_flags=0x10445, ref_count=1, started 0.5 s ago <struct request 0xffff810057912000>, disk_name=sdb major=8, in_flight=17, cmd_flags=0x10445, ref_count=1, started 0.1 s ago
For 2.4.21:
---- Outstanding blk_dev Requests ----- <struct request 0xc8168f00>, rq_status=0x1, rq_dev=0x6802 major=104 minor=2, started 28386.3 s ago
We can print all globally-declared structures that are in use using the following approach:
For example:
crash64> crashinfo --lws -- semaphores with sleepers > 0 -- -- rw_semaphores with count > 0 -- -- Non-empty wait_queue_head -- log_wait PID: 6013 TASK: ffff81007d2ee7c0 CPU: 0 COMMAND: "dd" acpi_bus_event_queue PID: 5832 TASK: ffff81007d2367c0 CPU: 1 COMMAND: "acpid" vt_activate_queue PID: 6325 TASK: ffff81007b456f80 CPU: 1 COMMAND: "console-kit-dae" PID: 6152 TASK: ffff8100707e1740 CPU: 1 COMMAND: "console-kit-dae" PID: 6151 TASK: ffff8100707e0f80 CPU: 1 COMMAND: "console-kit-dae" PID: 6150 TASK: ffff8100707e07c0 CPU: 1 COMMAND: "console-kit-dae" serio_wait PID: 163 TASK: ffff81007be59740 CPU: 1 COMMAND: "kseriod"
Unfortunately, this approach does not produce all structures of interest as not all of them are declared as global variables (e.g. NFS maintains its own queues/semaphores in server and client structures). More subsystem-specific structures will be added in next releases
--filelock' option produces a list of file locks being hold, e.g.
crash64> crashinfo --filelock Active locks <struct file_lock 0xffff810037896678> FL_POSIX <struct file 0xffff8100529c9a80> <struct dentry 0xffff810054f2d1a0> pid=7246 inode=0xffff810051e83998 ------------------------------ <struct file_lock 0xffff810037896958> FL_POSIX F_RDLCK <struct file 0xffff810057b82840> <struct dentry 0xffff810054f275b0> pid=7246 inode=0xffff810051cc48e8 ------------------------------ Blocked list ...
'--decodesyscalls' lets you decode the arguments passed to system calls. As arguments originate from userspace, it is not always possible to decode them (as pages might be not present in vmcore). You can specify either PID or system call name. For example:
crash64> crashinfo --decode 1 PID=1 CMD=init #0 schedule+0x245 #1 schedule_timeout+0x95 #2 do_select+0x468 #3 core_sys_select+0x209 #4 __slab_free+0x1fe #5 release_task+0x284 #6 do_wait+0xa25 #7 sys_select+0x44 #8 system_call+0x7e, 477 bytes of data ....... Decoding Syscall Args ....... sys_select (7, (fd_set *) 0x7fff249af960, (fd_set *) 0x7fff249af8e0, (fd_set *) 0x7fff249af860, (struct timeval *) 0x0) nfds=7 readfds [3, 5, 6] writefds [] exceptfds [] No timeout crash32> crashinfo --decode nanosleep PID=1839 CMD=klogd #0 schedule+0x340 #1 schedule_timeout+0x60 #2 sys_nanosleep+0xce #3 system_call+0x68, 241 bytes of data ....... Decoding Syscall Args ....... sys_nanosleep ((struct timespec *) 0xbfff8e60, (struct timespec *) 0xbfff8e60) req 1sec, 0nsec rem 1sec, 0nsec PID=6405 CMD=java #0 schedule+0x340 #1 schedule_timeout+0x60 #2 sys_nanosleep+0xce #3 system_call+0x68, 241 bytes of data ....... Decoding Syscall Args ....... sys_nanosleep ((struct timespec *) 0x4cb76b60, (struct timespec *) 0x0) req 0sec, 49983000nsec rem NULL
Most of sysctl settings are mapped to variables, we can print them using
crash32> crashinfo --sysctl abi.defhandler_coff 117440515 abi.defhandler_elf 0 ... vm.pagecache [2, 10, 20] vm.pagetable_cache [25, 50] vm.skip_mapped_pages 0 vm.stack_defer_threshold 2048
Please note that some sysctl settings are really programmatic, so that the get/set value is computed rather than stored in a variable. We do not print such values.