Description:
Message Signaled Interrupts (MSI), as described in the PCI Local Bus
Specification Revision 2.3 or later, are an optional feature for PCI devices
and a required feature for PCI Express devices.
The Linux kernel assigns MSI devices large interrupt numbers (e.g. 8409), which makes the intr line in /proc/stat extremely long.
This crashes the SIGAR library with a segmentation fault:
(gdb) run
Starting program: /opt/mysql/enterprise/agent/lib/mysql-service-agent/mysql-service-agent -f /opt/mysql/enterprise/agent/etc/mysql-service-agent.ini -D
warning: Lowest section in system-supplied DSO at 0xffffe000 is .hash at ffffe0b4
[Thread debugging using libthread_db enabled]
[New Thread 4157900480 (LWP 23383)]
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 4157900480 (LWP 23383)]
0x080e33d6 in sigar_skip_token (p=0x0) at agent/src/sigar/src/sigar_util.c:59
59 agent/src/sigar/src/sigar_util.c: No such file or directory.
in agent/src/sigar/src/sigar_util.c
(gdb) bt full
#0 0x080e33d6 in sigar_skip_token (p=0x0)
at agent/src/sigar/src/sigar_util.c:59
No locals.
#1 0x080e3561 in sigar_os_open (sigar=0x81fb2e4)
at agent/src/sigar/src/os/linux/linux_sigar.c:143
buffer = "cpu 15576080 1863687 5219297 312565295 22964945 55369 328215 0\ncpu0 9157911 921495 2612893 154977560 11422172 28476 165786 0\ncpu1 6418168 942192 2606403 157587735 11542773 26892 162429 0\nintr 2235256"...
ptr = 0x0
i = 2687360
status = 0
sb = {st_dev = 0, __pad1 = 0, st_ino = 0, st_mode = 0, st_nlink = 0,
st_uid = 0, st_gid = 0, st_rdev = 0, __pad2 = 0, st_size = 0,
st_blksize = 0, st_blocks = 0, st_atim = {tv_sec = 0, tv_nsec = 0},
st_mtim = {tv_sec = 0, tv_nsec = 0}, st_ctim = {tv_sec = 0, tv_nsec = 0},
__unused4 = 0, __unused5 = 0}
#2 0x080dfd20 in sigar_open (sigar=0x81fb200)
at agent/src/sigar/src/sigar.c:35
status = 2687360
#3 0x080677d3 in main_cmdline (argc=1, argv=0xffaf9c64)
at agent/src/merlin-agent.c:1865
print_uuid = 0
print_version = 0
option_ctx = (GOptionContext *) 0x81fb2d0
gerr = (GError *) 0x0
config_file = (gchar *) 0x81fbc50 "/opt/mysql/enterprise/agent/etc/mysql-service-agent.ini"
g = (global *) 0x81fb2d0
ret = 0
no_daemon = 1
entries = {{long_name = 0x80e8ecb "defaults-file",
short_name = 102 'f', flags = 0, arg = G_OPTION_ARG_FILENAME,
arg_data = 0xffaf9b1c,
description = 0x80e95e8 "location of the mysql-service-agent.ini",
arg_description = 0x80e8ed9 "<filename>"}, {
long_name = 0x80f9b84 "version", short_name = 86 'V', flags = 0,
arg = G_OPTION_ARG_NONE, arg_data = 0xffaf9b18,
description = 0x80e8ee4 "print version and exit", arg_description = 0x0}, {
long_name = 0x80e8efb "print-uuid", short_name = 117 'u', flags = 0,
arg = G_OPTION_ARG_NONE, arg_data = 0xffaf9b14,
description = 0x80e9610 "print the uuid entry for appending it to the configfile", arg_description = 0x0}, {long_name = 0x80e8f06 "no-daemon",
short_name = 68 'D', flags = 0, arg = G_OPTION_ARG_NONE,
arg_data = 0xffaf9b10,
description = 0x80e9648 "run agent in foreground for debugging",
arg_description = 0x0}, {long_name = 0x0, short_name = 0 '\0', flags = 0,
arg = G_OPTION_ARG_NONE, arg_data = 0x0, description = 0x0,
arg_description = 0x0}}
#4 0x08067930 in main (argc=4, argv=0xffaf9c64)
at agent/src/merlin-agent.c:1932
check_str = (const gchar *) 0xf7d48690 "\200\001)"
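The frames above (ptr = 0x0 in frame #1, p=0x0 in frame #0) are consistent with the following failure mode, sketched here under assumptions rather than taken from the SIGAR source: /proc/stat is read into a fixed-size buffer, the intr line alone overflows it, a later line is therefore missing from the buffer, and a strstr() lookup for that line returns NULL. The buffer size SMALL_BUF, the helper name find_btime_line, and the choice of btime as the searched key are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SMALL_BUF 64 /* hypothetical fixed buffer size, for illustration only */

/*
 * Hypothetical helper: copy at most bufsize-1 bytes of text into buf,
 * the way a read into a fixed-size buffer would, then search the copy
 * for the start of the btime line.
 * Returns a pointer into buf, or NULL if the line was cut off.
 */
const char *find_btime_line(const char *text, char *buf, size_t bufsize)
{
    size_t n;

    assert(bufsize > 0);
    n = strlen(text);
    if (n >= bufsize) {
        n = bufsize - 1; /* silent truncation: everything past here is lost */
    }
    memcpy(buf, text, n);
    buf[n] = '\0';

    /* NULL when the btime line did not fit; a caller that passes this
     * result on without checking it dereferences a NULL pointer. */
    return strstr(buf, "\nbtime");
}
```

With a short intr line the lookup succeeds; once the intr line pushes btime past the end of the buffer, the same call returns NULL.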
How to repeat:
Use a PCI Express card that is assigned a high interrupt number, then observe a /proc/stat like this:
cpu 15796624 1863687 5256913 313572511 23841171 55943 337632 0
cpu0 9269901 921495 2632803 155473539 11864899 28769 170525 0
cpu1 6526722 942192 2624109 158098972 11976271 27174 167107 0
intr 2253301706 1817966488 3481 0 13 11 0 5 0 0 0 0 0 4 0 63408846 0 58221846 12724222 0 0 [thousands of 0s] 0 0 0 0 0 300976790 0 0 0 0 0 0
ctxt 605988946
btime 1192808591
processes 414102
procs_running 1
procs_blocked 1
The device appears in /proc/interrupts as:
8409: 167154295 167134445 PCI-MSI-edge eth0
Suggested fix:
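Handle /proc/stat contents of arbitrary size: read the file into a buffer that is large enough or grows as needed, and check for a NULL pointer before passing it to sigar_skip_token().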
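One way to handle large buffers is sketched below, assuming the whole file may be held in memory. It is not the SIGAR code: read_all and parse_btime are hypothetical helper names, the initial capacity of 4096 bytes is arbitrary, and btime is used only as an example of a field that follows the long intr line.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: read an entire stream into a heap buffer that
 * doubles in size whenever it fills up.
 * Returns a NUL-terminated string the caller must free(), or NULL on error.
 */
char *read_all(FILE *fp)
{
    size_t cap = 4096, len = 0;
    char *buf = malloc(cap);

    assert(fp != NULL);
    if (buf == NULL) {
        return NULL;
    }
    for (;;) {
        /* Always leave one byte free for the terminating NUL. */
        size_t n = fread(buf + len, 1, cap - len - 1, fp);
        len += n;
        if (n == 0) {
            break; /* end of file or read error */
        }
        if (len == cap - 1) {
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }
    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}

/*
 * Hypothetical helper: extract the btime value from /proc/stat text.
 * Returns 0 on success, -1 if the line is missing or has no number,
 * instead of handing a NULL pointer to further parsing.
 */
int parse_btime(const char *stat_text, unsigned long *out)
{
    const char *p;
    char *end;

    if (strncmp(stat_text, "btime ", 6) == 0) {
        p = stat_text; /* btime is the very first line */
    } else {
        p = strstr(stat_text, "\nbtime ");
        if (p == NULL) {
            return -1; /* line absent: report it, do not dereference */
        }
        p++; /* skip the newline */
    }
    p += 6; /* skip "btime " */
    *out = strtoul(p, &end, 10);
    return (end == p) ? -1 : 0;
}
```

A caller would open /proc/stat with fopen(), pass the stream to read_all(), call parse_btime() on the result, and treat a -1 return as an error rather than continuing with an invalid pointer.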