#27 segfault when displaying data

Milestone: Database
Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2006-03-06
Created: 2006-03-06
Creator: Luis M
Private: No

Until a few days ago I was able to go to:

/nagios/cgi-bin/perfparse.cgi?all_bin=1&group_name=**ALL**

with no issues.

I started getting segfaults yesterday, and the page gets
cut off at a particular host. I purged all data in the db
for the last host printed, and after that things started
working again.

Now (today), I got the same segfault at a different host.

Running perfparse.cgi from the command line, I get the
following:

QUERY_STRING='all_bin=1&group_name=**ALL**' /usr/lib/nagios/cgi/perfparse.cgi

[a bunch of HTML... good stuff]

<td bgcolor="#FFFFFF"><font face="Arial, Helvetica"
size=2>&nbsp;runtime=s&nbsp;</td>

<td bgcolor="#FFFFFF"><font face="Arial, Helvetica"
size=2>&nbsp;UP&nbsp;</td></tr>

Segmentation fault
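
Setting QUERY_STRING in the environment reproduces the web request because a CGI program receives its GET parameters through that variable rather than through argv. A minimal sketch of such a lookup in C follows. Note that query_param and query_param_env are hypothetical helpers written for illustration, not functions from the perfparse sources, and that the sketch does no URL decoding.

```c
#include <stdlib.h>   /* getenv */
#include <string.h>   /* strchr, strlen, strncmp, memcpy */

/*
 * Hypothetical helper (not from perfparse): copy the value of `key` from a
 * query string such as "all_bin=1&group_name=**ALL**" into `out`.
 * Returns 1 if the key was found, 0 otherwise. No percent-decoding is done.
 */
int query_param(const char *query, const char *key, char *out, size_t outlen)
{
    size_t keylen;
    const char *p = query;

    if (query == NULL || key == NULL || out == NULL || outlen == 0)
        return 0;
    keylen = strlen(key);

    while (*p != '\0') {
        /* Each key=value pair runs up to the next '&' or the end. */
        const char *end = strchr(p, '&');
        size_t pairlen = end ? (size_t)(end - p) : strlen(p);

        if (pairlen > keylen && strncmp(p, key, keylen) == 0
            && p[keylen] == '=') {
            size_t vallen = pairlen - keylen - 1;
            if (vallen >= outlen)
                vallen = outlen - 1;   /* truncate rather than overflow */
            memcpy(out, p + keylen + 1, vallen);
            out[vallen] = '\0';
            return 1;
        }
        p += pairlen;
        if (*p == '&')
            p++;
    }
    return 0;
}

/* Read the parameter from the environment, as a CGI program would. */
int query_param_env(const char *key, char *out, size_t outlen)
{
    return query_param(getenv("QUERY_STRING"), key, out, outlen);
}
```

With the command line above, query_param_env("group_name", buf, sizeof buf) would fill buf with **ALL**, which is why the crash can be reproduced outside the web server.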

strace gave me:

write(1, " <td bgcolor=\"#FFFFFF\"><font fa"..., 104
<td bgcolor="#FFFFFF"><font face="Arial, Helvetica"
size=2><nobr>&nbsp;2006-03-02 14:00:50&nbsp;</td>

) = 104

write(1, " <td bgcolor=\"#FFFFFF\" align=ri"..., 92
<td bgcolor="#FFFFFF" align=right><font face="Arial,
Helvetica" size=2>&nbsp;0&nbsp;</td>

) = 92

write(1, " <td bgcolor=\"#FFFFFF\"><font fa"..., 88
<td bgcolor="#FFFFFF"><font face="Arial, Helvetica"
size=2>&nbsp;runtime=s&nbsp;</td>

) = 88

write(1, " <td bgcolor=\"#FFFFFF\"><font fa"..., 81
<td bgcolor="#FFFFFF"><font face="Arial, Helvetica"
size=2>&nbsp;UP&nbsp;</td>

) = 81

write(1, "</tr>\n", 6</tr>

) = 6

--- SIGSEGV (Segmentation fault) @ 0 (0) ---

+++ killed by SIGSEGV +++

And finally, gdb gave me:

$> QUERY_STRING='all_bin=1&group_name=**ALL**' gdb /usr/lib/nagios/cgi/perfparse.cgi

<td bgcolor="#FFFFFF"><font face="Arial, Helvetica"
size=2>&nbsp;UP&nbsp;</td></tr>

Program received signal SIGSEGV, Segmentation fault.
0x0017e045 in mysql_fetch_row () from /usr/lib/mysql/libmysqlclient.so.14
(gdb) bt
#0 0x0017e045 in mysql_fetch_row () from /usr/lib/mysql/libmysqlclient.so.14
#1 0x0804bab8 in displayAllBin () at cgi_bin_report.c:125
#2 0x0804a4c5 in main (argc=1, argv=0xbfd1ee14) at perfgraph.c:179
(gdb)

I ran make clean, then re-configured, recompiled, and
reinstalled everything to make sure I wasn't hitting a
mismatched library somewhere. Same issue.

In cgi_bin_report.c I made the following changes, which
made things a bit better:

169
170     printf("%s\n","</tr>");
171
172   }
173
174   printf("%s\n","</table>");
175
176
177 }

Before that, the crash was in line 170 of that file.
I'm sure that shouldn't matter because the crash seems
to come from the mysql client library itself.

$> profile-computer

#==============================================================================#

# profile-computer 1.15 Luis Mondesi <lemsx1@gmail.com>

# http://lems.kiskeyix.org/toolbox/?f=profile-computer&d=1

#==============================================================================#

Host Name: venus.dev.americanhm.com

System Kernel: Linux venus.dev.americanhm.com
2.6.14-1.1653_FC4smp #1 SMP Tue Dec 13 21:46:01 EST
2005 i686 i686 i386 GNU/Linux

#==============================================================================#

CPU Info: Pentium III (Coppermine)

Total Processors: 2

Bogomips total: 3383

#==============================================================================#

Memory: 1034384 kB

Virtual Memory (swap): 2031608 kB

#==============================================================================#

Host bridge: Intel Corporation 440GX - 82443GX Host bridge

PCI bridge: Intel Corporation 440GX - 82443GX AGP bridge

SCSI storage controller: Adaptec AIC-7896U2/7897U2

SCSI storage controller: Adaptec AIC-7896U2/7897U2

Ethernet controller: Intel Corporation 82557/8/9
[Ethernet Pro 100] (rev 08)

ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA
(rev 02)

IDE interface: Intel Corporation 82371AB/EB/MB PIIX4
IDE (rev 01)

USB Controller: Intel Corporation 82371AB/EB/MB PIIX4
USB (rev 01)

Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 02)

VGA compatible controller: Cirrus Logic GD 5480 (rev 23)

PCI bridge: Digital Equipment Corporation DECchip 21150
(rev 06)

PCI bridge: Texas Instruments PCI2031 (rev 01)

#==============================================================================#

LSB_VERSION: 1.3

#==============================================================================#

Library: libc6

Compiler Version: gcc (GCC) 4.0.2 20051125 (Red Hat
4.0.2-8)

#==============================================================================#

/proc/cmdline

ro root=/dev/VolGroup00/LogVol00

#==============================================================================#

Discussion

  • Luis M

    Luis M - 2006-03-06

    Logged In: YES
    user_id=239796

    I downgraded the MySQL server, client binaries, and devel
    packages for FC4 from the current RPM (from updates), 4.1.16,
    to 4.1.15 and then to 4.1.14, to no avail.

    I made sure I recompiled perfparse.cgi every time I did a
    downgrade.

    Here is the output of perfparse-log2mysql --show_config:

    Perfparse-log2* [options]

    # File where Perfparse logs messages
    # Error_Log = "string"
    Error_Log = "/var/log/nagios/perfparse.log"

    # Rotate Perfparse log files
    # Error_Log_Rotate = "Y/N"
    Error_Log_Rotate = "Yes"

    # When perfparse cannot parse a line, it drops it to that file
    # Drop_File = "string"
    Drop_File = "/tmp/perfparse.drop"

    #
    # Drop_File_Rotate = "Y/N"
    Drop_File_Rotate = "Yes"

    # Log source from nagios (or other tools) that perfparse will scan
    # Authorized values: a file name, '-' for stdin, '|' for a fifo and '>' for a host:port socket
    # For sockets, a command 'history' will be sent before retreiving the data
    # Service_Log = "string"
    Service_Log = "/var/log/nagios/serviceperf.log"

    # Save the read position in the nagios log file ? If yes, perfparse will start from that position instead of from the beginning
    # Service_Log_Save_Position = "Y/N"
    Service_Log_Save_Position = "No"

    # Path for files containing the read position for nagios log files
    # Service_Log_Position_Mark_Path = "string"
    Service_Log_Position_Mark_Path = "/var/tmp"

    # Start timestamp for history retreiving (positive is absolute, negative is relative to end tm)
    # History_Start_Tm = "value"
    History_Start_Tm = "-86400"

    # End timestamp for history retreiving (positive is absolute, negative is relative to Now)
    # History_End_Tm = "value"
    History_End_Tm = "-30"

    # Show status bar when running
    # Show_Status_Bar = "Y/N"
    Show_Status_Bar = "no"

    # Print a report at the end of the processing
    # Do_Report = "Y/N"
    Do_Report = "no"

    # Dummy hostname if gethostname() does not work
    # Dummy_Hostname = "string"
    Dummy_Hostname = "localhost"

    # Don't store raw data
    # No_Raw_Data = "Y/N"
    No_Raw_Data = "no"

    # Don't store bin data
    # No_Bin_Data = "Y/N"
    No_Bin_Data = "no"

    # Path where storage modules are
    # Storage_Modules_Dir = "string"
    Storage_Modules_Dir = "/usr/lib"

    # Modules to load (Coma separated values)
    # Storage_Modules_Load = "string"
    Storage_Modules_Load = "mysql"

    # Storage Module : mysql
    # ==============================

    # Database user
    # DB_User = "string"
    DB_User = "nagios"

    # Database password
    # DB_Pass = "string"
    DB_Pass = "nagios"

    # Database name
    # DB_Name = "string"
    DB_Name = "nagios"

    # Database hostname
    # DB_Host = "string"
    DB_Host = "127.0.0.1"

    The following line is printed to the screen every time:

    2006/03/06 11:15:16 [ storage.c:95 27013 ] storage_mysql module successfully loaded

     
  • Luis M

    Luis M - 2006-03-06

    Logged In: YES
    user_id=239796

    OK, I'm making progress on this.
    I took the .src.rpm package for 5.0.18 from dev.mysql.com
    and compiled it on the local host (rpmbuild --rebuild ...).
    Once that was done, I recompiled the perl-DBD-mysql module
    and restarted nagios. Then I recompiled perfparse against
    the new mysqlclient library and, lo and behold, it worked
    fine, at least for the all_bin=1&group_name=**ALL** page.
    Some graphs work and others do not.

    For the graphs that don't work, I tried purging the db for
    the host:

    $> /usr/bin/perfparse-db-purge

    An error occured with the SQL:

    "DELETE perfdata_service FROM
    perfdata_service,perfdata_host WHERE
    perfdata_service.host_name = perfdata_host.host_name AND
    perfdata_host.is_deleted = 1"

    Failure Message:

    "Cannot delete or update a parent row: a foreign key
    constraint fails (`nagios/perfdata_service_raw`, CONSTRAINT
    `perfdata_service_raw_ibfk_1` FOREIGN KEY (`host_name`,
    `service_description`) REFERENCES `perfdata_service`
    (`host_name`, `service_description`))"

    That got me to a dead end. Not sure how to fix this.

    I tried copying and pasting the URL for one of the binary
    graphs that don't work (the one that segfaults):

    QUERY_STRING='graph=1&host=DNS1&service=DNS+Checks&metric=%5F' gdb /usr/lib/nagios/cgi/perfparse.cgi

    <BODY BGcolor="#EEFFFF" TEXT="#000000" LINK="#000000"
    VLINK="#000000" ALINK="#000000" onload="isAbsRelVisible()">

    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread -1209047360 (LWP 9549)]
    0x00b34ad9 in isNull (iCol=0) at dbms.c:72
    warning: Source file is more recent than executable.
    72
    (gdb) bt
    #0 0x00b34ad9 in isNull (iCol=0) at dbms.c:72
    #1 0x08053c7e in getRange (dMaxRangeLocal=0x8191478, dMinRangeLocal=0x8191498) at cgi_graph.c:723
    #2 0x080547f9 in displayGraphHeader () at cgi_graph.c:409
    #3 0x0804bf07 in main (argc=1, argv=0xbf805274) at perfgraph.c:162

    I saw that the error comes from dbms.c:72, so I changed
    those lines to:

    63 int iData(int iCol)
    64 {
    65     if (result_row==NULL)
    66         return 0;
    67     if (result_row[iCol] && result_row[iCol][0])
    68         return atoi(result_row[iCol]);
    69     else
    70         return 0;
    71 }
    72
    73 int isNull(int iCol)
    74 {
    75     if (result_row==NULL)
    76         return FALSE;
    77
    78     if (result_row[iCol] == NULL)
    79         return TRUE;
    80     return FALSE;
    81 }
    82
    83 char *sData(int iCol)
    84 {
    85     if (result_row==NULL)
    86         return "";
    87
    88     if (result_row[iCol] && result_row[iCol][0])
    89         return result_row[iCol];
    90     else
    91         return "";
    92 }

    After that the program doesn't crash, but gives me blank
    graphs ;-)

    Inching closer...
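
    The guards above can be tried in isolation. The sketch below is a standalone approximation, not the real dbms.c: result_row is declared as a plain char ** standing in for the MYSQL_ROW the real file holds (in the MySQL C API, MYSQL_ROW is a char **, and mysql_fetch_row() returns NULL when no row is available), TRUE and FALSE are defined locally, and sData returns const char * instead of char *. It still does not check iCol against the number of columns.

```c
#include <stdlib.h>   /* atoi */

#ifndef TRUE
#define TRUE 1
#endif
#ifndef FALSE
#define FALSE 0
#endif

/* Stand-in for the current row; NULL means "no row was fetched". */
char **result_row = NULL;

/* Integer value of column iCol, or 0 for no row, NULL, or empty string. */
int iData(int iCol)
{
    if (result_row == NULL)
        return 0;
    if (result_row[iCol] && result_row[iCol][0])
        return atoi(result_row[iCol]);
    return 0;
}

/* TRUE only when a row exists and its column iCol is SQL NULL. */
int isNull(int iCol)
{
    if (result_row == NULL)
        return FALSE;
    return result_row[iCol] == NULL ? TRUE : FALSE;
}

/* String value of column iCol, or "" for no row, NULL, or empty string. */
const char *sData(int iCol)
{
    if (result_row == NULL)
        return "";
    if (result_row[iCol] && result_row[iCol][0])
        return result_row[iCol];
    return "";
}
```

    With these guards a missing row is indistinguishable from a row of zeros or empty strings, so a query that returns no rows yields default values instead of a crash.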

     
  • Luis M

    Luis M - 2006-03-06

    Logged In: YES
    user_id=239796

    I got the latest code from CVS; it compiled fine and fixed
    all my problems.

    You should probably release a bug-fix release or a major
    release as soon as possible.

     
