Unexpected hammer stoppage

Help
Paul Edgar
2001-08-23
2001-08-29
  • Paul Edgar

    Paul Edgar - 2001-08-23

    I am attempting to run hammer against IBM's WebSphere application and the trade2 sample.

    I got the test to start, but after a short period of time, usually within the hour, the test stops after receiving two connection timeouts in a row, as shown below:

    Thu Aug 23 09:22:11 2001: Pid: 2465 Scenario: Sample Test Times: 1498 48057 48778 Size: 14767 ReturnCode: 200 Server: IBM_HTTP_SERVER/1.3.19  Apache/1.3.20 (Unix)

    failed to connect (9.3.192.4) - Connection timed out  (110)
    Thu Aug 23 09:22:38 2001:2450: connect failed (110 - Connection timed out)
    Thu Aug 23 09:22:38 2001:2450: Request (Connect) failed (9.3.192.4:80)
    Thu Aug 23 09:22:38 2001: Pid: 2450 Scenario: Sample Test Times: -1 -1 -1 Size: 0 ReturnCode: -1 Server: unknown
    failed to connect (9.3.192.4) - Connection timed out  (110)
    Thu Aug 23 09:22:52 2001:2516: connect failed (110 - Connection timed out)
    Thu Aug 23 09:22:52 2001:2516: Request (Connect) failed (9.3.192.4:80)
    Thu Aug 23 09:22:52 2001: Pid: 2516 Scenario: Sample Test Times: -1 -1 -1 Size: 0 ReturnCode: -1 Server: unknown   

    I have checked the server, and other test applications are still running successfully. I have attached the .conf and .scn files I am using. Any help in debugging this problem is appreciated.

    .conf
    #
    # HammerHead configuration file
    # HammerHead is a CGI testing rig
    #

    #
    # Location of scenarios to be read in
    # VALUE: String (name of a directory)
    #

    Scenario_Directory /etc/hammerhead

    #
    # Log file to record all the stuff done by HH
    #

    Log_Filename /tmp/hammer.log

    #
    # Number of sessions to simulate (threads)
    #

    Sessions 100

    #
    # IP number:port of machine to be hit
    #

    # zen
    Machine_IP 9.3.192.4:80

    #
    # Average time to sleep between requests
    # VALUE: integer in microseconds
    #

    Sleep_time 1

    Run_time 259200                        

    .scn
    NSample Test
    RPOST /WebSphereSamples/TradeSample/servlet/TradeScenarioServlet
    Bmessage=fodder&size=6&time=123
    T0
    .           

    Paul Edgar

     
    • Geoff Wong

      Geoff Wong - 2001-08-23

      Your config file looks pretty straightforward,
      although you have Sleep_time set pretty low
      (1 ms).

      Can you determine how hammer is stopping? Hammer should stop itself once max_failures is reached, but you haven't set it, so it shouldn't stop. Perhaps you could try setting it in the config file to a largish number as a test.

      It is possibly crashing; if so, it should dump a core (perhaps run ulimit -c <bignum> to ensure this). If it is, could you send me a backtrace (using gdb)?

      Sorry I can't be more helpful right now; I'm going skiing for a few days. Will look into it further when I get back.

      Geoff

       
    • Paul Edgar

      Paul Edgar - 2001-08-23

      Thanks for the feedback. Hope you have fun skiing.

      Anyway, I reduced the number of clients to 10 and set max_failures to 10 to see if I can get some more information.

      I am looking at adding a SIGINT handler to print the stats in case Ctrl-C is pressed, instead of just exiting.
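
The SIGINT idea above could look roughly like the following. This is a minimal sketch, not hammer's actual code: the `run_stats` structure, its fields, and the function names are hypothetical placeholders for whatever counters hammer keeps. The handler only sets a flag, and the main loop is assumed to check that flag and print the statistics before exiting.

```c
#include <signal.h>
#include <stdio.h>

/* Set by the handler. volatile sig_atomic_t is the type that is safe
   to write from inside a signal handler. */
volatile sig_atomic_t stop_requested = 0;

/* Handler: only record that Ctrl-C was pressed. Printing from here
   would not be async-signal-safe. */
void on_sigint(int signo)
{
    (void)signo;
    stop_requested = 1;
}

/* Install the handler. Returns 0 on success, -1 on failure. */
int install_sigint_handler(void)
{
    return signal(SIGINT, on_sigint) == SIG_ERR ? -1 : 0;
}

/* Hypothetical counters; hammer's real bookkeeping may differ. */
struct run_stats {
    unsigned long requests;
    unsigned long failures;
};

/* Called from the main loop once stop_requested is seen to be set. */
void print_stats(const struct run_stats *s)
{
    printf("requests: %lu, failures: %lu\n", s->requests, s->failures);
}
```

The main loop would then test `stop_requested` between requests, call `print_stats`, and exit normally.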

       
    • Paul Edgar

      Paul Edgar - 2001-08-23

      Upon further investigation, it looks as though the program is completing successfully but not running as long as we have specified with Run_time.

      The function usleep accepts microseconds as an unsigned int. If I am not mistaken, this limits the absolute sleep time to about 4 billion microseconds (roughly 71 minutes). In our tests, we are looking for run times in days, so the parameter of 259200 (three days in seconds) is being truncated and the actual test run is 1522 seconds.
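
The truncation can be checked with a few lines of C. This sketch assumes a 32-bit unsigned int and assumes that hammer multiplies the run time in seconds by 1000000 before calling usleep, as the `usleep(56*1000000)` call quoted later in this thread suggests. The function names are made up for illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define USECS_PER_SEC UINT32_C(1000000)

/* Microseconds that survive when seconds * 1000000 is computed in a
   32-bit unsigned type: the product wraps modulo 2^32. */
uint32_t wrapped_usecs(uint32_t seconds)
{
    return seconds * USECS_PER_SEC;
}

/* Largest whole number of seconds whose microsecond count still fits. */
uint32_t max_safe_seconds(void)
{
    return UINT32_MAX / USECS_PER_SEC;
}

/* Print the requested duration next to the effective one. */
void report_wrap(uint32_t seconds)
{
    uint32_t usecs = wrapped_usecs(seconds);
    printf("%" PRIu32 " s requested, %" PRIu32 " us after wrap (%" PRIu32 " s)\n",
           seconds, usecs, usecs / USECS_PER_SEC);
}
```

Under these assumptions, 259200 seconds wraps to 1501962240 microseconds, about 1502 seconds, which is in the same range as the 1522 seconds observed. Anything above 4294 seconds wraps.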

      We think we may have a workaround.

      Instead of calling usleep directly, we thought an internal function like psleep could be implemented.

      snippet:

      void psleep(unsigned sleep_secs)
      {
          for(int i = 0;i < sleep_secs; i++)
          {
             usleep(1000000);
          }
      }

      Every second the program will wake up just long enough to go back to sleep. The unsigned parameter would allow a calling function to pass in about 4 billion seconds, which would be a long time to sleep.

      We will do this for our own usage, but think it might be a worthwhile fix to an otherwise good program.
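
A possible alternative to the one-second loop, offered only as a sketch: POSIX specifies that the usleep argument should be less than one million, so `usleep(1000000)` may be rejected on some systems, and usleep can also return early when a signal arrives. The version below uses nanosleep, which takes whole seconds directly and reports the unslept remainder. The name `psleep_ns` is hypothetical and not part of hammer.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose nanosleep under strict -std modes */
#endif
#include <errno.h>
#include <time.h>

/* Sleep for sleep_secs whole seconds.
   Returns 0 once the full time has elapsed, -1 on an unexpected error. */
int psleep_ns(unsigned sleep_secs)
{
    struct timespec req;
    struct timespec rem;

    req.tv_sec = (time_t)sleep_secs;
    req.tv_nsec = 0;

    /* nanosleep returns -1 with errno == EINTR when a signal cuts the
       sleep short; rem then holds the time still left to sleep. */
    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR) {
            return -1;
        }
        req = rem;
    }
    return 0;
}
```

A caller that wants Ctrl-C to end the run early would return when it sees EINTR instead of resuming the sleep.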

      What do you guys think?

      Paul Edgar

       
    • Geoff Wong

      Geoff Wong - 2001-08-28

      Looks like a pretty sensible fix.
      I'll include it in the next update.

      Geoff

       
    • Paul Edgar

      Paul Edgar - 2001-08-29

      Actually, the code we have put in place is as follows:

      void psleep(unsigned sleep_secs)
      {
         for(unsigned i = 0;i < sleep_secs; i++)
         {
            usleep(1000000);
         }
      }

      It is called:

      psleep(56);

      instead of the old way:

      usleep(56*1000000);

      Paul Edgar
      Linux Technology Center
      IBM Corporation
      http://sourceforge.net/projects/ltp/

       
