URL cycling with staggered URLs
Status: Alpha
Brought to you by: coroberti
From: BuraphaLinux S. <bur...@gm...> - 2007-06-24 10:28:26
The PROBLEM REPORTING FORM makes our support more effective. Please
subscribe to our mailing list here:
https://lists.sourceforge.net/lists/listinfo/curl-loader-devel
and mail the form to the mailing list: cur...@li...

CURL-LOADER VERSION: 0.32, released 21/06/2007

HW DETAILS (CPU/s and memory are a must):

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Pentium(R) 4 CPU 2.40GHz
stepping        : 9
cpu MHz         : 2394.071
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe up cid xtpr
bogomips        : 4792.96
clflush size    : 64

MemTotal:      1030596 kB
MemFree:        711240 kB
Buffers:          6860 kB
Cached:         228640 kB
SwapCached:          0 kB
Active:         108880 kB
Inactive:       185748 kB
HighTotal:      126960 kB
HighFree:         3472 kB
LowTotal:       903636 kB
LowFree:        707768 kB
SwapTotal:     1806328 kB
SwapFree:      1803836 kB
Dirty:              52 kB
Writeback:           0 kB
AnonPages:       59152 kB
Mapped:          41444 kB
Slab:            17744 kB
SReclaimable:     9248 kB
SUnreclaim:       8496 kB
PageTables:       1228 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:   2321624 kB
Committed_AS:   314736 kB
VmallocTotal:   114680 kB
VmallocUsed:      2128 kB
VmallocChunk:   112380 kB

LINUX DISTRIBUTION and KERNEL (uname -r): BLS 1.0.072
(http://www.buraphalinux.org/), kernel 2.6.21.5

GCC VERSION (gcc -v):

Using built-in specs.
Target: i586-pc-linux-gnu
Configured with: /tmp/gcc-4.0.4/configure --prefix=/usr --enable-shared
  --enable-threads=posix --enable-__cxa_atexit --with-gnu-as --with-gnu-ld
  --verbose --enable-languages=c,c++,f95 --mandir=/usr/man --infodir=/usr/info
  --disable-nls --disable-rpath --build=i586-pc-linux-gnu
  --target=i586-pc-linux-gnu --host=i586-pc-linux-gnu
Thread model: posix
gcc version 4.0.4

COMPILATION AND MAKING OPTIONS (if defaults changed): I had to apply this patch:

--- curl-loader-0.32.orig/Makefile      2007-06-11 19:40:03.000000000 +0700
+++ curl-loader-0.32/Makefile   2007-06-22 22:10:46.000000000 +0700
@@ -74,7 +74,7 @@
 LDFLAGS=-L./lib -L$(OPENSSLDIR)/lib
 
 # Link Libraries. RedHat/FC require sometimes lidn
-LIBS= -ldl -lpthread -lrt -lidn -lcurl -levent -lz -lssl -lcrypto #-lcares
+LIBS= -ldl -lpthread -lrt -lcurl -levent -lz -lssl -lcrypto #-lcares -lidn
 
 # Include directories
 INCDIR=-I. -I./inc -I$(OPENSSLDIR)/include

and I built with:

make OPT_FLAGS="-O2 -march=i586 -mtune=i686 -fno-strict-aliasing"

COMMAND-LINE: curl-loader -f monster.conf -v -u

CONFIGURATION-FILE (the most common source of problems):

Note: I changed URLs often, but was always using 6; the problem was
noticed when I had two 650MB ISO images in the list, one for FTP and one
for HTTP. I already changed this file before I knew you needed it, but
only changed the URLs.

Place the file inline here:

########### GENERAL SECTION ################################
BATCH_NAME= monster
CLIENTS_NUM_MAX=50   # Same as CLIENTS_NUM
CLIENTS_NUM_START=10
CLIENTS_RAMPUP_INC=10
INTERFACE=eth0
NETMASK=32
IP_ADDR_MIN=10.16.68.197
IP_ADDR_MAX=10.16.68.197
CYCLES_NUM=-1
URLS_NUM=6

########### URL SECTION ####################################
URL=http://10.16.68.186/ftp/openoffice/stable/2.2.1/OOo_2.2.1_Win32Intel_install_wJRE_en-US.exe
FRESH_CONNECT=1
URL_SHORT_NAME="url 1"
REQUEST_TYPE=GET
TIMER_URL_COMPLETION = 0   # In msec. When positive, enforced by cancelling the URL fetch on timeout
TIMER_AFTER_URL_SLEEP =1000
TIMER_TCP_CONN_SETUP=50

URL=ftp://anonymous:joe%040@10.16.68.186/debian/pool/main/g/gimp/gimp_2.2.15.orig.tar.gz
FRESH_CONNECT=1
URL_SHORT_NAME="url 2"
TIMER_URL_COMPLETION = 0   # In msec. When positive, enforced by cancelling the URL fetch on timeout
TIMER_AFTER_URL_SLEEP =1000
TIMER_TCP_CONN_SETUP=50

URL=http://10.16.68.186/ftp/ruby/1.8/ruby-1.8.6.tar.bz2
FRESH_CONNECT=1
URL_SHORT_NAME="url 3"
REQUEST_TYPE=GET
TIMER_URL_COMPLETION = 0   # In msec. When positive, enforced by cancelling the URL fetch on timeout
TIMER_AFTER_URL_SLEEP =1000
TIMER_TCP_CONN_SETUP=50

URL=ftp://anonymous:joe%040@10.16.68.186/apache/ant/binaries/apache-ant-1.7.0-bin.tar.bz2
FRESH_CONNECT=1
URL_SHORT_NAME="url 4"
TIMER_URL_COMPLETION = 0   # In msec. When positive, enforced by cancelling the URL fetch on timeout
TIMER_AFTER_URL_SLEEP =1000
TIMER_TCP_CONN_SETUP=50

URL=http://10.16.68.186/ftp/ftp.postgresql.org/postgresql-8.2.4.tar.bz2
FRESH_CONNECT=1
URL_SHORT_NAME="url 5"
REQUEST_TYPE=GET
TIMER_URL_COMPLETION = 0   # In msec. When positive, enforced by cancelling the URL fetch on timeout
TIMER_AFTER_URL_SLEEP =1000
TIMER_TCP_CONN_SETUP=50

URL=ftp://anonymous:joe%040@10.16.68.186/apache/httpd/httpd-2.2.4.tar.bz2
FRESH_CONNECT=1
URL_SHORT_NAME="url 6"
TIMER_URL_COMPLETION = 0   # In msec. When positive, enforced by cancelling the URL fetch on timeout
TIMER_AFTER_URL_SLEEP =1000
TIMER_TCP_CONN_SETUP=50

DOES THE PROBLEM AFFECT:
COMPILATION? No
LINKING? No
EXECUTION? Yes
OTHER (please specify)? See QUESTION below

Have you run $make cleanall prior to $make? No

DESCRIPTION:

I have noticed that the disk drive on my server is not very active during
testing with curl-loader. I looked at the curl-loader log file and I think
I know what is happening, but not how to change it. Let me describe what I
think it is doing, and then what I would like it to do.

What do I think it is doing now? If I cycle through N URLs with 100
clients, curl-loader will set up all 100 clients to process the first URL,
then have them all do the second URL, then the third, and so on. This
means that all clients are normally fetching the same file (I am using
100MB files for testing), so I am testing the network, but since every
client pulls the same file, all but one of them are just reading the
server's cached copy. It also stresses either HTTP or FTP (whichever the
current URL is), but not both. Am I wrong?

QUESTION/ SUGGESTION/ PATCH:

What I want: if I have N URLs and many clients, I would like curl-loader
to do this:

if (process % N) == 0 then start on URL 0
if (process % N) == 1 then start on URL 1
if (process % N) == 2 then start on URL 2

and so on (and then, if I have more processes than URLs, wrap back to
URL 0 when I reach URL N-1). A sketch of what I mean follows.
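In C-like pseudocode, the whole request is just this (a sketch only: I
have not read the curl-loader source, so client_id, urls_num and
url_index are invented names, not real curl-loader internals):

/* Staggered start: client 0 begins on URL 0, client 1 on URL 1, ...,
 * and client N wraps around to begin on URL 0 again. */
int start_url(int client_id, int urls_num)
{
    return client_id % urls_num;
}

/* Cycling: after URL N-1, wrap back to URL 0. */
int next_url(int url_index, int urls_num)
{
    return (url_index + 1) % urls_num;
}

With my 50 clients and 6 URLs, clients 0, 6, 12, ... would start on the
first URL, clients 1, 7, 13, ... on the second, and so on, so at any
moment the batch is pulling all six files instead of one.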
Why do I want this? If I have a large set of URLs (too big for the
server's file cache), I can force the server to work hard at loading
files from disk and get a more realistic load for my server (which will
be a mirror archive that I expect many people to use as their mirror
source). With staggered URLs, the file a client wants is probably NOT in
the cache, and with a collection of ISO images the filesystem cache
cannot hold everything, so the disk will be busy.

Can curl-loader do this already? Maybe it can, but I could not find this
in the sample configurations or the README, and I could not find a man
page to read. It's probably in there somewhere and I missed it? My
workaround is to run many separate instances of curl-loader at once
instead of one large, combined load, but then all the statistics and
logs are separate.
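For the record, the workaround amounts to something like this launcher
(a sketch only, not tested code: monster1.conf through monster6.conf are
made-up file names; each would hold the same six URLs rotated by one
position, plus its own BATCH_NAME, so every instance starts on a
different file):

/* Hypothetical launcher: one curl-loader instance per rotated config. */
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void)
{
    const char *configs[] = { "monster1.conf", "monster2.conf",
                              "monster3.conf", "monster4.conf",
                              "monster5.conf", "monster6.conf" };
    int i, n = sizeof(configs) / sizeof(configs[0]);

    for (i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0) {                      /* child: one instance */
            execlp("curl-loader", "curl-loader",
                   "-f", configs[i], "-v", "-u", (char *)0);
            perror("execlp");                /* reached only if exec failed */
            _exit(1);
        }
    }
    while (wait(NULL) > 0)                   /* wait for every instance */
        ;
    return 0;
}

It does the job, but as said above, each instance then writes its own
logs and statistics, and I have to add them up by hand.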