Thread: Trying to simulate browser behavior.
From: Pranav D. <pra...@gm...> - 2008-07-01 19:55:59
Hello,

I am trying to simulate browser behavior for accessing a front page (e.g. www.cnn.com). From traces I see that a few requests go over the same TCP connection (persistence), and in general a few TCP connections are made to fetch the whole front page.

I am trying to use FRESH_CONNECT to create another TCP connection. What I was expecting was that all URLs before the FRESH_CONNECT tag would go on the same connection, and after the URL that has the FRESH_CONNECT tag a new connection would start. So I was expecting to see 2 GETs on the first TCP connection, 6 on the second, and the rest on the third. Basically, I was thinking of the URL list as sequential, with a connection close in between.

But that doesn't seem to be the case. There are 3 TCP connections, but one has most of the GETs and the other two have one request each (the ones for which the tag is specified).

So it seems like curl-loader loads all the URLs with their associated tags and then accesses them randomly. Is that correct? If so, is there a way to get a behavior similar to the one described above?

config file
-----------
BATCH_NAME=test_load
CLIENTS_NUM_MAX=1 # Same as CLIENTS_NUM
CLIENTS_NUM_START=1
CLIENTS_RAMPUP_INC=2
INTERFACE=eth1
NETMASK=16
IP_ADDR_MIN=12.0.0.1
IP_ADDR_MAX=12.0.16.250 # Actually - this is for self-control
CYCLES_NUM=1
URLS_NUM=14

########### URL SECTION ####################################

URL=http://192.168.55.205/websites/cisco/www.cisco.com/cdc_content_elements/flash/home/sp_072307/spotlight/sp_webEx_tn.jpg
URL=http://192.168.55.205/websites/cisco/www.cisco.com/cdc_content_elements/flash/home/sp_072307/spotlight/sp_CIN.swf
FRESH_CONNECT=1

URL=http://192.168.55.205/websites/cisco/www.cisco.com/cdc_content_elements/flash/home/sp_072307/spotlight/sp_webEx.swf
URL=http://192.168.55.205/websites/cisco/www.cisco.com/cdc_content_elements/flash/home/sp_072307/spotlight/sp_humanN_anthem_tn.jpg
URL=http://192.168.55.205/websites/MOO/backfeed10.gif
TIMER_AFTER_URL_SLEEP=2000-5000
URL=http://192.168.55.205/websites/MOO/bottom.gif
URL=http://192.168.55.205/websites/MOO/style3.css
URL=http://192.168.55.205/websites/MOO/topright.gif
FRESH_CONNECT=1

URL=http://192.168.55.205/websites/MOO/rss_smaller.gif
URL=http://192.168.55.205/websites/MOO/index.html.6
TIMER_AFTER_URL_SLEEP=2000-5000

URL=http://192.168.55.205/websites/cnn/www.cnn.com/.element/ssi/www/breaking_news/2.0/banner.html
URL=http://192.168.55.205/websites/cnn/www.cnn.com/.element/ssi/auto/2.0/sect/MAIN/ftpartners/partner.people.html
TIMER_AFTER_URL_SLEEP=2000-5000
URL=http://192.168.55.205/websites/cnn/www.cnn.com/.element/ssi/auto/2.0/sect/MAIN/ftpartners/partner.money.txt
URL=http://192.168.55.205/websites/cnn/www.cnn.com/.element/img/2.0/global/icons/video_icon.gif
From: Robert I. <cor...@gm...> - 2008-07-01 20:06:43
Hi Pranav,

On Tue, Jul 1, 2008 at 10:55 PM, Pranav Desai <pra...@gm...> wrote:
> Hello,
>
> I am trying to simulate browser behavior for accessing a front page
> (e.g. www.cnn.com). From traces I see that a few requests go over the
> same TCP connection (persistence), and in general a few TCP
> connections are made to fetch the whole front page.

You can also look at the behavior of major browsers like IE-6, IE-7, FF-2, FF-3 and Safari-3.1.

> So it seems like curl-loader loads all the URLs with their associated
> tags and then accesses them randomly. Is that correct? If so, is there
> a way to get a behavior similar to the one described above?

FRESH_CONNECT means that in the next cycle the connection should be closed and re-established.

What is the behavior that you see with the major browsers mentioned?

--
Truly,
Robert Iakobashvili, Ph.D.
......................................................................
www.ghotit.com
Assistive technology that understands you
......................................................................
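(For reference: at the libcurl level, "use a new connection for this transfer" and "close the connection after this transfer" are two separate options, CURLOPT_FRESH_CONNECT and CURLOPT_FORBID_REUSE. Whether curl-loader's FRESH_CONNECT tag maps onto exactly these options is an assumption here, not something confirmed in this thread; the sketch below only illustrates the libcurl semantics.)

    /* Minimal sketch of the libcurl options behind connection reuse.
     * Assumption: curl-loader's FRESH_CONNECT roughly corresponds to
     * these options; this is NOT taken from curl-loader's source.
     * Build: gcc sketch.c -lcurl */
    #include <curl/curl.h>

    int main(void)
    {
        curl_global_init(CURL_GLOBAL_ALL);
        CURL *h = curl_easy_init();
        if (!h)
            return 1;

        /* First transfer: opens a TCP connection and leaves it in the
         * handle's connection cache, so a later request may reuse it. */
        curl_easy_setopt(h, CURLOPT_URL, "http://192.168.55.205/websites/MOO/style3.css");
        curl_easy_perform(h);

        /* CURLOPT_FRESH_CONNECT: the next transfer must open a new
         * connection instead of reusing a cached one. */
        curl_easy_setopt(h, CURLOPT_FRESH_CONNECT, 1L);
        /* CURLOPT_FORBID_REUSE: close the connection when this transfer
         * is done, so no later request can ride on it. */
        curl_easy_setopt(h, CURLOPT_FORBID_REUSE, 1L);
        curl_easy_setopt(h, CURLOPT_URL, "http://192.168.55.205/websites/MOO/topright.gif");
        curl_easy_perform(h);

        curl_easy_cleanup(h);
        curl_global_cleanup();
        return 0;
    }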
From: Pranav D. <pra...@gm...> - 2008-07-02 01:09:00
On Tue, Jul 1, 2008 at 1:06 PM, Robert Iakobashvili <cor...@gm...> wrote:
> Hi Pranav,
>
> You can also look at the behavior of major browsers like IE-6, IE-7,
> FF-2, FF-3 and Safari-3.1.
>
> FRESH_CONNECT means that in the next cycle the connection should be
> closed and re-established.
>
> What is the behavior that you see with the major browsers mentioned?

In general, most of them will create a bunch of TCP connections and send multiple requests through each connection. I can send you a trace if you like. I am not trying to simulate any particular browser, just the way a browser normally fetches the main page of a website.

To put this in context, I am trying to load test a proxy, and would like to create/simulate thousands of users opening the main page of a bunch of popular websites.

What I do is get a trace on the browser side for a website, from which I can get the URLs and the sequence in which the browser fetched them to get the whole page. I also get the number of connections it used for the entire page. With that information I can create a curl-loader conf file with the same URLs and add a few FRESH_CONNECT tags to emulate the new TCP connections. That's how I thought FRESH_CONNECT would work.

I could just add a bunch of URLs from somewhere into the curl-loader conf, add a few FRESH_CONNECT and TIMER_AFTER_URL_SLEEP entries to the list, and probably get similar behavior, but I was hoping to replicate the browser as closely as possible.

Thanks for your help.

-- Pranav
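(A small helper along the lines of that workflow might look like the sketch below. It is hypothetical and not part of curl-loader: it reads one URL per line from stdin, as extracted from a browser trace, and emits a per-URL config section. Where to place FRESH_CONNECT=1 should really come from the trace; here it is inserted every few URLs purely as a placeholder.)

    /* urls2conf.c -- hypothetical helper, not part of curl-loader.
     * Reads one URL per line on stdin (e.g. extracted from a browser
     * trace) and prints a curl-loader style URL section on stdout.
     * Build: gcc -o urls2conf urls2conf.c
     * Usage: ./urls2conf < urls.txt >> test_load.conf */
    #include <stdio.h>
    #include <string.h>

    #define NEW_CONN_EVERY 6   /* placeholder: real boundaries come from the trace */

    int main(void)
    {
        char line[4096];
        int n = 0;

        while (fgets(line, sizeof(line), stdin)) {
            line[strcspn(line, "\r\n")] = '\0';    /* strip trailing newline */
            if (line[0] == '\0')
                continue;                          /* skip blank lines */

            printf("URL=%s\n", line);
            printf("URL_SHORT_NAME=\"url-%d\"\n", n);
            printf("REQUEST_TYPE=GET\n");
            printf("TIMER_AFTER_URL_SLEEP=0\n");
            if (++n % NEW_CONN_EVERY == 0)
                printf("FRESH_CONNECT=1\n");       /* placeholder boundary */
            printf("\n");
        }
        return 0;
    }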
From: Robert I. <cor...@gm...> - 2008-07-02 04:08:07
Hi Pranav,

On Tue, Jul 1, 2008 at 10:55 PM, Pranav Desai <pra...@gm...> wrote:
> Hello,
>
> I am trying to simulate browser behavior for accessing a front page
> (e.g. www.cnn.com). From traces I see that a few requests go over the
> same TCP connection (persistence), and in general a few TCP
> connections are made to fetch the whole front page.

I am not sure that the syntax of the URL section you are using is correct. Please try to re-write the configuration file, keeping for each URL section its own parameters, something like below:

URL=http://localhost/ACE-INSTALL.html
# http://localhost/apache2-default/ACE-INSTALL.html
URL_SHORT_NAME="ACE"
REQUEST_TYPE=GET
TIMER_URL_COMPLETION=0
TIMER_AFTER_URL_SLEEP=0
FRESH_CONNECT=1

URL=http://localhost/index.html
URL_SHORT_NAME="INDEX"
REQUEST_TYPE=GET
TIMER_URL_COMPLETION=0 # In msec. When positive, it is now enforced by cancelling the url fetch on timeout
TIMER_AFTER_URL_SLEEP=1000
FRESH_CONNECT=1

From the HTTP point of view, GET requests may still go via the same TCP/IP connection. What we should expect from FRESH_CONNECT=1 is that the connection will be closed and re-established at each loading cycle.

In general, connections are a matter of the libcurl library, which keeps a minimum of some 5-10 connections; this is governed by the MAX_CONNECTIONS flag logic.

What happens with the connection policy when you try, e.g., 50 or 100 virtual clients?

--
Truly,
Robert Iakobashvili, Ph.D.
......................................................................
Assistive technology that understands you
......................................................................
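(Applied to the first two URLs of Pranav's batch file, the per-URL layout suggested above would look roughly like this. The URL_SHORT_NAME values are invented for the example, and whether FRESH_CONNECT placed this way yields per-URL rather than per-cycle reconnection is exactly the open question in this thread.)

URL=http://192.168.55.205/websites/cisco/www.cisco.com/cdc_content_elements/flash/home/sp_072307/spotlight/sp_webEx_tn.jpg
URL_SHORT_NAME="cisco_tn"
REQUEST_TYPE=GET
TIMER_URL_COMPLETION=0
TIMER_AFTER_URL_SLEEP=0

URL=http://192.168.55.205/websites/cisco/www.cisco.com/cdc_content_elements/flash/home/sp_072307/spotlight/sp_CIN.swf
URL_SHORT_NAME="cisco_swf"
REQUEST_TYPE=GET
TIMER_URL_COMPLETION=0
TIMER_AFTER_URL_SLEEP=0
FRESH_CONNECT=1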