Re: [Fwd: Re: [SSI-users] frozen cluster when process migrating?]
From: Roger T. <rog...@gm...> - 2005-11-17 02:50:50
Hi Maurice,

The latest sk98lin driver for kernel 2.4/2.6 is at least v8.24.1.3. You
would want to download the driver from syskonnect.com; if that address
doesn't work, try an online search for "SysKonnect".

Roger

On 11/16/05, Maurice Libes <Mau...@co...> wrote:
> Roger Tsang wrote:
> > Hi Maurice,
> >
> > Looks like you have found the problem.
>
> Hmm, let's say I have identified where it comes from, but I don't know
> why. If I send you a kernel debugging file, will you get better
> information in order to understand what the problem is with this card?
>
> I am nevertheless unsatisfied, since I haven't entirely solved the
> problem (running at gigabit speed). For the moment I run the cluster
> on my old 100 Mb switch.
>
> > Are you using the latest sk98lin driver from SysKonnect? I bet the
> > driver that comes with your card is outdated.
>
> No, I use the sk98lin driver that comes with the OpenSSI distribution ;-)
> (Is there a newer version of sk98lin in openssi-1.9?)
>
> /lib/modules/2.4.22-1.2199.nptl-ssi-686-smp/kernel/drivers/net/sk98lin/sk98lin.o
>
> #define BOOT_STRING "sk98lin: Network Device Driver v6.15\n" \
>                     "(C)Copyright 1999-2003 Marvell(R)."
> #define VER_STRING  "6.15"
>
> I will find and try a newer version...
>
> Does anybody on the list use this sk98lin driver with openSSI-1.2 on a
> gigabit switch? I am surprised to be alone in this case.
>
> ML
>
> > Roger
> >
> > On 11/14/05, Maurice Libes <Mau...@co...> wrote:
> > > Roger Tsang wrote:
> > > > Hi,
> > > >
> > > > I think one way of making sure the process doesn't jump back
> > > > and forth between node1 and node4, which may be causing your
> > > > problems, is to use the `migrate` command in your performance
> > > > test. Surely the migrate command will not cause your process to
> > > > jump back to node1 - as long as symphonie is not on the
> > > > loadlevel list.
> > >
> > > Hi... there is some news. I made some tests with an old 100 Mb/s
> > > switch. When I replace my brand-new, well-performing Netgear
> > > gigabit switch with an old 100 Mb switch, the problems change:
> > >
> > > 1. Migration times of the symphonie process are still long, but
> > > they seem to be constant across the nodes and normal for a
> > > 100 Mb/s link.
> > >
> > > E.g. for a process of 325 MB it takes about 35 s, which is
> > > approximately normal for a speed of 90 Mb/s (I measured speeds of
> > > 90-95 Mb/s between nodes with ttcp): 325*8 / 90 = 29 s
> > > (=> is my flow calculation correct?)
> > >
> > > I migrated the symphonie process between every node (with the
> > > 100 Mb switch) and this gave, every time, migration times of
> > > 30-35 s in all directions. So it's better and more regular than
> > > the 15 minutes I get with my gigabit switch ;-)
> > >
> > > 1131965796 mig   :pid 67797(symphonie) -> node 4 mem 14680  my load 41 node4 load 26
> > > 1131965821 mig:  :pid 67797(symphonie) <- node 1 mem 404636 my load 1  node1 load 11
> > >
> > > (The transfer from node 1 to 4 is now 25 s.)
> > >
> > > And in the opposite direction, from node 4 to 1 (39 s):
> > >
> > > 1131965900 mig   :pid 67797(symphonie) -> node 1 mem 404620 my load 98 node1 load 6
> > > 1131965939 mig:  :pid 67797(symphonie) <- node 4 mem 13980  my load 0  node4 load 29
> > >
> > > What do you think of this?
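As a sanity check of the arithmetic above (a sketch only; it uses the
decimal megabytes the thread's own estimate uses, and the 600 Mbit/s
figure is the ttcp measurement quoted later for the gigabit switch):

    # expected transfer time: size [MB] * 8 / throughput [Mbit/s]
    awk 'BEGIN {
        mb = 325                                        # process size from top
        printf "at  90 Mbit/s: %4.1f s\n", mb * 8 / 90  # -> 28.9 s
        printf "at  95 Mbit/s: %4.1f s\n", mb * 8 / 95  # -> 27.4 s
        printf "at 600 Mbit/s: %4.1f s\n", mb * 8 / 600 # ->  4.3 s
    }'

So ~29 s at 90 Mbit/s is correct, the measured 25-39 s is in the
expected range for the 100 Mb switch, and at the ~600 Mbit/s the
gigabit link actually delivers, the same image should move in roughly
4-5 s, which matches the "5-6 seconds" expectation cited later in the
thread.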
> > > Since a friend of mine has the same Netgear gigabit switch on an
> > > openSSI cluster, I guess it could be a problem related to the
> > > sk98lin NIC driver at gigabit speed? Or the kernel? Strange,
> > > since there are no error messages related to the network:
> > >
> > > - netstat -i seems OK (see below)
> > > - ttcp gives network speeds of ~600 Mb/s (rather low for a
> > >   gigabit private network) but speeds of 90-93 Mb/s with the old
> > >   100 Mb/s switch
> > > - I looked into /proc/net/sk98lin/eth0 and everything seemed
> > >   normal
> > >
> > > I would like to send kernel debugging information to you, but I
> > > have never done that and don't know how; I don't know how to use
> > > kdb.
> > >
> > > - Is the openssi kernel prepared for debugging purposes?
> > > - Is there a specific package to get on Debian?
> > > - Do I just reboot with kdb=on among the boot line parameters on
> > >   each node? And then what? What must I do to get the debug
> > >   lines? I didn't find the kdb command.
> > >
> > > Sorry to ask you that... is there a howto for kernel debugging?
> > >
> > > Thanks
> > >
> > > ML
> > >
> > > > Roger
> > > >
> > > > PS. Thanks. Postcard, wine? It is too much. :-)
> > > >
> > > > On 11/10/05, Maurice Libes <Mau...@co...> wrote:
> > > > > Roger Tsang wrote:
> > > > > > Hi,
> > > > > >
> > > > > > Maybe you have saturated node1's bandwidth with the
> > > > > > migrations going on.
> > > > >
> > > > > Hmm, I don't think so (there was only one process running,
> > > > > taking 320 MB of RAM and 50% CPU)... but to eliminate this
> > > > > possibility I will force one migration again with node 1 free
> > > > > of load, right now, below.
> > > > >
> > > > > > I assume you were migrating multiple instances of symphonie?
> > > > >
> > > > > No, just ONE! (There was one big process locked on node 1
> > > > > (the init node), and another one (symphonie) on node 4, for
> > > > > which I forced the migration towards node 1, 2 or 3.)
> > > > >
> > > > > > What is the traffic on node1's network card
> > > > > > before/during/after symphonie migration from node1? Is
> > > > > > there enough remaining bandwidth/cpu/interrupts on node1 to
> > > > > > support your symphonie migration?
> > > > > >
> > > > > > Can you not run anything on the cluster that can
> > > > > > load-balance when testing the migration problem?
> > > > >
> > > > > Yes, I am making this test right now (nothing on node 1), and
> > > > > I launch symphonie on node 4...
> > > > >
> > > > > I type `loadlevel -p 265684` and there is no need to force a
> > > > > migration; it is loadleveled some seconds later, because
> > > > > node 1 is better:
> > > > >
> > > > > 1131646876 loadlb:pid 265684(symphonie) <- node 4 mem 12924 my load 0  node4 load 72
> > > > >
> > > > > 1131646852 loadbl:pid 265684(symphonie) -> node 1 mem 74028 my load 96 node1 load 11
> > > > >
> > > > > 1131646852-1131646876 = 24 s (better than 180 s or a total
> > > > > freeze, but not totally satisfactory; it should take 5-6
> > > > > seconds)
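The timestamp subtraction above can be scripted. A minimal sketch,
assuming the /proc/cluster/loadlog format shown in this thread (epoch
seconds in the first field, "->" on the departure line logged by the
source node, "<-" on the arrival line logged by the destination node);
migtime.sh is a hypothetical helper, not an OpenSSI tool:

    #!/bin/sh
    # Usage: migtime.sh <pid> <src-node> <dst-node>
    pid=$1 src=$2 dst=$3
    # last departure of <pid> logged on the source node
    t0=$(onnode "$src" cat /proc/cluster/loadlog |
         awk -v p="$pid" '$0 ~ ":pid " p && / -> / { t = $1 } END { print t }')
    # last arrival of <pid> logged on the destination node
    t1=$(onnode "$dst" cat /proc/cluster/loadlog |
         awk -v p="$pid" '$0 ~ ":pid " p && / <- / { t = $1 } END { print t }')
    echo "pid $pid: $((t1 - t0)) s from node $src to node $dst"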
> > > > > The process is taking 325 MB of RAM, and swap space is not
> > > > > affected:
> > > > >
> > > > >    PID NODE USER PR NI VIRT RES  SHR S %CPU    TIME+ COMMAND
> > > > > 265684    1 root 25  0 325m 323m 524 R 99.9 30:17.41 symphonie
> > > > >
> > > > > root@comclust5:~# free
> > > > >              total    used    free  shared buffers  cached
> > > > > Mem:       1025816 1012068   13748       0   49504  200804
> > > > > -/+ buffers/cache:  761760  264056
> > > > > Swap:      4096564   58688 4037876
> > > > >
> > > > > Here is the ttcp test while the process is running on node 1:
> > > > >
> > > > > nttcp -T -n 819200 -r comclust5
> > > > >        Bytes Real s  CPU s Real-MBit/s CPU-MBit/s   Calls Real-C/s CPU-C/s
> > > > > l-939524096   47.27  42.50    567.9044   631.6128 1172745 24810.69 27594.0
> > > > > 1-939524096   47.27  12.39    567.8948  2166.5493  819200 17330.77 66117.8
> > > > >
> > > > > nttcp -T -n 819200 comclust5
> > > > >        Bytes Real s  CPU s Real-MBit/s CPU-MBit/s   Calls Real-C/s CPU-C/s
> > > > > l-939524096   42.45  18.23    632.4208  1472.4929  819200 19299.95 44936.9
> > > > > 1-939524096   42.45  35.63    632.3907   753.3973 1817301 42812.68 51004.8
> > > > >
> > > > > It seems to be 560-632 Mbit/s in both directions.
> > > > >
> > > > > > Try running one instance of symphonie on node1. Then make
> > > > > > it migrate to node2.
> > > > >
> > > > > Yes, I do all you ask... symphonie is on node 1 (since the
> > > > > migration above):
> > > > >
> > > > > cat /proc/265684/where
> > > > > 1
> > > > >
> > > > > migrate 4 265684
> > > > >
> > > > > ==> Here it is... we go to hell... it's frozen! (Look until
> > > > > the end.)
> > > > >
> > > > > $ top
> > > > > ==> nothing
> > > > >
> > > > > $ onnode 4 cat /proc/cluster/loadlog | grep sympho
> > > > >
> > > > > [This is the last line logged, and it doesn't correspond to
> > > > > the migrate operation:]
> > > > >
> > > > > 1131646852 loadbl:pid 265684(symphonie) -> node 1 mem 74028 my load 96 node1 load 11
> > > > >
> > > > > ssh comcluster -l root
> > > > > Password:*****
> > > > > => no answer (frozen)
> > > > >
> > > > > > See how fast/slow that is and how much traffic it took by
> > > > > > looking at `netstat -i` output.
> > > > >
> > > > > At the same time (netstat -i on node 4; on node 1 I can't):
> > > > >
> > > > > root@comclust4# netstat -i
> > > > > Kernel Interface table
> > > > > Iface  MTU   Met    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
> > > > > eth0    1500 0   63443696      0      0      0 64300702      0      0      0 BMRU
> > > > > lo     16436 0      30658      0      0      0    30658      0      0      0 LRU
> > > > >
> > > > > Good or not?
> > > > >
> > > > > It's 20:24... 12 minutes later the migrate command is still
> > > > > in flight... I still don't have control of the cluster (no
> > > > > top, ps, w or ssh).
> > > > >
> > > > > Be patient... it comes. 16 minutes later I have control at
> > > > > the keyboard; top succeeds and displays. It is interesting:
> > > > > symphonie hasn't migrated to node 4; at the end of these 16
> > > > > minutes it is still on node 1.
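One limitation of `netstat -i` here is that it counts packets, not
bytes. On a 2.4 kernel the byte counters live in /proc/net/dev, which
makes it easier to see how much data a migration attempt actually
pushed through. A sketch (eth0 and the 60 s window are assumptions;
since the freeze can take a node's shell with it, snapshot whichever
node still answers, as Maurice did with node 4):

    #!/bin/sh
    # Snapshot a node's eth0 byte counters before and after a forced
    # migration. In /proc/net/dev, after "eth0:" the 1st field is RX
    # bytes and the 9th is TX bytes.
    snap() {
        onnode "$1" cat /proc/net/dev |
        awk '/eth0:/ { sub(/.*:/, ""); print "rx="$1, "tx="$9 }'
    }
    echo "node 4 before: $(snap 4)"
    migrate 4 265684            # the forced migration under test
    sleep 60                    # arbitrary settling time
    echo "node 4 after:  $(snap 4)"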
> > > > > Look:
> > > > >
> > > > > onnode 1
> > > > > 1131650025 mig   :pid 265684(symphonie) -> node 4 mem 15064 my load 41 node4 load 26
> > > > > 1131651008 loadlb:pid 265684(symphonie) <- node 4 mem 13480 my load 0  node4 load 58
> > > > >
> > > > > onnode 4
> > > > > 1131650982 mig:  :pid 265684(symphonie) <- node 1 mem 63840 my load 0  node1 load 26
> > > > > 1131650985 loadbl:pid 265684(symphonie) -> node 1 mem 63860 my load 67 node1 load 25
> > > > >
> > > > > 1131650982-1131650025 = 957 s = 16 minutes to reach node 4.
> > > > >
> > > > > Then 3 s later (1131650985) it is loadbalanced again towards
> > > > > node 1, where it arrives 23 s later (the same as in the first
> > > > > test above): 1131651008-1131650985 = 23 s.
> > > > >
> > > > > To summarize my tests:
> > > > >
> > > > > i) From node 1 to nodes 2, 3, 4: it freezes for about 16
> > > > >    minutes, and the process doesn't stay on the node I
> > > > >    migrate it to.
> > > > > ii) Among nodes 2, 3, 4: it takes about 180 s.
> > > > > iii) From nodes 2, 3, 4 to node 1: the best time, 25 s.
> > > > >
> > > > > onnode 1 netstat -i
> > > > > Kernel Interface table
> > > > > Iface  MTU   Met     RX-OK RX-ERR RX-DRP RX-OVR     TX-OK TX-ERR TX-DRP TX-OVR Flg
> > > > > eth0    1500 0  233943933      0      0      0 228886098      0      0      0 BMRU
> > > > > eth1    1500 0    3478874      0      0      0   5859896      0      0      0 BMRU
> > > > > lo     16436 0      13882      0      0      0     13882      0      0      0 LRU
> > > > >
> > > > > I think you have all the tests and symptoms now.
> > > > >
> > > > > > How about if you had node1 and node2 directly connected
> > > > > > (without a switch) during these tests?
> > > > >
> > > > > Hmm... a friend of mine has the same Netgear 8-port gigabit
> > > > > switch, with openSSI, and my process takes 5 s on his cluster
> > > > > (on FC2). I bought a new switch last week (the same Netgear
> > > > > model, but new): it is the same!
> > > > >
> > > > > But... I will try tomorrow with a crossover cable directly
> > > > > between nodes 1 and 4. I will also send you the result of kdb
> > > > > when done...
> > > > >
> > > > > And what about the sk98lin driver for my NIC, a 3Com 3C2000T?
> > > > >
> > > > > Many thanks
> > > > >
> > > > > ML
> > > > >
> > > > > PS: and... if we solve this problem (I fear it will be a
> > > > > silly thing when we find it), please give me your postal
> > > > > addresses. I will be pleased to send you a postcard from
> > > > > Marseille as thanks... and if you like French red wine, a
> > > > > bottle of good wine?
> > > > >
> > > > > > Roger
> > > > > >
> > > > > > On 11/8/05, Maurice Libes <Mau...@co...> wrote:
> > > > > > > Roger Tsang wrote:
> > > > > > > > Hi Maurice,
> > > > > > > >
> > > > > > > > Have you tried monitoring the network?
> > > > > > > >
> > > > > > > > There is one thing you can do to get more information.
> > > > > > > > If you have a serial console, you can enable kdb (boot
> > > > > > > > with kdb=on) or `echo 1 > /proc/sys/kernel/kdb`, place
> > > > > > > > all nodes into kdb (Ctrl-a-a on the console) when this
> > > > > > > > happens, and send the developers (Laura or me) your
> > > > > > > > "bta A" dump - preferably bzip'ed. Then we can tell you
> > > > > > > > exactly what froze.
> > > > > > > >
> > > > > > > > Roger
> > > > > > >
> > > > > > > OK, thanks for your help. I will try it and send the
> > > > > > > information you need back to you (not before Thursday).
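For anyone following along, Roger's kdb recipe above amounts to the
following on each node (a sketch; it assumes the OpenSSI kernel was
built with kdb support, which is exactly the "is the kernel prepared
for debugging" question asked earlier, and that a serial console is
attached and its output is being captured):

    # enable kdb, either at boot time by appending "kdb=on" to the
    # kernel command line, or at runtime:
    echo 1 > /proc/sys/kernel/kdb

    # when the freeze occurs, break into the debugger from the serial
    # console with Ctrl-a-a, then dump backtraces of all processes:
    #   kdb> bta A
    # and resume the kernel afterwards:
    #   kdb> go

    # compress the captured serial log before mailing it in:
    bzip2 node1-kdb.log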
> > > > > > > Concerning the monitoring of the network, I have used
> > > > > > > nttcp on Debian... you will find enclosed the results of
> > > > > > > the test between nodes 1 and 2 (diag.txt).
> > > > > > >
> > > > > > > John Hughes said they seem normal... about 600 Mb/s in
> > > > > > > both directions.
> > > > > > >
> > > > > > > I can also send the logs from /proc/cluster/loadlog from
> > > > > > > each node. We see the processes being loadbalanced; maybe
> > > > > > > it is of some interest to you.
> > > > > > >
> > > > > > > To summarize my problem, it seems to me that:
> > > > > > >
> > > > > > > 1. Migration time is very long (often about 180 seconds)
> > > > > > > when migrating from node 1 to 2, 3, 4 (or among 2, 3, 4).
> > > > > > > That's long, but it succeeds.
> > > > > > >
> > > > > > > 2. But when migrating from node 1 to nodes 2, 3 or 4, the
> > > > > > > system is often blocked (I can't type commands from the
> > > > > > > procps package: top, ps, w...) and I must reboot the
> > > > > > > destination node in order to return to nominal conditions.
> > > > > > >
> > > > > > > Look at an example right now, if I force the migration
> > > > > > > (pid 198028 is on node 2):
> > > > > > >
> > > > > > > root@comclust5:~# migrate 1 198028
> > > > > > >
> > > > > > > onnode 2 cat /proc/cluster/loadlog | grep symph
> > > > > > > 1131470437 mig   :pid 198028(symphonie) -> node 1 mem 53292 my load 96 node1 load 41
> > > > > > >
> > > > > > > root@comclust5:~# top
> > > > > > > ==> nothing (for 10 minutes)
> > > > > > >
> > > > > > > root@comclust4:~# onnode 1 cat /proc/cluster/loadlog | grep symphoni
> > > > > > > 1131470462 mig:  :pid 198028(symphonie) <- node 2 mem 13368 my load 25 node2 load 31
> > > > > > > (seems to have reached node 1, 25 s later; that's good)
> > > > > > >
> > > > > > > 1131470463 loadbl:pid 198028(symphonie) -> node 2 mem 12948 my load 53 node2 load 56
> > > > > > > (then it leaves node 1 immediately due to loadbalancing;
> > > > > > > there's something running on node 1)
> > > > > > >
> > > > > > > During this time the top command is frozen...
> > > > > > >
> > > > > > > ... 18 minutes later (1131471541), `top` displays on the
> > > > > > > screen. I find the process on node 2 again; I never saw
> > > > > > > it on node 1.
> > > > > > >
> > > > > > > onnode 2 cat /proc/cluster/loadlog | grep symph
> > > > > > > 1131470437 mig   :pid 198028(symphonie) -> node 1 mem 53292 my load 96 node1 load 41
> > > > > > > 1131471541 loadlb:pid 198028(symphonie) <- node 1 mem 53092 my load 1  node1 load 81
> > > > > > >
> > > > > > > If I understand the path of this process, the last logs
> > > > > > > show it leaving node 1 at 1131470463 and reaching node 2
> > > > > > > at 1131471541 (1078 s = 18 minutes).
> > > > > > >
> > > > > > > ========== other log: migration from node 2 to 4 ==========
> > > > > > >
> > > > > > > 1131472179 mig   :pid 198028(symphonie) -> node 4 mem 51940  my load 96 node4 load 26
> > > > > > > 1131472332 mig:  :pid 198028(symphonie) <- node 2 mem 227792 my load 0  node2 load 27
> > > > > > >
> > > > > > > Migration time: 1131472332-1131472179 = 153 s, about 3
> > > > > > > minutes.
> > > > > > >
> > > > > > > 180 s for a process of 325 MB on a gigabit private
> > > > > > > network running at about 600 Mbit/s:
> > > > > > >
> > > > > > > 198028  4 gatti 25 0 325m 323m 512 R 99.9 1752:12 symphonie
> > > > > > >
> > > > > > > I don't understand what's wrong. Maybe I have done
> > > > > > > something wrong? Or I have bad hardware (I plan to buy
> > > > > > > some better computers, and I will see the improvement)?
> > > > > > > A bad module? A bad driver? I don't know.
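Rather than reconstructing times from loadlog after the fact, a forced
migration can be timed directly with the `migrate` command and
/proc/<pid>/where, both used earlier in this thread. A sketch only: it
assumes `migrate` may block (hence the background job), the one-second
poll and 20-minute timeout are arbitrary, and if the freeze also
stalls /proc reads, the script will stall with it:

    #!/bin/sh
    # Usage: timemig.sh <target-node> <pid>   (hypothetical helper)
    target=$1 pid=$2
    start=$(date +%s)
    migrate "$target" "$pid" &
    while [ "$(cat /proc/$pid/where 2>/dev/null)" != "$target" ]; do
        sleep 1
        [ $(( $(date +%s) - start )) -gt 1200 ] && {
            echo "gave up after 20 minutes"; exit 1; }
    done
    echo "pid $pid reached node $target after $(( $(date +%s) - start )) s"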
> > > > > > > Nota: a friend of mine has an openSSI cluster (on FC3,
> > > > > > > with better computers, P4 at 3.2 GHz, and the same 1 Gb
> > > > > > > switch) and doesn't have this problem! (With the same
> > > > > > > computing process, "symphonie".)
> > > > > > >
> > > > > > > ML
> > > > > > >
> > > > > > > > On 11/8/05, Maurice Libes <Mau...@co...> wrote:
> > > > > > > > > Mulyadi Santosa wrote:
> > > > > > > > > > Dear Maurice
> > > > > > > > > >
> > > > > > > > > > > thanks for your help and analysis..
> > > > > > > > > > > i really don't know why there is such a long time
> > > > > > > > > > > for the process migration between some of my
> > > > > > > > > > > nodes (from node 1 towards nodes 2 3 4)
> > > > > > > > > >
> > > > > > > > > > Previously, you said you were running a big
> > > > > > > > > > application; how "big" is it? Can you tell us the
> > > > > > > > > > size of the application, and how big the virtual
> > > > > > > > > > size is (+ dynamic libraries + heap)? You can use
> > > > > > > > > > "pmap" to see it.
> > > > > > > > >
> > > > > > > > > Here are two of these computing processes (symphonie,
> > > > > > > > > bio_mars.exe) and their occupied RAM... 350 MB:
> > > > > > > > >
> > > > > > > > > Tasks: 130 total, 3 running, 127 sleeping, 0 stopped, 0 zombie
> > > > > > > > >
> > > > > > > > >    PID NODE USER  PR NI VIRT RES  SHR S %CPU    TIME+ COMMAND
> > > > > > > > > 198028    3 gatti 25  0 328m 321m 13m R 99.4  1302:32 symphonie
> > > > > > > > > 264695    1 faure 25  0 353m 251m 23m R 98.7  1512:52 bio_mars.exe
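Mulyadi's pmap suggestion, applied concretely (a sketch; pid 198028 is
the symphonie instance from the listing above, and pmap ships in
Debian's procps package):

    # per-mapping breakdown (binary, shared libraries, heap, stack);
    # the last line is the total virtual size
    pmap 198028 | tail -1

    # or the same totals straight from /proc, with no extra tools;
    # /proc/<pid>/statm reports sizes in 4 KB pages
    awk '{ printf "VmSize: %.0f MB  resident: %.0f MB\n",
           $1 * 4096 / 1048576, $2 * 4096 / 1048576 }' /proc/198028/statm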
> > > > > > > > > > > (node 1 is a recent machine, a Dell Precision at 2.8 GHz with 1 GB RAM)
> > > > > > > > > > > (nodes 2 3 4 are old Dell PIII at 1 GHz with 512 MB RAM)
> > > > > > > > > >
> > > > > > > > > > On node 1 itself, when you run the process alone
> > > > > > > > > > (well, along with the necessary daemons and kernel
> > > > > > > > > > threads, of course), how much RAM do you use? If
> > > > > > > > > > the application swaps, how much swap space does it
> > > > > > > > > > use?
> > > > > > > > >
> > > > > > > > > Here is the "free" command output on node 1:
> > > > > > > > >
> > > > > > > > > root@comclust5:~# free
> > > > > > > > >              total    used    free  shared buffers  cached
> > > > > > > > > Mem:       1025816 1012828   12988       0    5660  400496
> > > > > > > > > -/+ buffers/cache:  606672  419144
> > > > > > > > > Swap:      4096564  184556 3912008
> > > > > > > > >
> > > > > > > > > I created a swap space on each node, but swap is not
> > > > > > > > > used much on any of them:
> > > > > > > > >
> > > > > > > > > root@comclust5:~# onnode 2 free
> > > > > > > > >              total    used    free  shared buffers  cached
> > > > > > > > > Mem:        509788  296508  213280       0     160   47588
> > > > > > > > > -/+ buffers/cache:  248760  261028
> > > > > > > > > Swap:       522104    1168  520936
> > > > > > > > >
> > > > > > > > > root@comclust5:~# onnode 3 free
> > > > > > > > >              total    used    free  shared buffers  cached
> > > > > > > > > Mem:        509788  504120    5668       0     160  108696
> > > > > > > > > -/+ buffers/cache:  395264  114524
> > > > > > > > > Swap:      1469908   19204 1450704
> > > > > > > > >
> > > > > > > > > root@comclust5:~# onnode 4 free
> > > > > > > > >              total    used    free  shared buffers  cached
> > > > > > > > > Mem:       1026356  434208  592148       0     160  231624
> > > > > > > > > -/+ buffers/cache:  202424  823932
> > > > > > > > > Swap:      1469908       0 1469908
> > > > > > > > >
> > > > > > > > > > > note that there's no problem with a little
> > > > > > > > > > > benchmark: a big loop with awk, or plenty of
> > > > > > > > > > > mp32ogg processes
> > > > > > > > > >
> > > > > > > > > > OK, here comes my prediction. Page migration is
> > > > > > > > > > still on the way, but since you said it is "big",
> > > > > > > > > > the pages are still "in flight". Note that when
> > > > > > > > > > pages arrive, they still need to be allocated first
> > > > > > > > > > (possibly in blocking style, as alloc_pages()
> > > > > > > > > > usually does).
> > > > > > > > > >
> > > > > > > > > > Maurice, maybe you can compare it to your
> > > > > > > > > > experience with oM? Now you will get a clearer
> > > > > > > > > > picture of the difference between these two (oM and
> > > > > > > > > > openSSI). Since openSSI implements full process
> > > > > > > > > > image migration, be prepared to watch a longer
> > > > > > > > > > interval during process migration. The term
> > > > > > > > > > "longer" here is relative; it could be a bit, or
> > > > > > > > > > waaayyyy longer.
> > > > > > > > >
> > > > > > > > > Yes, I noticed that in normal conditions the process
> > > > > > > > > migration time was longer than in oM... but in my
> > > > > > > > > case one cannot say it is long or longer: it simply
> > > > > > > > > freezes all subsequent commands (no more top, ps, w
> > > > > > > > > for 10, 20, 30 minutes)... it is not "in flight" ;-)
> > > > > > > > >
> > > > > > > > > I can now reproduce my problem, but I still don't
> > > > > > > > > know how to solve it:
> > > > > > > > >
> > > > > > > > > i) When my big processes migrate from nodes 2, 3, 4
> > > > > > > > > to node 1 (the init node), there is no problem.
> > > > > > > > > ii) When processes migrate among nodes 2, 3, 4, there
> > > > > > > > > is no problem.
> > > > > > > > >
> > > > > > > > > But,
> > > > > > > > > iii) when one of these processes migrates from node 1
> > > > > > > > > towards nodes 2, 3, 4, the problem occurs: the
> > > > > > > > > process seems to leave node 1 (I see an entry in
> > > > > > > > > /proc/cluster/loadlog) but never reaches the
> > > > > > > > > destination node (no entry in the destination node's
> > > > > > > > > /proc/cluster/loadlog). I must reboot the destination
> > > > > > > > > node to get back to nominal conditions (the process
> > > > > > > > > comes back or stays on node 1).
> > > > > > > > >
> > > > > > > > > Maybe a network problem?
> > > > > > > > > But why? My 5 NIC cards are new (3Com 3C2000 gigabit,
> > > > > > > > > with sk98lin drivers on Debian), as are my Netgear
> > > > > > > > > 8-port gigabit switches.
> > > > > > > > >
> > > > > > > > > Any ideas?
> > > > > > > > >
> > > > > > > > > ML
> > > > > > > > >
> > > > > > > > > > My suggestion for the openSSI developers is to
> > > > > > > > > > implement something like differential page
> > > > > > > > > > migration based on remote demand paging: a page is
> > > > > > > > > > migrated on demand, and only those pages which were
> > > > > > > > > > recently dirtied. I got lost when tracing the
> > > > > > > > > > internal openSSI code handling this stuff, so any
> > > > > > > > > > hints are welcome here.
> > > > > > > > > >
> > > > > > > > > > regards
> > > > > > > > > >
> > > > > > > > > > Mulyadi
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Maurice Libes
> > > > > > > > > Tel : +33 (04) 91 82 93 25   Centre d'Oceanologie de Marseille
> > > > > > > > > Fax : +33 (04) 91 82 65 48   UMS2196 CNRS - Campus de Luminy, Case 901
> > > > > > > > > mailto:mau...@co...          F-13288 Marseille cedex 9
> > > > > > > > > Annuaire : http://annuaire.univ-aix.fr/showuser.php?uid=libes
> > > > > > >
> > > > > > > --
> > > > > > > Maurice Libes
> > > > > > > Tel : +33 (04) 91 82 93 25   Centre d'Oceanologie de Marseille
> > > > > > > Fax : +33 (04) 91 82 65 48   UMS2196 CNRS - Campus de Luminy, Case 901
> > > > > > > mailto:mau...@co...          F-13288 Marseille cedex 9
> > > > > > > Annuaire : http://annuaire.univ-aix.fr/showuser.php?uid=libes
> > > > > > > ttcp between node 2 (comclust2) and node 1 (comclust5), the init node
> > > > > > > UDP protocol
> > > > > > >
> > > > > > > root@comclust2:~# nttcp -T -n 819200 -u comclust5
> > > > > > >        Bytes Real s CPU s Real-MBit/s CPU-MBit/s  Calls Real-C/s  CPU-C/s
> > > > > > > l-939524096   37.72 18.43    711.7045  1456.5136 819203 21719.58  44449.4
> > > > > > > 1-1158230016  37.72  6.50    665.2546  3860.5997 765806 20301.99 117816.3
> > > > > > >
> > > > > > > root@comclust2:~# nttcp -T -n 819200 -u -r comclust5
> > > > > > >        Bytes Real s CPU s Real-MBit/s CPU-MBit/s  Calls Real-C/s  CPU-C/s
> > > > > > > l-941424640   56.87 23.64    471.7538  1134.8706 818737 14396.80  34633.5
> > > > > > > 1-939524096   56.87 25.45    472.0197  1054.7562 819203 14404.95  32188.7
> > > > > > >
> > > > > > > ttcp analysis between node 2 and node 1 (init node)
> > > > > > > **TCP protocol**
> > > > > > >
> > > > > > > root@comclust2:~# nttcp -T -n 819200 comclust5
> > > > > > >        Bytes Real s CPU s Real-MBit/s CPU-MBit/s   Calls Real-C/s CPU-C/s
> > > > > > > l-939524096   42.73 18.72    628.1504  1433.9501  819200 19169.63 43760.7
> > > > > > > 1-939524096   42.74 37.26    628.1237   720.4387 1540193 36039.64 41336.4
> > > > > > >
> > > > > > > root@comclust2:~# nttcp -T -n 819200 -r comclust5
> > > > > > >        Bytes Real s CPU s Real-MBit/s CPU-MBit/s   Calls Real-C/s CPU-C/s
> > > > > > > l-939524096   47.05 43.27    570.5730   620.3731 1099498 23370.38 25410.2
> > > > > > > 1-939524096   47.05  9.05    570.5689  2966.1376  819200 17412.38 90519.3
> > > > > > >
> > > > > > > ============================================================
> > > > > > > nttcp server on comclust4
> > > > > > > test from init node comclust5
> > > > > > >
> > > > > > > root@comclust5:~# nttcp -T -n 819200 comclust4
> > > > > > >        Bytes Real s CPU s Real-MBit/s CPU-MBit/s   Calls Real-C/s CPU-C/s
> > > > > > > l-939524096   47.55 10.45    564.4774  2568.7603  819200 17226.48 78392.3
> > > > > > > 1-939524096   47.69 42.68    562.8214   628.9491 1158740 24294.99 27149.5
> > > > > > >
> > > > > > > root@comclust5:~# nttcp -T -n 819200 -r comclust4
> > > > > > >        Bytes Real s CPU s Real-MBit/s CPU-MBit/s   Calls Real-C/s CPU-C/s
> > > > > > > l-939524096   43.17 36.08    621.7619   744.0007 1558115 36089.74 43185.0
> > > > > > > 1-939524096   43.17 19.99    621.7936  1342.8487  819200 18975.63 40980.5
> > > > > > >
> > > > > > > in UDP
> > > > > > >
> > > > > > > root@comclust5:~# nttcp -T -n 819200 -u comclust4
> > > > > > >        Bytes Real s CPU s Real-MBit/s CPU-MBit/s  Calls Real-C/s  CPU-C/s
> > > > > > > l-939524096   45.14 22.94    594.7364  1170.1633 819203 18149.98  35710.7
> > > > > > > 1-939913216   45.13 29.90    594.6789   897.6733 819106 18148.18  27394.8
> > > > > > >
> > > > > > > root@comclust5:~# nttcp -T -n 819200 -u -r comclust4
> > > > > > >        Bytes Real s CPU s Real-MBit/s CPU-MBit/s  Calls Real-C/s  CPU-C/s
> > > > > > > l-1242050560  37.72  6.82    647.4045  3581.1340 745342 19757.24 109287.7
> > > > > > > 1-939524096   37.72 17.88    711.6422  1501.3169 819203 21717.67  45816.7
>
> --
> Maurice Libes
> Tel : +33 (04) 91 82 93 25   Centre d'Oceanologie de Marseille
> Fax : +33 (04) 91 82 65 48   UMS2196 CNRS - Campus de Luminy, Case 901
> mailto:mau...@co...          F-13288 Marseille cedex 9
> Annuaire : http://annuaire.univ-aix.fr/showuser.php?uid=libes