From: rae l <cr...@gm...> - 2008-08-29 03:41:36
|
A simple nullio configuration:

Target iqn.2001-04.com.example:storage.disk2.sys1.xyzz
    Lun 0 Type=nullio

A Windows client initiator logs on to this target and uses Iometer to
test it. While Iometer is reading, this CPU load is observed on the
target server:

gektop@tux ~ $ dstat -M cpu,net 5
----total-cpu-usage---- -net/total-
usr sys idl wai hiq siq| recv  send
 30   4  66   0   0   0|   0     0
 23  41   0   0  17  19| 246k   10M
 22  36   0   0  21  21| 264k   11M
 24  38   0   0  17  21| 261k   11M
 24  36   0   0  21  19| 268k   11M

On the desktop computer the only NIC is a 100Mb network adapter, so the
network is near its limit. But the problem is: why does it consume so
much CPU?

On other server hardware with the same ietd.conf, a dual-core CPU and a
1000Mb NIC, it was observed that "sys,hiq,siq" consumes all the CPU
resources, too.

With SystemTap debugging, I found that sendpage (really tcp_sendpage)
would very likely return -EAGAIN; I think this loop is what consumes so
much CPU.

I have done a simple test: sendpage without MSG_DONTWAIT. In this
situation the target consumed very low CPU, on average under 1%, with
the same nullio performance, 110MB/s, nearly the 1000Mb limit.

I'm not very clear about the meaning of tcp_sendpage with or without
MSG_DONTWAIT, so please review this patch:

Index: iscsitarget-r168/kernel/nthread.c
===================================================================
--- iscsitarget-r168.orig/kernel/nthread.c
+++ iscsitarget-r168/kernel/nthread.c
@@ -294,7 +294,7 @@ static int write_data(struct iscsi_conn
 	struct iovec *iop;
 	int saved_size, size, sendsize;
 	int offset, idx;
-	int flags, res;
+	int res;
 
 	file = conn->file;
 	saved_size = size = conn->write_size;
@@ -351,12 +351,11 @@ static int write_data(struct iscsi_conn
 
 	sock = conn->sock;
 	sendpage = sock->ops->sendpage ? : sock_no_sendpage;
-	flags = MSG_DONTWAIT;
 
 	while (1) {
 		sendsize = PAGE_CACHE_SIZE - offset;
 		if (size <= sendsize) {
-			res = sendpage(sock, tio->pvec[idx], offset, size, flags);
+			res = sendpage(sock, tio->pvec[idx], offset, size, 0);
 			dprintk(D_DATA, "%s %#Lx:%u: %d(%lu,%u,%u)\n",
 				sock->ops->sendpage ? "sendpage" : "writepage",
 				(unsigned long long ) conn->session->sid, conn->cid,
@@ -377,7 +376,7 @@ static int write_data(struct iscsi_conn
 			continue;
 		}
 
-		res = sendpage(sock, tio->pvec[idx], offset, sendsize, flags | MSG_MORE);
+		res = sendpage(sock, tio->pvec[idx], offset, sendsize, MSG_MORE);
 		dprintk(D_DATA, "%s %#Lx:%u: %d(%lu,%u,%u)\n",
 			sock->ops->sendpage ? "sendpage" : "writepage",
 			(unsigned long long ) conn->session->sid, conn->cid,

Thanks.

-- 
Denis Cheng
Linux Application Developer

"One of my most productive days was throwing away 1000 lines of code."
- Ken Thompson.
From: Arne R. <ag...@po...> - 2008-08-29 15:55:45
|
On Friday, 29.08.2008 at 11:41 +0800, rae l wrote:
> With systemtap debugging, I found the sendpage (really tcp_sendpage)
> would very likely return -EAGAIN, I think this loop consumes so high
> CPU,
>
> I'm not very clear about the meangings of tcp_sendpage with or without
> MSG_DONTWAIT, so please review this patch:

Can you give more details about your H/W, in particular how much RAM the
target has and whether you're running x86 or x86_64 (kernel)?

As you observed, MSG_DONTWAIT will lead to tcp_sendpage() returning
errors if it cannot get hold of memory for data transmission. Removing
this flag will make tcp_sendpage() try a bit harder.

I'll take a closer look at the implications of your patch, but I'd also
like to get some more data points before making such modifications -
anyone else willing to repeat the above tests?

Thanks,
Arne
From: Arne R. <ag...@po...> - 2008-08-29 16:17:58
|
On Friday, 29.08.2008 at 17:55 +0200, Arne Redlich wrote:
> As you observed, MSG_DONTWAIT will lead to tcp_sendpage() returning
> errors if it cannot get hold of memory for data transmission. Removing
> this flag will make tcp_sendpage() try a bit harder.

... at the expense of taking longer (sleeping), I forgot to add.

You might also want to play with your tcp wmem settings and see if that
improves the situation.

Arne
From: rae l <cr...@gm...> - 2008-08-29 18:46:10
|
On Fri, Aug 29, 2008 at 11:55 PM, Arne Redlich <ag...@po...> wrote:
> Can you give more details about your H/W, in particular how much RAM the
> target has and whether you're running x86 or x86_64 (kernel)?

This "read iSCSI consumes high CPU" phenomenon is always reproducible in
our laboratory on several types of HW and SW:
1. x86 kernel 2.6.22.16 with iscsitarget-0.4.16, 2GB RAM;
2. x86_64 kernel 2.6.26.3 with iscsitarget-svn-r168, 4GB RAM;
3. desktop computer with x86_64 2.6.26.3, iscsitarget-r168, 1GB RAM.

On Sat, Aug 30, 2008 at 12:18 AM, Arne Redlich <ag...@po...> wrote:
> ... at the expense of taking longer (sleeping), I forgot to add.
>
> You might also want to play with your tcp wmem settings and see if that
> improves the situation.

TCP mem is set to the recommended values from the init script
(/etc/init.d/ietd):

# sysctl -a | grep 'net.*mem'
net.ipv4.tcp_mem = 1048576 1048576 1048576
net.ipv4.tcp_wmem = 1048576 1048576 2056192
net.ipv4.tcp_rmem = 1048576 1048576 2056192
net.core.wmem_max = 1048576
net.core.rmem_max = 1048576
net.core.wmem_default = 1048576
net.core.rmem_default = 1048576

All 1048576. We have also tried 2MB or more, but that does not seem to
help.

> I'll take a closer look at the implications of your patch, but I'd also
> like to get some more data points before making such modifications -
> anyone else willing to repeat the above tests?

This patch is very simple: just drop MSG_DONTWAIT when calling
tcp_sendpage; the flags variable is then unused, so it is removed, too.
From: Ross S. W. W. <RW...@me...> - 2008-08-29 18:57:22
|
rae l wrote:
> This "read iSCSI consumes high CPU" phenomenon is always reproducible
> in our laboratory on several types of HW and SW:
> 1. x86 kernel 2.6.22.16 with iscsitarget-0.4.16, 2GB RAM;
> 2. x86_64 kernel 2.6.26.3 with iscsitarget-svn-r168, 4GB RAM;
> 3. desktop computer with x86_64 2.6.26.3, iscsitarget-r168, 1GB RAM.

Which distro?

Are these hosts only running IET at the time?

What is the config of the nullio luns?

> TCP mem is set to the recommended values from the init script
> (/etc/init.d/ietd). All 1048576. We have also tried 2MB or more, but
> that does not seem to help.

We actually don't suggest pre-setting these at all any more. The 2.6
kernel's TCP stack has undergone major changes since these values were
first proposed, and now we suggest you let your stack self-tune to the
proper running values. So remove the sysctl settings from the init
scripts and let us know if anything changes.

> This patch is very simple: just drop MSG_DONTWAIT when calling
> tcp_sendpage; the flags variable is then unused, so it is removed, too.

I haven't seen this happen with nullio on the RHEL 2.6.18 kernels, but I
can try a newer kernel over the weekend and see if I can reproduce it on
FC8 or FC9.

-Ross

______________________________________________________________________
This e-mail, and any attachments thereto, is intended only for use by
the addressee(s) named herein and may contain legally privileged and/or
confidential information. If you are not the intended recipient of this
e-mail, you are hereby notified that any dissemination, distribution or
copying of this e-mail, and any attachments thereto, is strictly
prohibited. If you have received this e-mail in error, please
immediately notify the sender and permanently delete the original and
any copy or printout thereof.
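Ross's suggestion amounts to restoring the kernel's autotuning
behaviour. A hedged sketch of what that could look like (the exact
min/default/max values are assumptions and vary per kernel version; run
as root):

```shell
# Re-enable receive-buffer moderation and typical autotuning ranges
# instead of pinning every buffer to 1048576.
sysctl -w net.ipv4.tcp_moderate_rcvbuf=1
sysctl -w net.ipv4.tcp_wmem="4096 16384 4194304"
sysctl -w net.ipv4.tcp_rmem="4096 87380 4194304"

# Verify what the stack is actually using:
sysctl net.ipv4.tcp_wmem net.ipv4.tcp_rmem
```

Note that net.ipv4.tcp_wmem/tcp_rmem take three values (min, default,
max); pinning all three to the same large number, as in the ietd init
script, disables autotuning entirely.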
From: rae l <cr...@gm...> - 2008-08-31 14:44:51
|
On Sat, Aug 30, 2008 at 2:57 AM, Ross S. W. Walker <RW...@me...> wrote:
> Which distro?

Distro is not important here; in fact several distros were tested:
1. Gentoo 2008.0 x64
2. CentOS-5.1 x64
3. Our production server, a completely self-built system from scratch
   (LFS-like).

We tested 2.6.22-2.6.26, x86-32 and x86-64, not the default distro
kernels.

> Are these hosts only running IET at the time?

Yes. When we were testing iSCSI functionality, only the ietd server was
running on the server (plus sshd and so on, which proved to be no
problem).

> What is the config of the nullio luns?

As in the first mail, a simple nullio configuration:

Target iqn.2001-04.com.example:storage.disk2.sys1.xyzz
    Lun 0 Type=nullio

How the whole thing originated:
1. create a software RAID5 (md) with 11 SATA drives; the bandwidth is
   400+ MB/s;
2. create LVM on the RAID5;
3. create an iSCSI target with Lun 0 Type=fileio or blockio,
   Path=/dev/vg1/lv1;
4. an iSCSI initiator logs on to this target and reads its content.

Then 100% CPU consumption is observed on the target server. If the
client stops reading, the target server's load falls to 99% idle
immediately. So the problem must be in ietd and its kernel module,
right?

> We actually don't suggest pre-setting these at all any more. The 2.6
> kernel's TCP stack has undergone major changes since these values
> were first proposed, and now we suggest you let your stack self-tune
> to the proper running values. So remove the sysctl settings from the
> init scripts and let us know if anything changes.

This was also tested, but it doesn't help; we have tried all kinds of
tcp mem configurations, including no tcp mem sysctl configuration at
all.

> I haven't seen this happen with nullio on the RHEL 2.6.18 kernels, but
> I can try a newer kernel over the weekend and see if I can reproduce
> it on FC8 or FC9.

On Sat, Aug 30, 2008 at 3:03 AM, Arne Redlich <ag...@po...> wrote:
> It _looks_ simple, but the implications of it aren't: The function is
> called from the network thread (there is one per target), so if
> tcp_sendpage sends the thread to sleep because a connection's socket
> has no buffers, all other connections to this target also have to
> wait. With MSG_DONTWAIT set, another connection could be served
> meanwhile.
>
> So this needs to be addressed differently.

You mean there will be problems with multiple clients if MSG_DONTWAIT
is dropped? I'll test that.

> HTH,
> Arne

-- 
Denis ChengRq
Linux Application Developer

"One of my most productive days was throwing away 1000 lines of code."
- Ken Thompson.
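Step 3 above corresponds to an ietd.conf entry along these lines (a
sketch; the target name is reused from the nullio example and the LV
path from step 3):

```
Target iqn.2001-04.com.example:storage.disk2.sys1.xyzz
    Lun 0 Type=blockio,Path=/dev/vg1/lv1
```

Swapping Type=blockio for Type=fileio (or Type=nullio without a Path)
changes only the backing store; the network send path in nthread.c is
the same in all three cases, which is why nullio isolates the problem.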
From: Arne R. <ag...@po...> - 2008-09-01 05:44:15
|
On Sunday, 31.08.2008 at 22:45 +0800, rae l wrote:
> How the whole thing originated:
> 1. create a software RAID5 (md) with 11 SATA drives; the bandwidth is
>    400+ MB/s;

That's a bit oversized, no? Unless you have a very specific application
or a very small chunk size, you're very unlikely to write full stripes
with that many disks, leading to read-modify-write degrading your
performance. And I'd also be a bit nervous having only a single
redundant disk at this size.

So you're also seeing the 100% CPU load with this setup? Strange. Did
you test your components individually, e.g. what happens if you perform
I/O locally on the DM/MD devices? Which results does netperf yield
between your target and initiator boxes?

> > I haven't seen this happen with nullio on the RHEL 2.6.18 kernels,
> > but I can try a newer kernel over the weekend and see if I can
> > reproduce it on FC8 or FC9.

Thanks Ross, I cannot test it myself at the moment. Please keep me
updated.

> You mean there will be problems with multiple clients if MSG_DONTWAIT
> is dropped?

Yes.

Cheers,
Arne
From: Arne R. <ag...@po...> - 2008-08-29 19:03:00
|
On Saturday, 30.08.2008 at 02:46 +0800, rae l wrote:
> This patch is very simple: just drop MSG_DONTWAIT when calling
> tcp_sendpage; the flags variable is then unused, so it is removed,
> too.

It _looks_ simple, but the implications of it aren't: the function is
called from the network thread (there is one per target), so if
tcp_sendpage sends the thread to sleep because a connection's socket has
no buffers, all other connections to this target also have to wait. With
MSG_DONTWAIT set, another connection could be served meanwhile.

So this needs to be addressed differently.

HTH,
Arne
From: Arne R. <ag...@po...> - 2008-09-01 05:55:31
|
On Monday, 01.09.2008 at 07:44 +0200, Arne Redlich wrote:
> > Distro is not important here; in fact several distros were tested:
> > 1. Gentoo 2008.0 x64
> > 2. CentOS-5.1 x64
> > 3. Our production server, a completely self-built system from
> >    scratch (LFS-like).
> >
> > We tested 2.6.22-2.6.26, x86-32 and x86-64, not the default distro
> > kernels.

Forgot to ask: what NICs are you using?

Thanks,
Arne
From: Ross S. W. W. <RW...@me...> - 2008-09-01 22:03:46
|
Arne Redlich wrote:
> Forgot to ask: what NICs are you using?

Yes, yes, I just remember reading on one of the lists - CentOS or Xen -
where a poster had high CPU, and even CPU usage when there was no
activity on some NIC; I can't remember which, I'll look it up. Anyway,
the answer was to upgrade the NIC driver from the stock kernel driver to
the manufacturer's posted driver.

If you can, try that on one of your distros just to see if that is the
fix, and let us know the make/model of the card.

nullio really only stresses the NIC during usage, so it makes sense
that that should be one area to concentrate on.

-Ross
From: Ross S. W. W. <RW...@me...> - 2008-09-01 22:12:14
|
Ross S. W. Walker wrote:
> Anyway, the answer was to upgrade the NIC driver from the stock kernel
> driver to the manufacturer's posted driver.

The thread I was thinking of:

http://lists.centos.org/pipermail/centos/2008-July/061249.html

-Ross
From: rae l <cr...@gm...> - 2008-09-02 01:57:57
|
On Mon, Sep 1, 2008 at 1:55 PM, Arne Redlich <ag...@po...> wrote:
> Forgot to ask: what NICs are you using?

Two types of NICs have been tested:

04:00.1 Ethernet controller: Intel Corporation 80003ES2LAN Gigabit Ethernet Controller (Copper) (rev 01)
	Subsystem: Inventec Corporation Device 0023
	Flags: bus master, fast devsel, latency 0, IRQ 381
	Memory at fcdc0000 (32-bit, non-prefetchable) [size=128K]
	I/O ports at ce80 [size=32]
	Capabilities: [c8] Power Management version 2
	Capabilities: [d0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+
	Capabilities: [e0] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting <?>
	Capabilities: [140] Device Serial Number 5c-fd-e7-ff-ff-d1-a0-00
	Kernel driver in use: e1000

02:09.0 Ethernet controller: Broadcom Corporation BCM4401-B0 100Base-TX (rev 02)
	Subsystem: Dell Device 01e5
	Flags: bus master, fast devsel, latency 64, IRQ 10
	Memory at dfcfe000 (32-bit, non-prefetchable) [size=8K]
	Capabilities: <access denied>
	Kernel driver in use: b44
	Kernel modules: b44

One gigabit and one 100-megabit; both show high CPU consumption during
iSCSI reads.

On Tue, Sep 2, 2008 at 6:03 AM, Ross S. W. Walker <RW...@me...> wrote:
> Yes, yes, I just remember reading on one of the lists - CentOS or Xen -
> where a poster had high CPU, and even CPU usage when there was no
> activity on some NIC; I can't remember which, I'll look it up. Anyway,
> the answer was to upgrade the NIC driver from the stock kernel driver
> to the manufacturer's posted driver.

But I don't think the NIC drivers have problems: all other network
applications (NFS, Samba, HTTP) perform well - 110MB/s on the gigabit
NIC and 12MB/s on the 100Mb NIC, both near the limit of the NIC.

On Tue, Sep 2, 2008 at 6:12 AM, Ross S. W. Walker <RW...@me...> wrote:
> The thread I was thinking of:
>
> http://lists.centos.org/pipermail/centos/2008-July/061249.html

The difference with that thread: in our scenario, if the iSCSI
initiators don't read, the target CPU falls to 99% idle immediately; if
the initiators begin to read, the target CPU climbs to 100%
(sys+hiq+siq), again immediately.
From: Ross S. W. W. <RW...@me...> - 2008-09-02 18:24:10
|
rae l wrote:
> One gigabit and one 100-megabit; both show high CPU consumption during
> iSCSI reads.

I haven't been able to reproduce this yet (haven't had access to the
test machine).

I can tell you though that for nullio targets all write data is
discarded and all read data is uninitialized memory, or basically
random data.

If you see high load on reading and not writing, I would check the
transmit path of the server. Comparing NFS/CIFS and iSCSI usage of the
network adapter isn't quite the same.

Can you give us a 'modinfo e1000' and a 'modinfo b44'?

Can you also run a 'vmstat 1' during the high CPU usage and send us a
screen's worth?

How are the initiators configured? What is the MRDSL set to?

-Ross
From: Ming Z. <bla...@gm...> - 2008-09-02 18:26:37
|
my 2c, a 5 second tcpdump from login to read might reveal a lot of things. ;)

On Tue, 2008-09-02 at 14:24 -0400, Ross S. W. Walker wrote:
> [full quote of Ross's message snipped]
|
From: Ross S. W. W. <RW...@me...> - 2008-09-02 19:02:47
|
Ming Zhang wrote:
>
> my 2c, a 5 second tcpdump from login to read might reveal a lot of
> things. ;)

Yes: specifically which iSCSI data-segment sizes were actually
negotiated, and the TCP MSS as well. If the negotiated MRDSL was too
small, say 256 bytes as another poster accidentally set, then the CPU
might be drowning in interrupt load, which a vmstat can also point out.

Ok, so starting with a system with no initiators connected, do a:

# tcpdump -c 750 -i <interface> -w iscsi.dmp tcp port 3260

Connect with 1 initiator, starting a read right away. Then send the
output to me, compressed if >100K, and I'll take a look at it.

-Ross
|
From: Arne R. <ag...@po...> - 2008-09-02 18:55:10
|
On Tuesday, 2008-09-02 at 14:26 -0400, Ming Zhang wrote:
> my 2c, a 5 second tcpdump from login to read might reveal a lot of
> things. ;)

Splendid idea, Ming :)

Denis, could you provide one?

Thanks,
Arne

> [remainder of the quoted thread snipped]
|
From: Ming Z. <bla...@gm...> - 2008-09-02 19:02:36
|
On Tue, 2008-09-02 at 20:55 +0200, Arne Redlich wrote:
> Splendid idea, Ming :)

Thanks.

Also, I know a lot of company technical-support groups have scripts
that can collect various information from servers without rounds and
rounds of emails. Can we have one? Including these could save a lot of
time, I believe:

uname -a
cat /proc/cpuinfo
cat /proc/meminfo
cat /proc/net/iet/*
lsmod
lspci
dmesg
lsscsi
ethtool ...
cat /etc/ietd.conf
...

> Denis, could you provide one?

> [remainder of the quoted thread snipped]
|
From: Ross S. W. W. <RW...@me...> - 2008-09-02 19:06:16
|
Ming Zhang wrote:
> also i know a lot of the company technical support group have scripts
> that can collect various information from servers without rounds and
> rounds of emails. can we have one?

Excellent! Can you throw one together for inclusion into the code?

Call it ietdiag or something of that ilk.

-Ross
|
From: Ming Z. <bla...@gm...> - 2008-09-02 19:11:34
|
On Tue, 2008-09-02 at 15:06 -0400, Ross S. W. Walker wrote:
> Excellent, can you throw one together for inclusion into the code?
>
> Call it ietdiag or something of that ilk.

Sorry, I don't think I have the time... I'm stuck in a boring meeting,
and replying to email is more fun...

http://solutions.qlogic.com/KanisaSupportSite/search.do?cmd=displayKC&docType=kc&externalId=10726&sliceId=&dialogID=17509767&stateId=0%200%2017497248

should be a good start for you.
|
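[A minimal sketch of the collector discussed above. The name "ietdiag" follows Ross's suggestion and the report format is invented here; the commands are a subset of Ming's list (glob entries such as `cat /proc/net/iet/*` would need `shell=True` and are left out). Nothing below is an existing IET tool.]

```python
"""Sketch of an "ietdiag"-style diagnostic collector.

Commands that are missing or hang are noted and skipped, so one bad
tool does not abort the whole report.
"""
import subprocess

COMMANDS = [
    "uname -a",
    "cat /proc/cpuinfo",
    "cat /proc/meminfo",
    "lsmod",
    "lspci",
    "dmesg",
    "cat /etc/ietd.conf",
]

def collect(commands=COMMANDS):
    """Run each command and return one concatenated report string."""
    sections = []
    for cmd in commands:
        try:
            proc = subprocess.run(cmd.split(), capture_output=True,
                                  text=True, timeout=30)
            body = proc.stdout or proc.stderr
        except (OSError, subprocess.TimeoutExpired):
            body = "(command unavailable)\n"
        sections.append("===== %s =====\n%s" % (cmd, body))
    return "\n".join(sections)

if __name__ == "__main__":
    print(collect())
```

Attaching the single resulting file to a bug report would replace most of the back-and-forth requests in this thread.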
From: rae l <cr...@gm...> - 2008-09-03 03:20:12
Attachments:
iscsi.dump
ietddiag.info
|
On Wed, Sep 3, 2008 at 3:02 AM, Ming Zhang <bla...@gm...> wrote:
> also i know a lot of the company technical support group have scripts
> that can collect various information from servers without rounds and
> rounds of emails. can we have one?
> [remainder snipped]
>
>> Denis, could you provide one?

Over the last several days I have done a lot more benchmarking, and on
a Dell PowerEdge 2950 the problem reproduced again. The attachment
iscsi.dump is the first 750 iSCSI packets, including the login phase;
ietddiag.info is the collected system information. While iSCSI is
reading, here's the dstat output:

root@uitnode1 ~/tmp/iet-r168 1 # dstat -M proc,cpu,mem,sys,net,disk,app -C 0,1 -D sda,sdb -N total,eth0,eth1,eth2 5
---procs--- -------cpu0-usage------ -------cpu1-usage------ ------memory-usage----- ---system-- -net/total- --net/eth0- --net/eth1- --dsk/sda-- --dsk/sdb-- --most-expensive--
run blk new|usr sys idl wai hiq siq:usr sys idl wai hiq siq|_used _buff _cach _free|_int_ _csw_|_recv _send:_recv _send:_recv _send|_read _writ:_read _writ|_____process______
  0   0   0|  0   0  99   0   0   0:  1   2  95   0   0   2| 261M  171M 1795M 1066M|1531  1574 |   0     0 :   0     0 :   0     0 |  15k   11k:  22B    0 |istd3           3
  3   0   0|  5   3  92   0   0   0:  0  46   0   0   2  52| 261M  171M 1795M 1066M|6988  4552 | 796k  116M: 796k  116M: 899B    0 |   0     0 :   0     0 |istd3         100
  2   0   0|  7   3  90   0   0   0:  0  51   0   0   1  49| 261M  171M 1795M 1066M|7024  4605 | 801k  116M: 801k  116M: 553B    0 |   0    33k:   0     0 |istd3         100
  2   0   0|  3   4  93   0   0   0:  0  48   0   0   1  51| 261M  171M 1795M 1066M|7035  4552 | 798k  116M: 797k  116M:1029B    0 |   0     0 :   0     0 |istd3         100
  2   0   0|  3   3  94   0   0   0:  0  47   0   0   2  51| 261M  171M 1795M 1066M|7012  4529 | 798k  116M: 796k  116M:1585B    0 |   0     0 :   0     0 |istd3         100
  1   0   0|  2   3  96   0   0   0:  0  48   0   0   1  51| 261M  171M 1795M 1066M|7030  4517 | 798k  116M: 797k  116M:1650B    0 |   0     0 :   0     0 |istd3         100
  2   0   0|  1   4  95   0   0   0:  0  48   0   0   1  52| 261M  171M 1795M 1066M|7021  4519 | 799k    - : 798k    - :1431B    0 |   0     0 :   0     0 |istd3         100
  2   0   0|  6   4  89   0   0   0:  0  48   0   0   0  52| 261M  171M 1795M 1066M|7023  4591 | 799k  116M: 798k  116M:1038B    0 |   0     0 :   0     0 |istd3         100
^C

cpu0 is mostly idle; cpu1 is constantly busy with sys+siq, and it is
the istd3 thread that pins cpu1 at 100%. |
From: rae l <cr...@gm...> - 2008-09-03 03:51:52
|
And here is some more dmesg output, captured with `insmod
kernel/iscsi_trgt.ko debug_enable_flags=8`. sendpage very frequently
returns -11 (-EAGAIN); I think this retry loop is what consumes so
much CPU. A representative excerpt (each burst of identical -11 lines
is condensed):

iscsi_trgt: write_data(318) 0x1000037010040:1: 48(48)
iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 4096(0,0,4096)
iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 3004(0,0,4096)
iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: -11(0,3004,1092)
    [... the same -11 line repeats dozens of times ...]
iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 1092(0,3004,1092)
iscsi_trgt: write_data(318) 0x1000037010040:1: 48(48)
iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 4096(0,0,4096)
iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 4096(4054807680,0,4096)
    [... several full 4096-byte sendpage calls succeed immediately ...]
iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 2164(0,0,4096)
iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: -11(0,2164,1932)
    [... the same -11 line repeats dozens of times ...]
iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 1932(0,2164,1932)
    [... more full-page sends succeed ...]
iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 1324(0,0,4096)
iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: -11(0,1324,2772)
    [... the same -11 line repeats dozens of times ...]
iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 2772(0,1324,2772)
    [... more full-page sends succeed ...]
iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 484(0,0,4096)
iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: -11(0,484,3612)
    [... the same -11 line repeats dozens of times ...]
iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 3612(0,484,3612)
    [... more full-page sends succeed ...]
iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 3740(4046466864,0,4096)
iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(4046466864,3740,356)
    [... the same -11 line repeats until the remaining 356 bytes fit ...]
write_data(384) sendpage 0x1000037010040:1: -11(4046466864,3740,356) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(4046466864,3740,356) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(4046466864,3740,356) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(4046466864,3740,356) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(4046466864,3740,356) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(4046466864,3740,356) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 356(4046466864,3740,356) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 4096(0,0,4096) iscsi_trgt: write_data(318) 0x1000037010040:1: 48(48) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 4096(0,0,4096) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 4096(4042311168,0,4096) iscsi_trgt: write_data(318) 0x1000037010040:1: 48(48) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 4096(0,0,4096) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 4096(3725065168,0,4096) iscsi_trgt: write_data(318) 0x1000037010040:1: 48(48) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 4096(0,0,4096) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 4096(0,0,4096) iscsi_trgt: write_data(318) 0x1000037010040:1: 48(48) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 2900(0,0,4096) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2900,1196) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2900,1196) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2900,1196) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2900,1196) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2900,1196) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2900,1196) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2900,1196) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2900,1196) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 
-11(0,2900,1196) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2900,1196) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2900,1196) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2900,1196) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2900,1196) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2900,1196) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2900,1196) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 1196(0,2900,1196) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 4096(0,0,4096) iscsi_trgt: write_data(318) 0x1000037010040:1: 48(48) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 4096(4047425328,0,4096) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 4096(3724873520,0,4096) iscsi_trgt: write_data(318) 0x1000037010040:1: 48(48) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 4096(0,0,4096) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 4096(4045610800,0,4096) iscsi_trgt: write_data(318) 0x1000037010040:1: 48(48) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 4096(0,0,4096) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 4096(4046139184,0,4096) iscsi_trgt: write_data(318) 0x1000037010040:1: 48(48) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 2060(0,0,4096) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2060,2036) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2060,2036) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2060,2036) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2060,2036) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2060,2036) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2060,2036) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2060,2036) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2060,2036) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 
-11(0,2060,2036) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2060,2036) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,2060,2036) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 2036(0,2060,2036) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 4096(4045766448,0,4096) iscsi_trgt: write_data(318) 0x1000037010040:1: 48(48) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 4096(3724906456,0,4096) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 4096(0,0,4096) iscsi_trgt: write_data(318) 0x1000037010040:1: 48(48) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 4096(0,0,4096) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 4096(0,0,4096) iscsi_trgt: write_data(318) 0x1000037010040:1: 48(48) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 4096(4054351664,0,4096) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 4096(0,0,4096) iscsi_trgt: write_data(318) 0x1000037010040:1: 48(48) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 1220(3724579536,0,4096) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(3724579536,1220,2876) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(3724579536,1220,2876) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(3724579536,1220,2876) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(3724579536,1220,2876) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(3724579536,1220,2876) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(3724579536,1220,2876) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(3724579536,1220,2876) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(3724579536,1220,2876) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(3724579536,1220,2876) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(3724579536,1220,2876) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(3724579536,1220,2876) iscsi_trgt: write_data(384) sendpage 
0x1000037010040:1: -11(3724579536,1220,2876) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(3724579536,1220,2876) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(3724579536,1220,2876) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(3724579536,1220,2876) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(3724579536,1220,2876) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(3724579536,1220,2876) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(3724579536,1220,2876) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(3724579536,1220,2876) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 2876(3724579536,1220,2876) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 4096(0,0,4096) iscsi_trgt: write_data(318) 0x1000037010040:1: 48(48) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 4096(4040776512,0,4096) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 4096(4053519240,0,4096) iscsi_trgt: write_data(318) 0x1000037010040:1: 48(48) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 4096(4040613680,0,4096) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 4096(4041436688,0,4096) iscsi_trgt: write_data(318) 0x1000037010040:1: 48(48) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 4096(4043788080,0,4096) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 4096(4057292592,0,4096) iscsi_trgt: write_data(318) 0x1000037010040:1: 48(48) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 380(0,0,4096) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,380,3716) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,380,3716) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,380,3716) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,380,3716) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,380,3716) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,380,3716) iscsi_trgt: write_data(384) 
sendpage 0x1000037010040:1: -11(0,380,3716) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,380,3716) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,380,3716) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,380,3716) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,380,3716) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,380,3716) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,380,3716) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,380,3716) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,380,3716) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,380,3716) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,380,3716) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,380,3716) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,380,3716) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,380,3716) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,380,3716) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: -11(0,380,3716) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 3716(0,380,3716) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 4096(0,0,4096) iscsi_trgt: write_data(318) 0x1000037010040:1: 48(48) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 4096(4053683936,0,4096) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 4096(0,0,4096) iscsi_trgt: write_data(318) 0x1000037010040:1: 48(48) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 4096(3724972896,0,4096) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 4096(4044635952,0,4096) iscsi_trgt: write_data(318) 0x1000037010040:1: 48(48) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 4096(4045478400,0,4096) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 3684(4044926768,0,4096) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: -11(4044926768,3684,412) iscsi_trgt: 
write_data(363) sendpage 0x1000037010040:1: -11(4044926768,3684,412) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: -11(4044926768,3684,412) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: -11(4044926768,3684,412) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: -11(4044926768,3684,412) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: -11(4044926768,3684,412) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: -11(4044926768,3684,412) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: -11(4044926768,3684,412) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: -11(4044926768,3684,412) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 412(4044926768,3684,412) iscsi_trgt: write_data(318) 0x1000037010040:1: 48(48) iscsi_trgt: write_data(384) sendpage 0x1000037010040:1: 4096(4039419152,0,4096) iscsi_trgt: write_data(363) sendpage 0x1000037010040:1: 4096(0,0,4096) -- Denis Cheng Linux Application Developer "One of my most productive days was throwing away 1000 lines of code." - Ken Thompson. |
From: Ross S. W. W. <RW...@me...> - 2008-09-03 13:57:52
|
rae l wrote:
> On Wed, Sep 3, 2008 at 3:02 AM, Ming Zhang <bla...@gm...> wrote:
> >
> > On Tue, 2008-09-02 at 20:55 +0200, Arne Redlich wrote:
> >> On Tuesday, 2008-09-02 at 14:26 -0400, Ming Zhang wrote:
> >> > my 2c, a 5 second tcpdump from login to read might reveal a lot of
> >> > things. ;)
> >>
> >> Splendid idea, Ming :)
> >
> > thanks.
> >
> > also i know a lot of company technical support groups have scripts
> > that can collect various information from servers without rounds and
> > rounds of emails. can we have one?
> >
> > including these can save a lot of time, i believe...
> >
> > uname -a
> > cat /proc/cpuinfo
> > cat /proc/meminfo
> > cat /proc/net/iet/*
> > lsmod
> > lspci
> > dmesg
> > lsscsi
> > ethtool ...
> > cat /etc/ietd.conf
> > ...
> >
> >> Denis, could you provide one?
>
> In recent days, I have done a lot more benchmarking.
>
> On a Dell PowerEdge 2950 the problem reproduced again; the attachment
> is the first 750 iSCSI packets including the logon phase, and
> ietddiag.info is a collection of diagnostic information.
>
> While iSCSI is reading, here's the dstat output:

Thanks for the diag info, I'm going to take a look at that, but can you
give us a 'vmstat 1' during the high CPU load against the nullio LUN so
we can get an idea of the interrupts being driven.

-Ross

______________________________________________________________________
This e-mail, and any attachments thereto, is intended only for use by the addressee(s) named herein and may contain legally privileged and/or confidential information. If you are not the intended recipient of this e-mail, you are hereby notified that any dissemination, distribution or copying of this e-mail, and any attachments thereto, is strictly prohibited. If you have received this e-mail in error, please immediately notify the sender and permanently delete the original and any copy or printout thereof. |
From: Ross S. W. W. <RW...@me...> - 2008-09-03 14:40:44
|
rae l wrote:
> On Wed, Sep 3, 2008 at 3:02 AM, Ming Zhang <bla...@gm...> wrote:
> [same quoted thread as in the previous message]

Rae,

My mistake on the tcpdump: I didn't tell you to use a long enough snap
length, so all the good iSCSI protocol information was truncated. I
naively thought a raw dump would capture the full packet, oh well.

Can you run:

# tcpdump -c 750 -s 1460 -w iscsi.dump -i <int> tcp port 3260

again, from zero initiators to one initiator doing a read? I was able
to determine that your MTU is 1500 and your MSS is 1460, so the snap
length above should be good. You will need to compress it this time,
though, so just gzip it.

-Ross |
From: Ross S. W. W. <RW...@me...> - 2008-09-03 14:55:25
|
rae l wrote:
>
> cpu0 is idle, cpu1 is always busy with sys+siq; the istd3 thread
> consumes 100% of cpu1,

I'm curious: what throughput were you getting on the initiator during
these tests?

It is possible that you were driving enough IO through the NIC to peg
the CPU. iSCSI is a processor-intensive protocol.

-Ross |
From: rae l <cr...@gm...> - 2008-09-04 02:29:30
Attachments:
iscsi.dump.bz2
|
On Wed, Sep 3, 2008 at 9:57 PM, Ross S. W. Walker <RW...@me...> wrote:
> Thanks for the diag info I'm going to take a look at that,
> but can you give us a 'vmstat 1' during the high CPU to
> nullio lun so we can get an idea of the interrupts being
> driven.

Did you notice the dstat output? dstat is generally a better vmstat: it
collects CPU load, network throughput, and the most expensive process at
the same time. Here is the test output, almost the same as the previous
one. It displays usr,sys,idl,wai,hiq,siq for every CPU; here cpu0 is
still idle, but cpu1 becomes busy with sys+siq while the net/eth0
throughput reaches 110MB/s:

root@uitnode1 ~/tmp/iet-r168 0 # dstat -M proc,cpu,mem,sys,net,disk,app -C 0,1 -D sda,sdb -N eth0 5
---procs--- -------cpu0-usage--------------cpu1-usage------ ------memory-usage----- ---system-- --net/eth0- --dsk/sda-----dsk/sdb-- --most-expensive--
run blk new|usr sys idl wai hiq siq:usr sys idl wai hiq siq|_used _buff _cach _free|_int_ _csw_|_recv _send|_read _writ:_read _writ|_____process______
0 0 0| 0 1 98 1 0 0: 0 0 98 1 0 0| 331M 139M 1189M 1634M| 246 102 |2548B 11k| 0 37k: 0 0 |gnome-terminal 1
0 0 0| 1 0 99 0 0 0: 0 0 100 0 0 0| 330M 139M 1190M 1634M| 297 162 |1612B 108k| 0 22k: 0 0 |istiod6 86
0 0 0| 1 0 99 0 0 0: 0 0 99 1 0 0| 330M 139M 1190M 1634M| 223 58 |1089B 0 | 0 14k: 0 0 |pcscd 0
2 0 0| 2 4 92 2 0 0: 0 37 22 0 1 40| 338M 139M 1190M 1626M| 11k 3550 |3850B 86M|4915B 188k: 0 0 |istd6 163
2 0 0| 2 7 91 0 0 0: 0 50 0 0 1 49| 338M 139M 1190M 1626M| 15k 4430 |3994B 113M| 0 0 : 0 0 |istd6 100
1 0 0| 2 2 97 0 0 0: 0 50 0 0 0 50| 338M 139M 1190M 1626M| 15k 4380 |1920B 112M| 0 0 : 0 0 |istd6 100
3 0 0| 2 2 96 0 0 0: 0 51 0 0 1 48| 338M 139M 1190M 1626M| 13k 4254 |2710B 109M| 0 0 : 0 0 |istd6 100
1 0 0| 1 3 96 0 0 0: 0 48 0 0 1 51| 338M 139M 1190M 1626M| 15k 4356 |1534B - | 0 3277B: 0 0 |istd6 100
2 0 0| 2 4 93 1 0 0: 0 51 0 0 1 48| 338M 139M 1190M 1626M| 14k 4304 |3689B 110M| 0 22k: 0 0 |istd6 100
4 0 0| 4 2 94 0 0 0: 0 51 0 0 1 48| 338M 139M 1190M 1626M| 14k 4340 |5716B 111M| 0 819B: 0 0 |istd6 100
5 0 0| 4 4 92 0 0 0: 0 45 0 0 1 54| 338M 139M 1190M 1626M| 16k 4491 |9003B 114M| 0 0 : 0 0 |istd6 100
2 0 0| 4 2 94 0 0 0: 0 49 0 0 0 51| 338M 139M 1190M 1626M| 15k 4488 |5306B 113M| 0 0 : 0 0 |istd6 100
1 0 0| 2 4 94 0 0 0: 0 49 2 0 0 48| 329M 139M 1190M 1634M| 14k 4144 |2974B 107M| 0 0 : 0 0 |istd6 97
0 0 0| 1 0 99 0 0 0: 2 0 98 0 0 0| 329M 139M 1190M 1634M| 274 127 |4262B 57k| 0 0 : 0 0 |gnome-terminal 2
^C

However, here is `vmstat 1` from when the iSCSI read began; the idle
percentage falls from 100 to about 50, meaning one of the two CPUs
became totally busy:

# vmstat 1
0 0 108 1673412 142312 1218136 0 0 0 0 222 56 1 0 100 0 0
0 0 108 1673412 142320 1218128 0 0 0 72 224 68 0 1 98 2 0
0 0 108 1673412 142320 1218136 0 0 0 0 219 54 0 0 100 0 0
0 0 108 1673412 142320 1218136 0 0 0 0 219 62 1 0 100 0 0
8 0 108 1664848 142352 1218844 0 0 24 940 9206 3270 1 28 66 5 0
2 0 108 1664980 142352 1219012 0 0 0 0 13172 4337 1 51 48 0 0
1 0 108 1664980 142352 1219012 0 0 0 0 13674 4307 2 52 47 0 0
1 0 108 1664980 142352 1219012 0 0 0 0 15305 4434 1 57 42 0 0
2 0 108 1664980 142352 1219012 0 0 0 0 12999 4208 1 54 46 0 0
1 0 108 1664980 142352 1219012 0 0 0 0 15573 4451 1 52 48 0 0
1 0 108 1664980 142352 1219012 0 0 0 0 17470 4566 2 56 43 0 0
1 0 108 1664980 142352 1219012 0 0 0 0 14819 4536 1 51 48 0 0
1 0 108 1664980 142352 1219012 0 0 0 0 14932 4395 0 59 42 0 0
1 0 108 1665104 142352 1219012 0 0 0 0 13233 4242 1 52 48 0 0
1 0 108 1665104 142352 1219012 0 0 0 0 15343 4411 1 51 49 0 0
1 0 108 1665104 142352 1219012 0 0 0 0 14956 4377 1 51 49 0 0
2 0 108 1664980 142352 1219012 0 0 0 0 16134 4457 1 51 49 0 0
1 0 108 1664980 142352 1219012 0 0 0 0 14888 4363 1 51 49 0 0
1 0 108 1664980 142352 1219012 0 0 0 0 17138 4552 2 50 48 0 0
1 0 108 1665048 142352 1219012 0 0 0 0 13799 4281 1 51 49 0 0
1 0 108 1665036 142352 1219012 0 0 0 0 12202 4153 1 51 48 0 0

On Wed, Sep 3, 2008 at 10:40 PM, Ross S. W. Walker <RW...@me...> wrote:
> My mistake on the tcpdump, I didn't tell you to give a long
> enough snap length on the data, so all the good iscsi protocol
> information was truncated. I naively thought a raw dump would
> do the full packet, oh well.
>
> Can you run a:
>
> # tcpdump -c 750 -s 1460 -w iscsi.dump -i <int> tcp port 3260

Done, the attachment is the new one.

On Wed, Sep 3, 2008 at 10:55 PM, Ross S. W. Walker <RW...@me...> wrote:
> rae l wrote:
>>
>> cpu0 is idle, cpu1 is always busy with sys+siq; istd3 thread consume
>> the 100% cpu1,
>
> I'm curious, what throughput were you getting on the initiator during
> these tests?

The initiator is the Microsoft iSCSI initiator; the iometer benchmark
got 105MB/s throughput.

> It is possible that you were driving enough IO through the NIC to peg
> the CPU. iSCSI is a processor intensive protocol.
>
> -Ross

Another interesting test: if I use the open-iscsi initiator on a Linux
client, the benchmark can also get 110MB/s bandwidth while IET on the
server doesn't consume high CPU. It seems IET cannot collaborate well
with the Microsoft iSCSI initiator, but it should be able to handle it.

-- 
Denis Cheng
Linux Application Developer

"One of my most productive days was throwing away 1000 lines of code."
- Ken Thompson. |