From: Bart V. A. <bar...@gm...> - 2008-01-17 09:27:11
Hello,

I have performed a test to compare the performance of SCST and STGT.
Apparently the SCST target implementation performed far better than the
STGT target implementation. This makes me wonder whether this is due to
the design of SCST or whether STGT's performance can be improved to the
level of SCST?

Test performed: read 2 GB of data in blocks of 1 MB from a target (hot
cache -- no disk reads were performed, all reads were from the cache).
Test command: time dd if=/dev/sde of=/dev/null bs=1M count=2000

                              STGT read          SCST read
                              performance (MB/s) performance (MB/s)
  Ethernet (1 Gb/s network)        77                 89
  IPoIB    (8 Gb/s network)        82                229
  SRP      (8 Gb/s network)       N/A                600
  iSER     (8 Gb/s network)        80                N/A

These results show that SCST uses the InfiniBand network very well (an
efficiency of about 88% via SRP), but that the current STGT version is
unable to transfer data faster than 82 MB/s. Does this mean that there
is a severe bottleneck in the current STGT implementation?

Details about the test equipment:
- Ethernet controller: Intel 80003ES2LAN Gigabit Ethernet controller
  (copper) in full duplex mode.
- InfiniBand controller: Mellanox MT25204 [InfiniHost III Lx HCA].
  According to ib_rdma_bw and ib_rdma_lat, the InfiniBand peak bandwidth
  on this system is 675 MB/s and its latency is 3 microseconds.
- CPU: one Intel Xeon 5130 CPU @ 2.00 GHz.
- RAM: 2 GB in the initiator, 8 GB in the target. According to lmbench,
  memory read bandwidth is 2960 MB/s and write bandwidth is 1080 MB/s.
- Software: 64-bit Ubuntu 7.10 server edition + OFED 1.2.5.4 userspace
  components + SCST revision 242 (January 4, 2008) + TGT version
  20071227.

Regards,

Bart Van Assche.
From: FUJITA T. <fuj...@la...> - 2008-01-17 09:41:03
On Thu, 17 Jan 2008 10:27:08 +0100
"Bart Van Assche" <bar...@gm...> wrote:

> [ ... ]
> These results show that SCST uses the InfiniBand network very well (an
> efficiency of about 88% via SRP), but that the current STGT version is
> unable to transfer data faster than 82 MB/s. Does this mean that there
> is a severe bottleneck in the current STGT implementation?

I don't know about the details, but Pete said that he can achieve more
than 900 MB/s read performance with the tgt iSER target using a ramdisk.

http://www.mail-archive.com/stg...@li.../msg00004.html
From: Vladislav B. <vs...@vl...> - 2008-01-17 09:48:42
FUJITA Tomonori wrote:
> [ ... ]
> I don't know about the details, but Pete said that he can achieve more
> than 900 MB/s read performance with the tgt iSER target using a ramdisk.
>
> http://www.mail-archive.com/stg...@li.../msg00004.html

Please don't confuse a multithreaded, latency-insensitive workload with
a single-threaded, hence latency-sensitive, one.
From: FUJITA T. <fuj...@la...> - 2008-01-17 10:06:08
On Thu, 17 Jan 2008 12:48:28 +0300
Vladislav Bolkhovitin <vs...@vl...> wrote:

> [ ... ]
> Please don't confuse a multithreaded, latency-insensitive workload with
> a single-threaded, hence latency-sensitive, one.

Seems that he can get good performance with a single-threaded workload:

http://www.osc.edu/~pw/papers/wyckoff-iser-snapi07-talk.pdf

But I don't know about the details, so let's wait for Pete to comment
on this.

Perhaps the Voltaire people could comment on the tgt iSER performance.
From: Vladislav B. <vs...@vl...> - 2008-01-17 10:34:54
FUJITA Tomonori wrote:
> [ ... ]
> Seems that he can get good performance with a single-threaded workload:
>
> http://www.osc.edu/~pw/papers/wyckoff-iser-snapi07-talk.pdf

Hmm, I can't find which IB hardware he used or its declared Gb/s speed.
He declared only "Mellanox 4X SDR, switch". What does that mean?

> But I don't know about the details, so let's wait for Pete to comment
> on this.

I added him on CC.
From: Pete W. <pw...@os...> - 2008-01-17 17:45:58
fuj...@la... wrote on Thu, 17 Jan 2008 19:05 +0900:
> [ ... ]
> Seems that he can get good performance with a single-threaded workload:
>
> http://www.osc.edu/~pw/papers/wyckoff-iser-snapi07-talk.pdf
>
> But I don't know about the details, so let's wait for Pete to comment
> on this.

Page 16 is pretty straightforward. One command outstanding from the
client. It is an OSD read command. Data on tmpfs. 500 MB/s is pretty
easy to get on IB.

The other graph, on page 23, is for block commands: 600 MB/s-ish. Still
a single command, so essentially a "latency" test. Dominated by the
memcpy time from tmpfs to the pinned IB buffer, as per page 24.

Erez said:

> We didn't run any real performance test with tgt, so I don't have
> numbers yet. I know that Pete got ~900 MB/sec by hacking sgp_dd, so all
> data was read/written to the same block (so it was all done in the
> cache). Pete - am I right?

Yes (actually just 1 thread in sg_dd). This is obviously cheating.
Take the pread time to zero in the SCSI read analysis on page 24 to
show the theoretical maximum. It's the IB theoretical limit minus some
initiator and stgt overheads.

The other way to get more read throughput is to throw multiple
simultaneous commands at the server.

There's nothing particularly stunning here. I suspect Bart has
configuration issues if not even IPoIB will do > 100 MB/s.

		-- Pete
From: Bart V. A. <bar...@gm...> - 2008-01-18 10:30:51
On Jan 17, 2008 6:45 PM, Pete Wyckoff <pw...@os...> wrote:
> There's nothing particularly stunning here. I suspect Bart has
> configuration issues if not even IPoIB will do > 100 MB/s.

Regarding configuration issues: the systems I ran the test on probably
communicate with the InfiniBand HCAs via PCI-e x4. With other systems
with identical software and with PCI-e x8 HCAs on the same InfiniBand
network I reach a throughput of 934 MB/s instead of 675 MB/s (PCI-e x4).
This is something I only found out today, otherwise I would have run all
tests on the systems with PCI-e x8 HCAs.

So the relative utilization of the InfiniBand network is as follows:
* STGT + iSER, PCI-e x4 HCA: 324/675 = 48% (measured myself)
* STGT + iSER, PCI-e x8 HCA: 550/934 = 59%
  (http://www.osc.edu/~pw/papers/wyckoff-iser-snapi07-talk.pdf)
* SCST + SRP, PCI-e x4 HCA: 600/675 = 89% (measured myself)

Or: SCST uses the InfiniBand network much more effectively than STGT.

Bart.
From: Vladislav B. <vs...@vl...> - 2008-01-18 12:09:04
Pete Wyckoff wrote:
> [ ... ]
> Page 16 is pretty straightforward. One command outstanding from the
> client. It is an OSD read command. Data on tmpfs.

Hmm, I wouldn't say it's pretty straightforward. It has data for
"InfiniBand", and it's unclear whether that was measured with iSER or
with some IB performance test tool. I would rather interpret those data
as raw IB numbers, not iSER.

> 500 MB/s is pretty easy to get on IB.
>
> The other graph, on page 23, is for block commands: 600 MB/s-ish. Still
> a single command, so essentially a "latency" test. Dominated by the
> memcpy time from tmpfs to the pinned IB buffer, as per page 24.
>
> Erez said:
>
>> We didn't run any real performance test with tgt, so I don't have
>> numbers yet. I know that Pete got ~900 MB/sec by hacking sgp_dd, so all
>> data was read/written to the same block (so it was all done in the
>> cache). Pete - am I right?
>
> Yes (actually just 1 thread in sg_dd). This is obviously cheating.
> Take the pread time to zero in the SCSI read analysis on page 24 to
> show the theoretical maximum. It's the IB theoretical limit minus some
> initiator and stgt overheads.

Yes, that's obviously cheating, and its result can't be compared with
what Bart had. The full data footprint on the target fit in the CPU
cache, so what you effectively had were results for NULLIO (an SCST
term).

So, it seems I understood your slides correctly: the more relevant data
for our SCST SRP vs. STGT iSER comparison should be on page 26, for a
1-command read (~480 MB/s, i.e. ~60% of Bart's result on equivalent
hardware).
From: Bart V. A. <bar...@gm...> - 2008-01-20 09:36:20
On Jan 18, 2008 1:08 PM, Vladislav Bolkhovitin <vs...@vl...> wrote:
>
> [ ... ]
> So, it seems I understood your slides correctly: the more relevant data
> for our SCST SRP vs. STGT iSER comparison should be on page 26, for a
> 1-command read (~480 MB/s, i.e. ~60% of Bart's result on equivalent
> hardware).

At least in my tests SCST performed significantly better than STGT.
These tests were performed with the currently available implementations
of SCST and STGT. Which performance improvements are possible for these
projects (e.g. zero-copying), and by how much is it expected that these
improvements will increase throughput and decrease latency?

Bart.
From: Vladislav B. <vs...@vl...> - 2008-01-21 12:07:44
Bart Van Assche wrote:
> [ ... ]
> At least in my tests SCST performed significantly better than STGT.
> These tests were performed with the currently available implementations
> of SCST and STGT. Which performance improvements are possible for these
> projects (e.g. zero-copying), and by how much is it expected that these
> improvements will increase throughput and decrease latency?

Sure, zero-copy cache support is quite possible for SCST and will
hopefully be available soon. The performance (throughput) improvement
will depend on the hardware used and the data access pattern, but an
upper-bound estimate can be made from the memory copy throughput on your
system (1.6 GB/s according to your measurements). For a 10 Gb/s link
with a 0.9 GB/s wire speed it should be up to 30%; for a 20 Gb/s link
with a 1.5 GB/s wire speed (a PCI-E 8x limitation), up to 70-80%.

Vlad
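A back-of-envelope model makes the shape of this estimate explicit. The
following is an editorial sketch, not part of the original message; the
overlap assumptions behind Vlad's exact percentages are not stated.
Assume every byte is first copied at memory-copy bandwidth B_copy and
only then transferred at wire speed B_wire, with no overlap between the
two stages:

    T_copy = 1/B_wire + 1/B_copy      (time per byte, with one copy)
    T_zero = 1/B_wire                 (time per byte, zero-copy)

    speedup = T_copy / T_zero = 1 + B_wire / B_copy

With B_wire = 0.9 GB/s and B_copy = 1.6 GB/s this fully serial bound is
1 + 0.9/1.6 ~= 1.56, i.e. up to roughly 56%; lower figures such as the
30% quoted above would correspond to the copy partially overlapping the
transfer or running at a higher effective bandwidth.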
From: FUJITA T. <fuj...@la...> - 2008-01-22 03:27:13
On Sun, 20 Jan 2008 10:36:18 +0100
"Bart Van Assche" <bar...@gm...> wrote:

> [ ... ]
> At least in my tests SCST performed significantly better than STGT.
> These tests were performed with the currently available implementations
> of SCST and STGT.

First, I recommend that you examine the iSER stuff more, since unlike
SRP it has some parameters that affect performance, IIRC. At least, you
could get iSER performance similar to Pete's.

> Which performance improvements are possible for these projects (e.g.
> zero-copying), and by how much is it expected that these improvements
> will increase throughput and decrease latency?

The major bottleneck in RDMA transfers is registering the buffer before
the transfer. stgt's iSER driver has pre-registered buffers; it moves
data between the page cache and these buffers, and then does the RDMA
transfer.

The big problem of stgt iSER is disk I/O (moving data between the disk
and the page cache). We need a proper asynchronous I/O mechanism;
however, Linux doesn't provide one, and we use a workaround, which
incurs large latency. I guess we cannot solve this until syslets are
merged into mainline.

The above approach still needs one memory copy (between the
pre-registered buffers and the page cache). If we need more performance,
we have to implement a new caching mechanism that uses the
pre-registered buffers instead of just using the page cache. AIO with
O_DIRECT enables us to implement such a caching mechanism (we can use
eventfd, so we don't need something like syslets; that is, we can
implement it now).

I'm not sure someone will implement such an RDMA caching mechanism for
stgt. Pete and his colleagues implemented the stgt iSER driver (thanks!)
but they are not interested in block I/O (they are OSD people).
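For readers unfamiliar with the combination Tomonori describes, the
sketch below shows an O_DIRECT read submitted with Linux AIO and a
completion signalled through an eventfd. This is a minimal editorial
illustration, not stgt code: it assumes a libaio recent enough to
provide io_set_eventfd(), and the device path, block size and omitted
error handling are placeholders.

/*
 * Minimal sketch: one O_DIRECT read via Linux AIO, completion
 * announced on an eventfd. Build with: gcc -laio.
 */
#define _GNU_SOURCE
#include <libaio.h>
#include <sys/eventfd.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    io_context_t ctx = 0;
    struct iocb cb, *cbs[1] = { &cb };
    struct io_event ev;
    uint64_t count;
    void *buf;

    int efd = eventfd(0, 0);                        /* completion notifier */
    int fd = open("/dev/sde", O_RDONLY | O_DIRECT); /* placeholder path */

    posix_memalign(&buf, 512, 1 << 20);     /* O_DIRECT needs aligned I/O */
    io_setup(1, &ctx);

    io_prep_pread(&cb, fd, buf, 1 << 20, 0); /* 1 MB read at offset 0 */
    io_set_eventfd(&cb, efd);                /* signal efd on completion */
    io_submit(ctx, 1, cbs);

    /* An event loop would poll() efd together with its network fds;
     * here we simply block on it. */
    read(efd, &count, sizeof(count));
    io_getevents(ctx, 1, 1, &ev, NULL);      /* reap the completed I/O */

    io_destroy(ctx);
    close(fd);
    close(efd);
    return 0;
}

The point of the eventfd is that a completed I/O is announced on a plain
file descriptor, so a single-threaded target event loop can wait on it
alongside its RDMA and socket descriptors.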
From: Bart V. A. <bar...@gm...> - 2008-01-22 07:50:18
On Jan 22, 2008 4:26 AM, FUJITA Tomonori <fuj...@la...> wrote:
> First, I recommend that you examine the iSER stuff more, since unlike
> SRP it has some parameters that affect performance, IIRC. At least, you
> could get iSER performance similar to Pete's.

Documentation about configuring iSER parameters on the initiator side
appears to be hard to find. A Google query for (iscsiadm "op update"
"v iser" --
http://www.google.com/search?q=iscsiadm+%22op+update%22+%22v+iser%22)
gave only one result:
http://www.mail-archive.com/stg...@li.../msg00033.html
I also found an update of this document:
http://www.mail-archive.com/stg...@li.../msg00133.html

Are you referring to parameters like MaxRecvDataSegmentLength,
TargetRecvDataSegmentLength, InitiatorRecvDataSegmentLength and
MaxOutstandingUnexpectedPDUs, as explained in RFC 5046
(http://www.ietf.org/rfc/rfc5046.txt)? It would be interesting for me to
know which values Pete had configured in his tests, so that I can
configure the same values for these parameters.

Bart.
From: Vladislav B. <vs...@vl...> - 2008-01-22 11:33:15
FUJITA Tomonori wrote:
> The big problem of stgt iSER is disk I/O (moving data between the disk
> and the page cache). We need a proper asynchronous I/O mechanism;
> however, Linux doesn't provide one, and we use a workaround, which
> incurs large latency. I guess we cannot solve this until syslets are
> merged into mainline.

Hmm, SCST also doesn't have the ability to use asynchronous I/O, but
that doesn't prevent it from showing good performance.

Vlad
From: FUJITA T. <fuj...@la...> - 2008-01-22 11:51:37
On Tue, 22 Jan 2008 14:33:13 +0300
Vladislav Bolkhovitin <vs...@vl...> wrote:

> [ ... ]
> Hmm, SCST also doesn't have the ability to use asynchronous I/O, but
> that doesn't prevent it from showing good performance.

I don't know how SCST performs I/O, but in kernel space you can
certainly perform I/O asynchronously. Or you use an event notification
mechanism with multiple kernel threads performing I/O synchronously.

Xen blktap has the same problem as stgt. IIRC, Xen mainline uses a
kernel patch that adds proper event notification to AIO, though Red Hat
uses the same workaround as stgt instead of applying the kernel patch.
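The second pattern Tomonori mentions -- synchronous I/O in worker
threads plus an event notification mechanism -- can be sketched in
userspace as follows. This is an editorial illustration of the general
idea, not stgt, SCST or blktap code; the file path, sizes and
single-request structure are placeholders.

/*
 * Sketch: a worker thread performs an ordinary blocking pread() and
 * announces completion through an eventfd, so the main event loop can
 * wait on one file descriptor. Build with: gcc -lpthread.
 */
#include <pthread.h>
#include <sys/eventfd.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

struct io_req {
    int fd;            /* file to read from */
    void *buf;         /* destination buffer */
    size_t len;        /* bytes to read */
    off_t off;         /* file offset */
    ssize_t result;    /* filled in by the worker */
    int efd;           /* completion notifier */
};

static void *io_worker(void *arg)
{
    struct io_req *req = arg;
    uint64_t one = 1;

    req->result = pread(req->fd, req->buf, req->len, req->off); /* blocks */
    write(req->efd, &one, sizeof(one));  /* wake the event loop */
    return NULL;
}

int main(void)
{
    pthread_t tid;
    uint64_t completions;
    struct io_req req = {
        .fd  = open("/tmp/testfile", O_RDONLY),  /* placeholder path */
        .buf = malloc(1 << 20),
        .len = 1 << 20,
        .off = 0,
        .efd = eventfd(0, 0),
    };

    pthread_create(&tid, NULL, io_worker, &req);

    /* A real target would poll() req.efd alongside its network fds. */
    read(req.efd, &completions, sizeof(completions));
    printf("read %zd bytes\n", req.result);

    pthread_join(tid, NULL);
    return 0;
}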
From: Vladislav B. <vs...@vl...> - 2008-01-22 12:20:55
FUJITA Tomonori wrote:
> I don't know how SCST performs I/O, but in kernel space you can
> certainly perform I/O asynchronously.

Sure, but currently it is all synchronous.

> Or you use an event notification mechanism with multiple kernel
> threads performing I/O synchronously.
>
> Xen blktap has the same problem as stgt. IIRC, Xen mainline uses a
> kernel patch that adds proper event notification to AIO, though Red Hat
> uses the same workaround as stgt instead of applying the kernel patch.
From: Bart V. A. <bar...@gm...> - 2008-01-22 15:14:42
On Jan 22, 2008 4:26 AM, FUJITA Tomonori <fuj...@la...> wrote:
>
> First, I recommend that you examine the iSER stuff more, since unlike
> SRP it has some parameters that affect performance, IIRC. At least, you
> could get iSER performance similar to Pete's.

Apparently open-iscsi uses the following defaults:

node.session.iscsi.FirstBurstLength = 262144
node.session.iscsi.MaxBurstLength = 16776192
node.conn[0].tcp.window_size = 524288
node.conn[0].iscsi.MaxRecvDataSegmentLength = 131072

I have tried to change some of these parameters to a larger value, but
this did not have a noticeable effect (read bandwidth increased by less
than 1%):

$ iscsiadm --mode node --targetname iqn.2007-05.com.example:storage.disk2.sys1.xyz --portal 192.168.102.5:3260 --op update -n node.session.iscsi.FirstBurstLength -v 16776192
$ iscsiadm --mode node --targetname iqn.2007-05.com.example:storage.disk2.sys1.xyz --portal 192.168.102.5:3260 --op update -n node.session.iscsi.MaxBurstLength -v 16776192
$ iscsiadm --mode node --targetname iqn.2007-05.com.example:storage.disk2.sys1.xyz --portal 192.168.102.5:3260 --op update -n "node.conn[0].iscsi.MaxRecvDataSegmentLength" -v 16776192

Bart.
From: Bart V. A. <bar...@gm...> - 2008-01-22 10:04:28
On Jan 17, 2008 6:45 PM, Pete Wyckoff <pw...@os...> wrote:
> There's nothing particularly stunning here. I suspect Bart has
> configuration issues if not even IPoIB will do > 100 MB/s.

By this time I found out that the BIOS of the test systems (Intel
Server Board S5000PAL) set the PCI-e parameter MaxReadReq to 128 bytes,
which explains the low InfiniBand performance. After changing this
parameter to 4096 bytes the InfiniBand throughput was as expected:
ib_rdma_bw now reports a bandwidth of 933 MB/s.

Bart.
From: Vladislav B. <vs...@vl...> - 2008-01-22 11:33:54
Bart Van Assche wrote:
> [ ... ]
> By this time I found out that the BIOS of the test systems (Intel
> Server Board S5000PAL) set the PCI-e parameter MaxReadReq to 128 bytes,
> which explains the low InfiniBand performance. After changing this
> parameter to 4096 bytes the InfiniBand throughput was as expected:
> ib_rdma_bw now reports a bandwidth of 933 MB/s.

What are the new SRPT/iSER numbers?
From: Bart V. A. <bar...@gm...> - 2008-01-22 12:32:11
On Jan 22, 2008 12:33 PM, Vladislav Bolkhovitin <vs...@vl...> wrote:
>
> What are the new SRPT/iSER numbers?

You can find the new performance numbers below. These are all numbers
for reading from the remote buffer cache; no actual disk reads were
performed. The read tests have been performed with dd, both for a block
size of 512 bytes and for a block size of 1 MB. The tests with the small
block size say more about latency, while the tests with the large block
size say more about the maximal possible throughput.

                             STGT read     SCST read     STGT read     SCST read
                             performance   performance   performance   performance
                             (0.5K, MB/s)  (0.5K, MB/s)  (1 MB, MB/s)  (1 MB, MB/s)
  Ethernet (1 Gb/s network)       77            78            77            89
  IPoIB    (8 Gb/s network)      163           185           201           239
  iSER     (8 Gb/s network)      250           N/A           360           N/A
  SRP      (8 Gb/s network)      N/A           421           N/A           683

My conclusion from the above numbers: the performance difference between
STGT and SCST is small for a Gigabit Ethernet network. The faster the
network technology, the larger the difference between SCST and STGT.

Bart.
From: Vladislav B. <vs...@vl...> - 2008-01-22 15:23:30
Bart Van Assche wrote:
> [ ... ]
> The read tests have been performed with dd, both for a block size of
> 512 bytes and for a block size of 1 MB. The tests with the small block
> size say more about latency, while the tests with the large block size
> say more about the maximal possible throughput.

If you want to compare the performance of 512-byte blocks vs. 1 MB
blocks, your experiment isn't fully correct: you should use the
"iflag=direct" dd option for that.

> [ ... ]
>
> My conclusion from the above numbers: the performance difference between
> STGT and SCST is small for a Gigabit Ethernet network. The faster the
> network technology, the larger the difference between SCST and STGT.

This is what I expected.
From: Matteo T. <ma...@rm...> - 2008-01-22 15:21:36
Hi Bart,

Can you explain better how to change this parameter? Is it done via
setpci? We use the same Intel server board, and we see a MaxPayload of
128 bytes and a MaxReadReq of 512 bytes on both the 3ware 9650SE RAID
controller and the Intel Gigabit Ethernet controller.

Thanks in advance,
--
matteo

On 22-01-2008 11:04, "Bart Van Assche" <bar...@gm...> wrote:

> [ ... ]
> By this time I found out that the BIOS of the test systems (Intel
> Server Board S5000PAL) set the PCI-e parameter MaxReadReq to 128 bytes,
> which explains the low InfiniBand performance. After changing this
> parameter to 4096 bytes the InfiniBand throughput was as expected:
> ib_rdma_bw now reports a bandwidth of 933 MB/s.
From: Bart V. A. <bar...@gm...> - 2008-01-22 15:27:44
On Jan 22, 2008 4:19 PM, Matteo Tescione <ma...@rm...> wrote:
> Hi Bart,
>
> Can you explain better how to change this parameter? Is it done via
> setpci? We use the same Intel server board, and we see a MaxPayload of
> 128 bytes and a MaxReadReq of 512 bytes on both the 3ware 9650SE RAID
> controller and the Intel Gigabit Ethernet controller.

The proper way is to upgrade the BIOS; changing PCI parameters on a
running system can cause instability. What I did is unload and reload
the ib_mthca kernel module with the parameter tune_pci=1 (this only
applies to Mellanox HCAs, of course). See also
http://lists.openfabrics.org/pipermail/general/2008-January/045134.html
and
http://lists.openfabrics.org/pipermail/general/2008-January/045141.html

Bart.
From: Erez Z. <erezz@Voltaire.COM> - 2008-01-17 14:22:29
FUJITA Tomonori wrote:
> [ ... ]
> Seems that he can get good performance with a single-threaded workload:
>
> http://www.osc.edu/~pw/papers/wyckoff-iser-snapi07-talk.pdf
>
> But I don't know about the details, so let's wait for Pete to comment
> on this.
>
> Perhaps the Voltaire people could comment on the tgt iSER performance.

We didn't run any real performance test with tgt, so I don't have
numbers yet. I know that Pete got ~900 MB/sec by hacking sgp_dd, so all
data was read/written to the same block (so it was all done in the
cache). Pete - am I right?

As already mentioned, he got that with IB SDR cards, which are 10 Gb/s
cards in theory (the actual speed is ~900 MB/s). With DDR cards
(20 Gb/s) you can get even more. I plan to test that in the near future.

Erez
From: Vladislav B. <vs...@vl...> - 2008-01-17 14:32:57
Erez Zilber wrote:
> [ ... ]
> We didn't run any real performance test with tgt, so I don't have
> numbers yet. I know that Pete got ~900 MB/sec by hacking sgp_dd, so all
> data was read/written to the same block (so it was all done in the
> cache). Pete - am I right?
>
> As already mentioned, he got that with IB SDR cards, which are 10 Gb/s
> cards in theory (the actual speed is ~900 MB/s). With DDR cards
> (20 Gb/s) you can get even more. I plan to test that in the near future.

Are you writing about the maximum possible speed he got, including
multithreaded tests with many outstanding commands, or about the speed
he got on single-threaded reads with one outstanding command? This
thread is about the latter.