From: Maneesh S. <sma...@se...> - 2001-05-31 12:09:21

Performance effects of zero-copy patch
--------------------------------------

While measuring performance of read-copy mechanism and scalable FD management
patch in 2.4.4, I found that the chat[1] benchmark's average throughput is
_less_ (~11%) than it used to be in 2.4.2. While narrowing down, the problem
appears to be the zero-copy patch which was introduced in 2.4.4pre3 release.

The numbers for 2.4.4pre2 and 2.4.4pre3 are as below. The numbers are taken
over 50 runs of the test. The tests were run with server and clients running
on the same machine. The machine under test is a 4-way PIII Xeon with 1MB L2
cache and 1GB RAM.

                  Mean             Standard Deviation

==> 2.4.4pre2 <==
Throughput        247565.408163    9455.052520
Real                  16.346939       0.626732
System                33.124694       1.378236
User                   1.040408       0.118198
CPU                  208.469388       1.608597

==> 2.4.4pre3 <==
Throughput        218249.040816    8672.886637
Real                  18.730000       0.732203
System                36.602041       1.464336
User                   1.088163       0.112021
CPU                  200.612245       1.238553

There is a significant drop in throughput, and an increase in system time and
real time, in pre3. Does the zero-copy patch have an adverse effect when server
and clients are on the same machine, and provide better performance only in a
networked environment?

I am still looking into it, but if somebody has faced such a problem related
to the zero-copy patch earlier, any help would be appreciated.

[1]: Chat benchmark is a chat room benchmark written by Bill Hartner
     (http://lbs.sourceforge.net/chat/chat-1.0.1.tar.gz)

Regards,
Maneesh

--
Maneesh Soni
IBM Linux Technology Center,
IBM India Software Lab, Bangalore.
Phone: +91-80-5262355 Extn. 2717
email: sma...@se...
http://lse.sourceforge.net/locking/rclock.html

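To put the pre2 and pre3 means side by side, the relative change of each metric can be worked out directly from the reported figures. The following is only a minimal sketch; the helper name `percent_change` is hypothetical and is not part of the chat benchmark or of any script used for the measurements.

```c
/* Hypothetical helper, not taken from the benchmark: relative change
 * from `before` to `after`, expressed in percent. A negative result
 * means the value went down. */
double percent_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

Applied to the means above, `percent_change(247565.408163, 218249.040816)` gives roughly -11.8 for throughput, while real time comes out at roughly +14.6 and system time at roughly +10.5.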
From: David S. M. <da...@re...> - 2001-05-31 12:39:23

Maneesh Soni writes:
 > While measuring performance of read-copy mechanism and scalable FD management
 > patch in 2.4.4, I found that the chat[1] benchmark's average throughput is
 > _less_ (~11%) than it used to be in 2.4.2. While narrowing down, the problem
 > appears to be the zero-copy patch which was introduced in 2.4.4pre3 release.

Does this chat room benchmark or the server it talks to do tiny
reads/writes? If so, well, the previous kernel performed better more by
luck than anything else, and the performance is really limited by the
application in this case.

Also, does the benchmark or the server turn on the TCP_NODELAY socket
option?

It might be interesting to see some network traces when this thing
runs to make sure we're not doing something truly stupid over the
wire.

I mean, a benchmark of a chat protocol is a complete oxymoron. Chat
room data patterns and TCP performance are at direct odds with each
other.

Later,
David S. Miller
da...@re...

From: Maneesh S. <sma...@se...> - 2001-05-31 13:55:19

On Thu, May 31, 2001 at 05:39:18AM -0700, David S. Miller wrote:
>
> Maneesh Soni writes:
>> While measuring performance of read-copy mechanism and scalable FD management
>> patch in 2.4.4, I found that the chat[1] benchmark's average throughput is
>> _less_ (~11%) than it used to be in 2.4.2. While narrowing down, the problem
>> appears to be the zero-copy patch which was introduced in 2.4.4pre3 release.
>
> Does this chat room benchmark or the server it talks to do tiny
> reads/writes?

It uses sockets, and uses send() and recv() for sending and receiving
messages with a size of 100 bytes.

> If so, well, the previous kernel performed better more by
> luck than anything else, and the performance is really limited by the
> application in this case.

> Also, does the benchmark or the server turn on the TCP_NODELAY socket
> option?

No, it creates PF_INET sockets of type SOCK_STREAM. I will try with this
option as well and check the results.

> It might be interesting to see some network traces when this thing
> runs to make sure we're not doing something truly stupid over the
> wire.

I actually ran the server and the clients on the _same_ machine, and right now
I do not have any test results from running it across the network.

> I mean, a benchmark of a chat protocol is a complete oxymoron. Chat
> room data patterns and TCP performance are at direct odds with each
> other.

I was just comparing two similar cases and trying to find the reasons for the
drop in performance. Does a smaller packet size affect zero-copy performance?

Regards,
Maneesh

--
Maneesh Soni
IBM Linux Technology Center,
IBM India Software Lab, Bangalore.
Phone: +91-80-5262355 Extn. 2717
email: sma...@se...
http://lse.sourceforge.net/locking/rclock.html

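For reference, turning on TCP_NODELAY on a PF_INET/SOCK_STREAM socket might look like the sketch below. It is only an illustration under stated assumptions: the function names `make_nodelay_socket` and `nodelay_enabled` are hypothetical and are not taken from the chat benchmark source, and how the benchmark actually creates its sockets is not shown in this thread.

```c
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <netinet/tcp.h>  /* TCP_NODELAY */
#include <sys/socket.h>   /* socket, setsockopt, getsockopt */
#include <unistd.h>       /* close */

/* Hypothetical helper: create a TCP socket with Nagle's algorithm
 * disabled. Returns the descriptor, or -1 on failure. */
int make_nodelay_socket(void)
{
    int fd = socket(PF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    int on = 1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: read the option back.
 * Returns 1 if set, 0 if not set, -1 on error. */
int nodelay_enabled(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &val, &len) < 0)
        return -1;
    return val != 0;
}
```

With the option set, small writes such as the 100-byte messages mentioned above are sent without waiting to be coalesced with later data; whether that changes the pre2 versus pre3 numbers would have to be measured.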