From: Hubertus F. <fr...@us...> - 2000-11-20 20:23:39
|
Ananth ..... I read the configurable scheduler white paper from HP. I thought it was a nice attempt to implement CPU-affinity scheduling conveniently. Its shortcoming at that point was that it basically only allows implementing a different goodness-value calculation. I talked with the author at some point and he mentioned that he was thinking about multiple runqueues as an add-on. The unfortunate thing here is that Linus made rather strong negative comments regarding pluggable schedulers.

-- Hubertus

FROM: Rajagopal Ananthanarayanan
DATE: 11/17/2000 13:59:52
SUBJECT: RE: [Lse-tech] Multiple Runqueues

Mike Kravetz wrote:
>
> On Fri, Nov 17, 2000 at 07:55:04PM +0100, Andi Kleen wrote:
> > On Fri, Nov 17, 2000 at 10:38:59AM -0800, Mike Kravetz wrote:
> > > In my implementation we are trying to emulate the same
> > > scheduling decisions that are made by the current Linux
> > > scheduler. Because of this, we try to make `global`
> >
> > Why this emulation? Do you fear to break existing programs?
>
> We simply want to show what would happen if you go to multiple
> runqueues. If we significantly change the behavior of the
> scheduler, then it would be difficult to say what had the
> greater impact: multiple runqueues or the change in behavior.

Are there any thoughts about making the behavior of the scheduler configurable? For example, getting the real-time scheduler to work correctly can be tricky with multiple run-queues, since there is no single ordering entity. Relaxing the real-time scheduler to say that "the thread with the highest priority will be dispatched in < 1 millisecond" gives some leeway in designing distributed queues and their balancing. For harder real-time requirements this may be substituted with more stringent balancing algorithms, at the cost of basic scheduling complexity.

Along similar lines there is some work at HP towards a plug-in scheduler. You might want to check out: http://resourcemanagement.unixsolutions.hp.com/WaRM/schedpolicy.html
|
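For readers who have not seen the 2.4-era scheduler, the "goodness value calculation" this thread keeps referring to can be sketched roughly as below. This is an illustrative userspace reduction, not the kernel's code: the struct fields, the bonus constant, and the function names are made up, though the affinity bonus mirrors the kernel's PROC_CHANGE_PENALTY idea, and tuning exactly this kind of term is what the HP configurable scheduler exposed.

```c
#include <assert.h>

/* Illustrative stand-in for the scheduler-relevant parts of a task. */
struct task {
    int counter;        /* remaining timeslice ticks (dynamic priority) */
    int nice_bonus;     /* static-priority component */
    int last_cpu;       /* CPU the task last ran on */
};

/* Higher is better. A task that last ran on this CPU gets a
 * cache-affinity bonus (the PROC_CHANGE_PENALTY-like term). */
static int goodness(const struct task *t, int this_cpu)
{
    int g = t->counter + t->nice_bonus;

    if (t->last_cpu == this_cpu)
        g += 15;        /* illustrative affinity bonus */
    return g;
}

/* Scan the single global runqueue and pick the best task for this CPU.
 * With one global queue, this is the only easy customization point --
 * which is the limitation discussed above. */
static int pick_next(const struct task *rq, int n, int this_cpu)
{
    int best = -1, best_g = -1;

    for (int i = 0; i < n; i++) {
        int g = goodness(&rq[i], this_cpu);
        if (g > best_g) {
            best_g = g;
            best = i;
        }
    }
    return best;
}
```

With two runnable tasks, the one that last ran on the deciding CPU wins even with a slightly smaller timeslice; making that trade-off pluggable is the white paper's whole point.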
From: Hubertus F. <fr...@us...> - 2000-11-20 20:42:12
|
Well, I used to be a founding member of the K42 team but moved on a couple of years ago to Linux-related stuff. The K42 people are in the next office, so any particular questions can be dragged down and answered quickly if necessary.

In K42 we had a really clean slate regarding scheduling. Doing user-level scheduling on top of virtual CPUs is an interesting approach if we think that user-level threads are going to be the prevalent scheduling entities. It does, however, require a very tight integration between the kernel and user-level scheduler portions to avoid frequent protection-domain crossings, and an efficient handover of control. Maintaining a "ghostly" presence on each runqueue would still require examining the state of a particular task, thus keeping lock hold times artificially long. I need to be convinced that the load-balancing issues can't be solved otherwise. It also requires that one can get enough parallelism into the kernel.

I do not think that at this point we can wander down this road in Linux. We need to maintain backward compatibility and show that Linux can be scaled up with conventional wisdom, such as multiple runqueues or other ideas that have been tried in the past.

-- Hubertus Franke

----------------------------------------------------------------

FROM: Daniel Belov
DATE: 11/17/2000 11:55:41
SUBJECT: [Lse-tech] Re: Multiple Runqueues

hi,

I was just talking with some people here at CMU and a research OS was mentioned, K42 (http://www.research.ibm.com/K42/). They had quite an interesting way for processes to move from processor to processor. If I understood it correctly, the application itself was responsible for scheduling its threads onto different runqueues, by maintaining a "ghostly" presence on every runqueue and figuring out onto which queue exactly the real "presence" should be projected. They supposedly did not get anything "real" enough working, but some synthetic tests ran really well. This seems like an idea to try out, maybe?

dan

_______________________________________________
Lse-tech mailing list
<EMAIL: PROTECTED>
http://lists.sourceforge.net/mailman/listinfo/lse-tech
|
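As a rough contrast to the K42 ghost-entry approach, the conventional multiple-runqueue design Hubertus alludes to makes the placement decision centrally at wakeup time instead of letting the application project a presence. A toy sketch, in which every name and the imbalance threshold are invented for illustration:

```c
#include <assert.h>

#define NCPUS 4

/* Number of runnable tasks on each per-CPU queue (toy model). */
static int rq_len[NCPUS];

/* Place a woken task: stay on its last CPU unless some other queue is
 * shorter by more than `imbalance` tasks -- the affinity-vs-balance
 * trade-off debated in this thread. Returns the chosen CPU. */
static int place_task(int last_cpu, int imbalance)
{
    int best = last_cpu;

    for (int cpu = 0; cpu < NCPUS; cpu++)
        if (rq_len[cpu] + imbalance < rq_len[best])
            best = cpu;
    rq_len[best]++;
    return best;
}
```

The point of the sketch is the locking consequence: each decision only touches one or two queues, so no global lock (and no per-task ghost state) is needed on the fast path.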
From: Duc V. <dvi...@us...> - 2001-05-02 22:35:35
|
Has anyone seen performance degradations between 2.2.19 and 2.4.x when running lmbench? I ran the lmbench benchmark on Linux kernels 2.2.19, 2.4.0, and 2.4.1 and observed performance degradation, most noticeable in signal handling, pipe latency, file deletion, and process creation. Are you aware of any kernel changes introduced in 2.4.x that might cause this performance degradation?

The following data are in microseconds; lower is better. Each data point represents the average of at least four runs.

Tests                    Linux 2.2.19  Linux 2.4.0  Linux 2.4.1
Signal handler overhead        1.64         3.77         3.82
Pipe latency                   4.58         5.28         5.55
File deletion - 10K           11.48        15.30        15.71
Process fork                 114.76       140.45       141.98
Process fork+execve          763.57       834.40       840.39

Notes:
1. The benchmark is lmbench-2beta1.
2. The hardware under test is a 700MHz PIII Xeon.
3. The operating system under test is Red Hat 6.2, running Linux kernels 2.2.19, 2.4.0 and 2.4.1, with 4GB memory support.

The following is the summary report generated by the lmbench benchmark.

                L M B E N C H  2 . 0   S U M M A R Y
                ------------------------------------
                (Alpha software, do not distribute)

Basic system parameters
----------------------------------------------------
Host      OS            Description             Mhz
--------- ------------- ----------------------- ----
biglinux- Linux 2.2.19  i686-pc-linux-gnu        700
biglinux- Linux 2.2.19  i686-pc-linux-gnu        700
biglinux- Linux 2.2.19  i686-pc-linux-gnu        700
biglinux- Linux 2.2.19  i686-pc-linux-gnu        700
biglinux- Linux 2.4.0   i686-pc-linux-gnu        700
biglinux- Linux 2.4.0   i686-pc-linux-gnu        700
biglinux- Linux 2.4.0   i686-pc-linux-gnu        700
biglinux- Linux 2.4.0   i686-pc-linux-gnu        700
biglinux- Linux 2.4.0   i686-pc-linux-gnu        700
biglinux- Linux 2.4.1   i686-pc-linux-gnu        700
biglinux- Linux 2.4.1   i686-pc-linux-gnu        700
biglinux- Linux 2.4.1   i686-pc-linux-gnu        700
biglinux- Linux 2.4.1   i686-pc-linux-gnu        700

Processor, Processes - times in microseconds - smaller is better
----------------------------------------------------------------
Host      OS             Mhz null null      open selct  sig  sig fork exec   sh
                             call  I/O stat clos   TCP inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ----- ---- ---- ---- ---- ----
biglinux- Linux 2.2.19   700 0.43 0.61 3.89 4.84    20 1.27 1.64  109  761 2988
biglinux- Linux 2.2.19   700 0.43 0.62 3.91 4.89    20 1.27 1.64  108  760 2981
biglinux- Linux 2.2.19   700 0.43 0.62 3.88 4.93    20 1.27 1.64  108  764 2986
biglinux- Linux 2.2.19   700 0.43 0.62 3.79 4.73    22 1.27 1.64  132  767 3011
biglinux- Linux 2.4.0    700 0.40 0.63 3.37 4.45    19 1.21 3.75  139  831 3219
biglinux- Linux 2.4.0    700 0.40 0.60 3.39 4.46    19 1.24 3.75  139  831 3269
biglinux- Linux 2.4.0    700 0.43 0.63 3.39 4.46    21 1.24 3.82  142  841 3255
biglinux- Linux 2.4.0    700 0.43 0.62 3.37 4.49    21 1.24 3.75  140  835 3244
biglinux- Linux 2.4.0    700 0.43 0.62 3.37 4.47    19 1.24 3.75  140  832 3263
biglinux- Linux 2.4.1    700 0.40 0.61 3.37 4.35    19 1.21 3.80  141  836 3262
biglinux- Linux 2.4.1    700 0.40 0.61 3.39 4.42    21 1.21 3.85  142  841 3316
biglinux- Linux 2.4.1    700 0.40 0.59 3.42 4.38    21 1.21 3.81  141  841 3306
biglinux- Linux 2.4.1    700 0.40 0.61 3.39 4.39    20 1.21 3.81  142  841 3225

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------
Host      OS            2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                        ctxsw  ctxsw  ctxsw  ctxsw  ctxsw   ctxsw   ctxsw
--------- ------------- ----- ------ ------ ------ ------ ------- -------
biglinux- Linux 2.2.19  0.850 5.0800     21 6.7500     23 8.98000      97
biglinux- Linux 2.2.19  0.860 7.1700     21 6.9200     22      10     138
biglinux- Linux 2.2.19  0.890 6.5100     21 6.5600    136      14     155
biglinux- Linux 2.2.19  0.960 6.4200     21 6.7000     22 7.16000      88
biglinux- Linux 2.4.0   1.040 6.5300     21 6.5800     29      19     185
biglinux- Linux 2.4.0   1.170 6.6200     21 6.7000     22 6.72000     102
biglinux- Linux 2.4.0   1.050 6.6100     21 6.5900     22 6.68000     101
biglinux- Linux 2.4.0   1.070 6.5700     21 6.7200     22 6.79000     102
biglinux- Linux 2.4.0   1.100 6.5300     21 6.5900     22      13     107
biglinux- Linux 2.4.1   1.050 6.4400     21 7.2000     23 6.88000     102
biglinux- Linux 2.4.1   1.140 6.6900     21 6.7100     22 6.92000     103
biglinux- Linux 2.4.1   1.130 6.7200     22 6.9300     22 6.85000     110
biglinux- Linux 2.4.1   1.180 6.5000     21 7.1000     22 7.11000     109

*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
Host      OS            2p/0K  Pipe AF    UDP  RPC/  TCP  RPC/  TCP
                        ctxsw       UNIX       UDP        TCP   conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
biglinux- Linux 2.2.19  0.850 4.583 8.55    15          25         83
biglinux- Linux 2.2.19  0.860 4.605 8.96    15          25         85
biglinux- Linux 2.2.19  0.890 4.545 8.79    15          25         83
biglinux- Linux 2.2.19  0.960 4.581 8.85    15          25         86
biglinux- Linux 2.4.0   1.040 5.193 8.74    15          22       9.0M
biglinux- Linux 2.4.0   1.170 5.274 8.80    15          22       9.0M
biglinux- Linux 2.4.0   1.050 5.378 9.02    15          23        23M
biglinux- Linux 2.4.0   1.070 5.288 8.99    15          22       3.0M
biglinux- Linux 2.4.0   1.100 5.273 8.81    15          23        29M
biglinux- Linux 2.4.1   1.050 5.291 8.41    15          23       3.0M
biglinux- Linux 2.4.1   1.140 5.419 8.56    15          23       9.0M
biglinux- Linux 2.4.1   1.130 5.574 8.81    15          23        29M
biglinux- Linux 2.4.1   1.180 5.646 9.01    15          24       9.0M

File & VM system latencies in microseconds - smaller is better
--------------------------------------------------------------
Host      OS            0K File        10K File       Mmap     Prot  Page
                        Create Delete  Create Delete  Latency  Fault Fault
--------- ------------- ------ ------ ------ ------ --------- ----- -------
biglinux- Linux 2.2.19  8.8928 0.5667     17 1.1416  24.57400 0.887     528
biglinux- Linux 2.2.19  8.8976 0.5710     17 1.1458  23.76700 0.887     518
biglinux- Linux 2.2.19  8.9103 0.5625     17 1.1297  23.83100 0.887     518
biglinux- Linux 2.2.19  8.8881 0.5617     17 1.1739  23.80700 0.888     519
biglinux- Linux 2.4.0   9.4500 0.5682     19 1.5225      1097 0.847 3.00000
biglinux- Linux 2.4.0   9.4589 0.5707     19 1.5247      1129 0.850 3.00000
biglinux- Linux 2.4.0   9.4545 0.5724     19 1.5279      1108 0.887 3.00000
biglinux- Linux 2.4.0   9.4661 0.5762     19 1.5340      1104 0.854 3.00000
biglinux- Linux 2.4.0   9.4563 0.5781     19 1.5398      1140 0.850 3.00000
biglinux- Linux 2.4.1   9.5905 0.5969     17 1.5588      1138 0.837 3.00000
biglinux- Linux 2.4.1   9.6089 0.6082     17 1.5774      1140 0.862 3.00000
biglinux- Linux 2.4.1   9.5914 0.5986     17 1.5677      1156 0.835 3.00000
biglinux- Linux 2.4.1   9.6015 0.6109     17 1.5816      1151 0.861 3.00000

*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------
Host      OS            Pipe AF   TCP  File   Mmap   Bcopy  Bcopy  Mem  Mem
                             UNIX      reread reread (libc) (hand) read write
--------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
biglinux- Linux 2.2.19   684  458  256    219    258    131    129  258   199
biglinux- Linux 2.2.19   684  458  254    191    258    133    130  258   198
biglinux- Linux 2.2.19   685  457  195    219    258    135    129  258   199
biglinux- Linux 2.2.19   694  454  248    219    258    133    129  258   198
biglinux- Linux 2.4.0    651  369  470    146    257    128    129  257   197
biglinux- Linux 2.4.0    641  371  478    209    257    119    129  257   197
biglinux- Linux 2.4.0    647  372  481    209    257    127    128  257   197
biglinux- Linux 2.4.0    634  369  479    209    257    130    129  257   197
biglinux- Linux 2.4.0    641  351  483    210    257    127    129  257   197
biglinux- Linux 2.4.1    650  379  476    209    257    128    129  257   197
biglinux- Linux 2.4.1    622  367  472    209    257    120    128  257   197
biglinux- Linux 2.4.1    615  367  471    209    257    129    129  257   197
biglinux- Linux 2.4.1    624  384  463    209    257    126    129  257   197

Memory latencies in nanoseconds - smaller is better
(WARNING - may not be correct, check graphs)
---------------------------------------------------
Host      OS             Mhz  L1 $  L2 $  Main mem  Guesses
--------- ------------- ---- ----- ----- --------- -------
biglinux- Linux 2.2.19   700 4.286    24       201
biglinux- Linux 2.2.19   700 4.286    12       201
biglinux- Linux 2.2.19   700 4.286    12       201
biglinux- Linux 2.2.19   700 4.286    12       201
biglinux- Linux 2.4.0    700 4.286    12       201
biglinux- Linux 2.4.0    700 4.287    12       201
biglinux- Linux 2.4.0    700 4.286    12       201
biglinux- Linux 2.4.0    700 4.287    12       201
biglinux- Linux 2.4.0    700 4.286    12       201
biglinux- Linux 2.4.1    700 4.286    12       201
biglinux- Linux 2.4.1    700 4.287    12       201
biglinux- Linux 2.4.1    700 4.286    12       201
biglinux- Linux 2.4.1    700 4.287    12       201

Cheers .... Duc.
|
From: Linus T. <tor...@tr...> - 2001-05-03 00:10:57
|
On Wed, 2 May 2001, Duc Vianney wrote:
>
> Has anyone seen performance degradations between 2.2.19 and 2.4.x

Yes.

The signal-handling one is because 2.4.x will save off the full SSE2 state, which means that the signal stack is almost 700 bytes, as compared to <200 before. This is sadly necessary to be able to take advantage of the SSE2 instructions, and on special applications the win can be quite noticeable. This one you won't be able to avoid, although you shouldn't see it on older hardware that does not have SSE2 (you see it because you have a PIII).

You don't say how much memory you have, but the file-handling ones might be due to a really unfortunate hash thinko that causes the dentry hash to be pretty much useless on machines that have 512MB of RAM (it can show up in other cases, but 512MB is the case that makes the hash really become a non-hash). If so, it should be fixed in 2.4.2.

2.4.4 will give noticeably better numbers for fork and fork+exec. However, the scheduling optimization that does that actually breaks at least "bash", and it appears that we will just undo it during the stable series. Even if the bug is obviously in user land (and a fix is available), stable kernels shouldn't try to hide the problems.

Linus
|
From: Prickett, T. O <ter...@in...> - 2001-09-10 22:54:36
Attachments:
patch_data.ZIP
|
Hi,

This is a repost of a previous mail, which appeared to be too large. This email contains performance measurements using Jonathan Lahr's io_request_lock patch and a custom IO load-generation program. The test program was a small custom program performing writes much like dd; it can be made available to anyone who wants it. With the patch we have noticed significant improvement in scalability and performance. We have done the testing on two different IA32 machines with different configurations. All tests performed writes with a 1MB block size to raw devices (e.g. /dev/raw/rawn).

System 1: 4-way SC450NX with 5 SCSI controllers and 3 drives per controller. The drives and controllers are a mixture of slower and faster devices (note: the fastest controller had 6 drives attached). This machine had 1GB of main memory.

System 1 results: The peak performance for the system, when IO is done to each adapter separately, is about 142MB/second. When IO is done to all adapters simultaneously using 18 drives, we see about 105MB/second (about 74% of peak). With the io_request_lock patch, we get about 124MB/second. This is about an 18% improvement, and about 87% of the system's peak performance.

We collected lockmeter data and kernel profile data to understand what changed to produce the performance improvement.
The lockmeter data for system 1 without the patch follows (this is the abridged edition; the complete data is attached):

SPINLOCKS         HOLD            WAIT
  UTIL  CON    MEAN(  MAX )   MEAN(  MAX )(% CPU)     TOTAL NOWAIT SPIN RJECT  NAME
 61.8% 63.6%  1.9us(  23us)   12us( 151us)(61.6%)   9989919 36.4% 63.6%    0%  io_request_lock
 44.1% 30.2%  2.8us(  23us)  8.1us( 135us)( 9.8%)   4900358 69.8% 30.2%    0%    __make_request+0xe8
 0.24% 84.2%  4.1us( 9.9us)   11us(  84us)(0.13%)     17992 15.8% 84.2%    0%    ahc_linux_isr+0x2b4
 15.6% 97.0%  1.0us( 2.9us)   13us( 151us)(50.7%)   4900357  3.0% 97.0%    0%    blk_get_queue+0x14
 0.27% 85.4%  5.5us(  17us)   14us(  97us)(0.14%)     15148 14.6% 85.4%    0%    generic_unplug_device+0x10
 0.00% 84.8%  1.1us( 1.9us)  9.6us(  35us)(0.00%)        33 15.2% 84.8%    0%    ide_do_request+0x2ac
 0.00% 19.9%  4.4us(  23us)  7.5us(  41us)(0.00%)       136 80.1% 19.9%    0%    ide_end_request+0x18
 0.00% 75.8%  5.0us( 9.1us)  9.0us(  27us)(0.00%)        33 24.2% 75.8%    0%    ide_intr+0x18
 0.00%  3.0%  2.2us( 4.6us)  9.4us( 9.4us)(0.00%)        33 97.0%  3.0%    0%    ide_intr+0x144
 0.00% 81.8%  2.1us( 3.1us)  8.8us(  37us)(0.00%)        33 18.2% 81.8%    0%    ide_set_handler+0x20
 0.31% 83.7%  5.3us(  19us)  8.9us(  86us)(0.11%)     18094 16.3% 83.7%    0%    scsi_dispatch_cmd+0xdc
 0.43% 85.0%  7.9us(  21us)  9.1us(  95us)(0.11%)     16676 15.0% 85.0%    0%    scsi_dispatch_cmd+0x12c
 0.08% 85.2%  1.3us( 3.0us)   12us(  87us)(0.16%)     18109 14.8% 85.2%    0%    scsi_finish_command+0x18
 0.06% 14.0%  1.1us( 3.3us)   12us(  66us)(0.02%)     16683 86.0% 14.0%    0%    scsi_old_done+0x614
 0.22% 85.9%  2.0us(  11us)   11us(  96us)(0.26%)     34792 14.1% 85.9%    0%    scsi_queue_next_request+0x18
 0.24%  2.4%  2.1us( 9.2us)  9.8us(  66us)(0.01%)     34770 97.6%  2.4%    0%    scsi_request_fn+0x31c

This data shows that the io_request_lock is highly utilized and highly contended. So much so that the lock is held for 63.6% of the total CPU time (about 76 seconds) and waited for about 19 seconds. Thus for 95 of the 120 available seconds the lock is being held or waited for. (The 120 seconds is 4 CPUs * a 30-second lockmeter run.)
This can also be seen in the kernel profile data below:

__make_request [C01978E0]:        4377
blk_get_queue [C0197190]:         3139
default_idle [C0105200]:          1727
end_buffer_io_kiobuf [C0137E20]:   523
brw_kiovec [C0137EF0]:             515
__scsi_end_request [C01CC2C0]:     446
submit_bh [C01980C0]:              239
mcount [C0267FD0]:                 162
scsi_init_io_vc [C01CE280]:        136
set_bh_page [C0136C20]:             86
wait_kio [C0137E80]:                82
scsi_init_io_v [C01CDD80]:          77
generic_make_request [C0197F90]:    56
end_kio_request [C014BBE0]:         51
ahc_linux_isr [C01D1760]:           41
scsi_request_fn [C01CC900]:         39
scsi_dispatch_cmd [C01C5D90]:       36

In this data, the CPUs are idle for about 17% of the time, with __make_request and blk_get_queue using about 75% of the time.

With the io_request_lock patch, lockmeter and kernel profile data show marked improvement in io_request_lock utilization and contention, and also in CPU usage. Here is the lockmeter data:

SPINLOCKS         HOLD            WAIT
  UTIL  CON    MEAN(  MAX )   MEAN(  MAX )(% CPU)     TOTAL NOWAIT SPIN RJECT  NAME
 10.7% 11.1%  0.5us( 397us)  5.1us( 906us)( 3.3%)   7042289 88.9% 11.1%    0%  io_request_lock
 0.29% 11.6%  3.4us( 9.9us)   22us( 462us)(0.05%)     25582 88.4% 11.6%    0%    ahc_linux_isr+0x2b4
  4.4% 11.0%  0.2us( 3.3us)  4.9us( 906us)( 3.0%)   6899886 89.0% 11.0%    0%    blk_get_queue+0x14
  4.3% 52.6%   95us( 397us)  4.7us( 199us)(0.03%)     13825 47.4% 52.6%    0%    generic_unplug_device+0x14
 0.00% 10.2%  0.4us( 2.1us)   11us(  63us)(0.00%)        59 89.8% 10.2%    0%    ide_do_request+0x2ac
 0.00%  3.4%  3.8us(  15us)  8.7us(  47us)(0.00%)       175 96.6%  3.4%    0%    ide_end_request+0x18
 0.00%  3.3%  3.9us( 7.4us)  3.0us( 5.8us)(0.00%)        60 96.7%  3.3%    0%    ide_intr+0x18
 0.00%  8.3%  1.4us( 4.8us)  2.1us( 4.1us)(0.00%)        60 91.7%  8.3%    0%    ide_intr+0x144
 0.00%  6.8%  1.3us( 3.6us)   27us( 104us)(0.00%)        59 93.2%  6.8%    0%    ide_set_handler+0x20
 0.04% 12.6%  0.4us( 2.8us)   11us( 466us)(0.03%)     25631 87.4% 12.6%    0%    scsi_finish_command+0x18
 0.02% 24.7%  0.4us( 4.2us)  3.7us( 291us)(0.01%)     17113 75.3% 24.7%    0%    scsi_old_done+0x614
  1.5% 12.3%   11us( 115us)   20us( 473us)(0.09%)     42744 87.7% 12.3%    0%    scsi_queue_next_request+0x14
 0.16% 12.0%  2.8us( 8.3us)   23us( 447us)(0.04%)     17095 88.0% 12.0%    0%    sym53c8xx_intr+0x68

and also:

 60.1% 0.00%  2.6us(  25us)  117us( 263us)(0.01%)   6899886  100% 0.00%    0%  __make_request+0xf0

The io_request_lock contention and utilization are greatly improved. The total wait time for the io_request_lock has now dropped to about 4 seconds. The new lock introduced by the patch is not even contended. The kernel profile data follows:

default_idle [C0105200]:          7734
__make_request [C01978E0]:        1914
end_buffer_io_kiobuf [C0137E20]:   403
blk_get_queue [C0197190]:          400
brw_kiovec [C0137EF0]:             378
__scsi_end_request [C01CC270]:     308
submit_bh [C01980C0]:              149
generic_unplug_device [C0197410]:  114
mcount [C0267F60]:                  93
generic_make_request [C0197F90]:    71
wait_kio [C0137E80]:                59
end_kio_request [C014BBE0]:         55
set_bh_page [C0136C20]:             48
scsi_queue_next_request [C01CC130]: 45
ahc_linux_isr [C01D16F0]:           30
__free_pages [C012E610]:            29
init_buffer [C01362A0]:             24
map_user_kiobuf [C0121FC0]:         22
tasklet_hi_schedule [C0118EE0]:     19
scsi_io_completion [C01CC4C0]:      18

In this data, the CPUs are idle about 77% of the time, and thus are now free to accomplish other tasks.

System 2: 4-way SRPM8 Server Platform with 5 Fibre Channel adapters, each with 4 drives, for a total of 20 disks. This system had 512MB of main memory. Improvement was approximately 20% and peaked at about 260MB/second. The lockmeter and kernel profile data showed much the same as on the SC450NX 4-way.

The full profile data and lockmeter data are attached. The file is a gzipped tarfile.

thanks,
terry

<<patch_data.ZIP>>
|
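The structural change being measured above can be reduced to a small sketch: instead of one global lock serializing every request queue, give each queue its own lock, so submissions to different adapters no longer serialize on a single cache line. This is illustrative userspace code under invented names (a toy spinlock stands in for the kernel's spinlock_t), not the actual patch, which reworks ll_rw_blk.c and the drivers:

```c
#include <stdatomic.h>
#include <assert.h>

/* A toy test-and-set spinlock, standing in for the kernel's spinlock_t. */
typedef struct { atomic_flag locked; } spinlock_t;

static void spin_lock(spinlock_t *l)
{
    while (atomic_flag_test_and_set(&l->locked))
        ;   /* spin */
}

static void spin_unlock(spinlock_t *l)
{
    atomic_flag_clear(&l->locked);
}

struct request_queue {
    spinlock_t lock;    /* per-queue lock: was the single global io_request_lock */
    int nr_pending;     /* requests queued (stand-in for the real queue state) */
};

/* Submissions to different queues now take different locks, so IO to
 * separate adapters proceeds in parallel instead of contending. */
static void submit_request(struct request_queue *q)
{
    spin_lock(&q->lock);
    q->nr_pending++;
    spin_unlock(&q->lock);
}
```

The lockmeter numbers above are exactly what this change predicts: hold counts split across many locks, and wait time on any one of them collapses.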
From: Mala A. <ma...@us...> - 2001-09-19 17:59:48
|
In the TCP stack, the checksum of user data is computed in interrupt context (bottom half) if a receive on the socket is not pending/being processed; otherwise the data is copied to the user buffer in process context. Has anyone tried to move the checksum out of interrupt context? In these cases the checksum could be deferred, along with arming the delayed-ACK timer, so that the frame is still processed if a receive on the socket is not issued within the time limit. Determining this timer limit may be very important so that it doesn't break any apps. I would like to know if anyone has worked on or thought about this. Any input is appreciated.

Regards,
Mala

Mala Anand
Linux Technology Center - Performance
E-mail: ma...@us...
Phone: 838-8088; Tie-line: 678-8088
|
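For reference, the per-byte work whose context is being debated above is the standard Internet checksum (RFC 1071). A plain C version is below; the kernel's actual routines are hand-tuned (the csum_partial family) and on the copy-to-user path fold the checksum into the copy itself, but the arithmetic is the same.

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* RFC 1071 Internet checksum: one's-complement sum of 16-bit words,
 * carries folded back in, result complemented. Byte order here is
 * big-endian word assembly, as on the wire. */
static uint16_t ip_checksum(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t sum = 0;

    while (len > 1) {                       /* sum 16-bit words */
        sum += (uint32_t)((p[0] << 8) | p[1]);
        p += 2;
        len -= 2;
    }
    if (len)                                /* odd trailing byte */
        sum += (uint32_t)(p[0] << 8);
    while (sum >> 16)                       /* fold the carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

Deferring this loop out of the bottom half moves exactly this O(len) pass, plus the cache misses it implies, from interrupt time to the context that eventually consumes the data.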
From: Zhuang, L. <lou...@in...> - 2002-09-03 10:38:15
|
Dear all,

I found a careless bug in Read-Copy-Update.

--- linux-2.4.18-TLT-cur/fs/proc/base.c	Thu Aug  1 00:59:42 2002
+++ linux-2.4.18-TLT-SMP/fs/proc/base.c	Tue Sep  3 17:16:10 2002
@@ -593,7 +593,7 @@
 	task_unlock(p);
 	if (!files)
 		goto out;
-	read_lock(&files->file_lock);
+	spin_lock(&files->file_lock);
 	for (fd = filp->f_pos-2;
 	     fd < files->max_fds;
 	     fd++, filp->f_pos++) {
@@ -601,7 +601,7 @@

 		if (!fcheck_files(files, fd))
 			continue;
-		read_unlock(&files->file_lock);
+		spin_unlock(&files->file_lock);

 		j = NUMBUF;
 		i = fd;
@@ -613,12 +613,12 @@

 		ino = fake_ino(pid, PROC_PID_FD_DIR + fd);
 		if (filldir(dirent, buf+j, NUMBUF-j, fd+2,
 				ino, DT_LNK) < 0) {
-			read_lock(&files->file_lock);
+			spin_lock(&files->file_lock);
 			break;
 		}
-		read_lock(&files->file_lock);
+		spin_lock(&files->file_lock);
 	}
-	read_unlock(&files->file_lock);
+	spin_unlock(&files->file_lock);
 	put_files_struct(files);
 }
 out:

Louis Zhuang, SW Engineer, Intel Corporation.
Opinions expressed are those of the author and do not represent Intel Corporation.
|
From: Maneesh S. <ma...@in...> - 2002-09-03 11:03:40
|
Hi Zhuang,

Looks like you have got some wrong files_struct_rcu patch. Please get the correct patch from the lse download page for the required kernel version:

http://sourceforge.net/project/lse/

Maneesh

On Tue, Sep 03, 2002 at 06:36:26PM +0800, Zhuang, Louis wrote:
> Dear all
> I found a careless bug in Read-Copy-Update.
>
> [patch snipped]

--
Maneesh Soni
IBM Linux Technology Center,
IBM India Software Lab, Bangalore.
Phone: +91-80-5044999  email: ma...@in...
http://lse.sourceforge.net/
|
From: Dipankar S. <dip...@in...> - 2002-09-03 11:21:06
|
Louis,

Your fix is incorrect. The whole point is to not acquire any lock (thereby not modifying the files_struct cache line) while reading it. Looks like what you have is a bad merge; I don't see it in the published patches in LSE. This will not even compile.

For the latest 2.4 files_struct patch [for 2.4.19rc1 + O(1) scheduler + rcu + read_barrier_depends] see:

http://sourceforge.net/project/showfiles.php?group_id=8875&release_id=105110

There read_lock()/read_unlock() is replaced by rcu_read_lock()/rcu_read_unlock(). Or, if you want only the 2.4.17 files_struct_rcu patch, see the 2.4.17 section on the same download page.

Thanks
--
Dipankar Sarma <dip...@in...> http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.

On Tue, Sep 03, 2002 at 06:36:26PM +0800, Zhuang, Louis wrote:
> Dear all
> I found a careless bug in Read-Copy-Update.
>
> [patch snipped]
|
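Dipankar's point can be rendered as a minimal userspace sketch using C11 atomics; the kernel primitives are rcu_read_lock()/rcu_dereference() and grace-period handling is elided entirely, so this is only an illustration of the read-side discipline, with all names invented. The reader performs a single pointer load and writes nothing shared, which is exactly what turning it into spin_lock() would destroy.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <assert.h>

/* Illustrative stand-in for the RCU-protected structure. */
struct files_snapshot {
    int max_fds;
};

static _Atomic(struct files_snapshot *) files_ptr;

/* Reader: no lock, no stores to shared state, just one acquire load
 * (the rough analogue of rcu_read_lock() + rcu_dereference()). */
static int read_max_fds(void)
{
    struct files_snapshot *f =
        atomic_load_explicit(&files_ptr, memory_order_acquire);
    return f ? f->max_fds : 0;
}

/* Updater: publish a fully initialized copy with a release store.
 * The old copy may only be freed after a grace period, which is what
 * call_rcu()/synchronize_kernel() provide in the kernel and is
 * deliberately omitted here. */
static void publish(struct files_snapshot *newf)
{
    atomic_store_explicit(&files_ptr, newf, memory_order_release);
}
```

Since readers never dirty the cache line holding the structure, read-mostly workloads scale with CPU count; a spin_lock() on the read side reintroduces the very cache-line ping-pong RCU exists to avoid.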
From: Mala A. <ma...@us...> - 2002-09-19 22:56:56
|
Andrew Morton wrote ...
> It seems that movsl works acceptably with all alignments on AMD
> hardware, although this needs to be checked with more recent machines.
> movsl is a (bad) loss on PII and PIII for all alignments except 8&8.
> Don't know about P4 - I can test that in a day or two.
>
> I expect that a minimal, 90% solution would be just:
>
> fancy_copy_to_user(dst, src, count)
> {
>         if (arch_has_sane_movsl || ((dst|src) & 7) == 0)
>                 movsl_copy_to_user(dst, src, count);
>         else
>                 movl_copy_to_user(dst, src, count);
> }
>
> and
>
> #ifndef ARCH_HAS_FANCY_COPY_USER
> #define fancy_copy_to_user copy_to_user
> #endif
>
> and we really only need fancy_copy_to_user in a handful of
> places - the bulk copies in networking and filemap.c. For all
> the other call sites it's probably more important to keep the
> code footprint down than it is to squeeze the last few drops out
> of the copy speed.
>
> Mala Anand has done some work on this. See
> http://www.uwsg.iu.edu/hypermail/linux/kernel/0206.3/0100.html
> <searches> Yes, I have a copy of Mala's patch here which works
> against 2.5.current. Mala's patch will cause quite an expansion
> of kernel size; we would need an implementation which did not
> use inlining. This work was discussed at OLS2002. See
> http://www.linux.org.uk/~ajh/ols2002_proceedings.pdf.gz

I will move the code from uaccess.h (inline) to usercopy.c (an out-of-line routine) and will post it soon. It is on my list of things to do.

Regards,
Mala

Mala Anand
IBM Linux Technology Center - Kernel Performance
E-mail: ma...@us...
http://www-124.ibm.com/developerworks/opensource/linuxperf
http://www-124.ibm.com/developerworks/projects/linuxperf
Phone: 838-8088; Tie-line: 678-8088
|
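Andrew's pseudocode above compiles almost as-is once the types are filled in. In the compilable rendering below, the two copy routines are memcpy stand-ins, since the movsl/movl distinction only exists in the x86 assembly implementations; only the dispatch test on 8-byte co-alignment is the real point.

```c
#include <string.h>
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

static int arch_has_sane_movsl;     /* e.g. set on AMD, clear on PII/PIII */

/* Stand-ins for the asm movsl/movl copy loops. */
static void movsl_copy(void *dst, const void *src, size_t n) { memcpy(dst, src, n); }
static void movl_copy(void *dst, const void *src, size_t n)  { memcpy(dst, src, n); }

/* Use movsl only when both pointers are 8-byte co-aligned, or when the
 * CPU handles misaligned movsl well -- Andrew's ((dst|src) & 7) test. */
static int use_movsl(const void *dst, const void *src)
{
    return arch_has_sane_movsl ||
           (((uintptr_t)dst | (uintptr_t)src) & 7) == 0;
}

static void fancy_copy_to_user(void *dst, const void *src, size_t n)
{
    if (use_movsl(dst, src))
        movsl_copy(dst, src, n);
    else
        movl_copy(dst, src, n);
}
```

ORing the two pointers before masking checks both alignments in one test: any low bit set in either address shows up in the OR, so the `& 7` is zero only for the 8&8 case Andrew identifies as safe on PII/PIII.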