From: Rick L. <ric...@us...> - 2001-11-16 01:55:58
|
I guess there's no easy way to dance around this, so let's just be blunt :) To address a performance issue, I'm contemplating a change that would increase the size of struct inode. Would this be likely to cause lots of cache issues (has struct inode been carefully groomed to be cache-aligned "just so"?) While changes to struct inode *might* cause some binary compatibility issues (problems with 3rd party binary modules?), I should think it wouldn't cause anything more than a recompile for most people. Still, would people see a change of this sort as a 2.4 or 2.5 change? (Seems like 2.4-ok to me, but then I'm not attuned to what sorts of changes push things into 2.5.) Rick |
From: Christoph H. <hc...@ca...> - 2001-11-16 10:36:15
|
On Thu, Nov 15, 2001 at 05:54:35PM -0800, Rick Lindsley wrote: > I guess there's no easy way to dance around this, so let's just be blunt :) > > To address a performance issue, I'm contemplating a change that would > change the size of a struct inode (make it larger). Would this be likely to > cause lots of cache issues (has struct inode been carefully groomed to > be cache-aligned "just so"?) While changes to struct inode *might* > cause some binary compatibility issues (problems with 3rd party binary > modules?), I should think it shouldn't cause anything more than a recompile > for most people. Still, would people see a change of this sort as a 2.4 > or 2.5 change? (Seems like 2.4-ok to me, but then I'm not attuned to > what sorts of changes push things into 2.5.) Why do you want to change it? Support for a binary only filesystem? Christoph -- Of course it doesn't work. We've performed a software upgrade. |
From: Andi K. <ak...@su...> - 2001-11-16 10:42:21
|
On Thu, Nov 15, 2001 at 05:54:35PM -0800, Rick Lindsley wrote: > To address a performance issue, I'm contemplating a change that would > change the size of a struct inode (make it larger). Would this be likely to > cause lots of cache issues (has struct inode been carefully groomed to > be cache-aligned "just so"?) While changes to struct inode *might* > cause some binary compatibility issues (problems with 3rd party binary > modules?), I should think it shouldn't cause anything more than a recompile > for most people. Still, would people see a change of this sort as a 2.4 > or 2.5 change? (Seems like 2.4-ok to me, but then I'm not attuned to > what sorts of changes push things into 2.5.) iirc the plans for 2.5 are to make struct inode smaller not bigger. If you need more data allocate a secondary structure and put a pointer to it into your fs' private inode union part. -Andi |
From: <n0...@in...> - 2001-11-16 15:24:05
|
Rick- Yes, this would definitely cause problems. I know for a fact that this kind of change would break AFS and would require more than just a recompile. (AFS maintains its own private `inode' structure that has to be kept in sync with the kernel's version - don't blame me, it's not my fault, I just had to fix this one time). At best this would be a 2.5 kind of thing (although, as Andi points out, 2.5 is going to a smaller inode structure so you might have more problems there). On Thu, Nov 15, 2001 at 05:54:35PM -0800, Rick Lindsley wrote: > I guess there's no easy way to dance around this, so let's just be blunt :) > > To address a performance issue, I'm contemplating a change that would > change the size of a struct inode (make it larger). Would this be likely to > cause lots of cache issues (has struct inode been carefully groomed to > be cache-aligned "just so"?) While changes to struct inode *might* > cause some binary compatibility issues (problems with 3rd party binary > modules?), I should think it shouldn't cause anything more than a recompile > for most people. Still, would people see a change of this sort as a 2.4 > or 2.5 change? (Seems like 2.4-ok to me, but then I'm not attuned to > what sorts of changes push things into 2.5.) > > Rick > > _______________________________________________ > Lse-tech mailing list > Lse...@li... > https://lists.sourceforge.net/lists/listinfo/lse-tech -- Don Dugger "Censeo Toto nos in Kansa esse decisse." - D. Gale n0...@in... Ph: 303/652-0870x117 |
From: Andi K. <ak...@su...> - 2001-11-16 15:38:16
|
On Fri, Nov 16, 2001 at 08:23:19AM -0700, n0...@in... wrote: > Rick- > > Yes, this would definitely cause problems. I know for a fact that > this kind of change would break AFS and would require more than > just a recompile. (AFS maintains it's own private `inode' structure > that has to be kept in sync with the kernel's version - don't blame > me, it's not my fault, I just had to fix this one time). I had to fix this once too. I would not worry too much about AFS in this case; it is broken and someday hopefully soon they have to fix it properly. -Andi |
From: Christoph H. <hc...@ca...> - 2001-11-16 16:44:30
|
On Fri, Nov 16, 2001 at 08:23:19AM -0700, n0...@in... wrote: > Rick- > > Yes, this would definitely cause problems. I know for a fact that > this kind of change would break AFS and would require more than > just a recompile. (AFS maintains it's own private `inode' structure > that has to be kept in sync with the kernel's version - don't blame > me, it's not my fault, I just had to fix this one time). That is fundamentally broken - but if AFS hasn't changed much in the last month it is much more broken than just that anyway.. > At best this would be a 2.5 kind of thing (although, as Andi points > out, 2.5 is going to a smaller inode structure so you might have more > problems there). WHAT field do you want to add, actually? Christoph P.S. the right list for that is -fsdevel.. -- Of course it doesn't work. We've performed a software upgrade. |
From: Rick L. <ric...@us...> - 2001-11-16 19:07:46
|
Thanks for all of your responses. Yes, -fsdevel is probably the right place to finish this discussion, but I wanted to start it here in lse because it's actually SMP related. A file-lock-intensive benchmark brought to my attention that the BKL is currently used to guard i_flock. Without arguing about the merits of this particular benchmark, it seems to me that simply from inspection, replacing the BKL here would be a good thing. A per-inode spinlock would give better granularity than a global one, which will cause blockage across the system on every lock attempt by any process. I've given some thought to how to improve on that, and have come up with:

a) reducing use of kernel_flag elsewhere
b) replacing kernel_flag with another global spinlock
c) replacing kernel_flag with a global read/write lock
d) replacing kernel_flag with a new lock in struct inode
e) revisiting the algorithm, and all locking associated therein

a) is far more work than necessary to fix this problem. b) through d) are all possibilities, but since this hasn't shown up before, I'd conclude that all the contention this benchmark is seeing really is centered right around i_flock. My hunch is that the best solution is d), but it's possible that c) could actually provide "enough" improvement to allow d) to be postponed. Unfortunately, c) may introduce more trouble than it's worth, because in this particular example, I suspect that i_flock is NOT read mostly, written occasionally. Upgrading from a read to a write lock can't be done atomically, so what you may gain in performance you may lose in "supportability" as the code grows in complexity. Both b) and c) cause serialization across every cpu in the system by using a global lock, but d) would cause serialization *per inode* and thus almost guarantee less contention. Assuming, of course, mucking with the inode structure doesn't cause too many other ripples, which is why I asked the question. 
Doing e) almost certainly puts it into the 2.5 timespace, but not 100% certainly, I suppose. Before I dig too deep into some test patches I thought I'd test the waters among the folks here in LSE. It's good to hear that the inode is being redesigned for 2.5; a spinlock (or two) which guards elements of the inode structure would be very helpful in the new design. If there were one to usurp here I'd include that in my options, but all we have is semaphores right now. Rick |
From: Andrew M. <ak...@zi...> - 2001-11-16 20:26:25
|
Matthew Wilcox is the new owner of fs/locks.c. He'll be interested. About six months back we had a _big_ problem with Apache throughput. On 8-way x86 Apache throughput almost halved because someone removed the BKL from a path in the file locking code. Apache uses flock()-based synchronisation. Removing the BKL had turned a short spin into a semaphore schedule(), which hurt big-time. I did a bunch of maintenance work against fs/locks.c at the time to set things back right. IIRC I moved the BKL to a lower level in the flock() codepath. At one point I did have a super-scalable implementation which used a new per-inode spinlock for the exclusion. It worked and was just fine. But Linus and I agreed that it was a larger-than-necessary change, that sys_flock() contention was not a likely scenario, and that sticking with the BKL approach was a safer path. FWIW, the super-scalable flocking patch against 2.4.0-test10 is at http://www.zip.com.au/~akpm/threaded-locks-sem.patch Rick Lindsley wrote: > > Thanks for all of your responses. Yes, -fsdevel is probably the right > place to finish this discussion, but I wanted to take start it here in > lse because it's actually SMP related. > > A file-lock-intensive benchmark brought to my attention that the BKL is > currently used to guard i_flock. Without arguing about the merits of > this particular benchmark, it seems to me that simply from inspection, > replacing the BKL here would be a good thing. A per-inode spinlock > would give better granularity than a global one which will cause > blockage across the system on every lock attempt by any process. 
I've > given some thought to how to improve on that, and come up with to > > a) reducing use of kernel_flag elsewhere > b) replacing kernel_flag with another global spinlock > c) replacing kernel_flag with a global read/write lock > d) replacing kernel_flag with a new lock in struct inode > e) revisiting the algorithm, and all locking associated therein > > a) is far more work than necessary to fix this problem. b) through d) > are all possibilities but since this hasn't shown up before, I'd > conclude that all the contention this benchmark is seeing really is > centered right around i_flock. My hunch is that the best solution is > d), but it's possible that c) could actually provide "enough" > improvements to allow d) to be postponed. Unfortunately, c) may > introduce more troubles than it's worth, because in this particular > example, I suspect that i_flock is NOT read mostly, write occasionally. > Upgrading from a read to a write can't be done atomically so what you > may gain in performance you may lose in "supportability" as the code > grows in complexity. > > Both b) and c) cause serialization across every cpu in the system by > using a global lock, but d) would cause serialization *per inode* and > thus almost guarantee less contention. Assuming, of course, mucking > with the inode structure doesn't cause too many other ripples, which is > why I asked the question. Doing e) almost certainly puts it into the > 2.5 timespace, but not 100% certainly, I suppose. Before I dig too deep > into some test patches I thought I'd test the waters among the folks > here in LSE. > > It's good to hear that the inode is being redesigned for 2.5; a > spinlock (or two) which guards elements of the inode structure would be > very helpful in the new design. If there were one to usurp here I'd > include that in my options, but all we have is semaphores right now. > > Rick |
From: Matthew W. <wi...@de...> - 2001-11-16 20:44:03
|
> Rick Lindsley wrote: > > A file-lock-intensive benchmark brought to my attention that the BKL is > > currently used to guard i_flock. Without arguing about the merits of > > this particular benchmark, it seems to me that simply from inspection, > > replacing the BKL here would be a good thing. A per-inode spinlock > > would give better granularity than a global one which will cause > > blockage across the system on every lock attempt by any process. I've > > given some thought to how to improve on that, and come up with to No arguments that the BKL needs to be removed here. I did exactly that at one point. I want to know about a real-world application which makes heavy use of file locks and might conceivably have its performance negatively impacted by a spinlock. Apache actually doesn't count. It should be recompiled to not use file locks for synchronisation; with linux 2.4, the wake-one semantics on normal wait queues will give it better performance. > > Both b) and c) cause serialization across every cpu in the system by > > using a global lock, but d) would cause serialization *per inode* and > > thus almost guarantee less contention. Not necessarily. Consider the case of a database using range locks on a big file. Per inode has no benefits compared to global. > > why I asked the question. Doing e) almost certainly puts it into the > > 2.5 timespace, but not 100% certainly, I suppose. Before I dig too deep > > into some test patches I thought I'd test the waters among the folks > > here in LSE. Er, hello? 2.4 is supposed to be being STABILISED, not being REWRITTEN. If more people were concerned with this, we might be almost finished with 2.5 by now. I've been delaying my rewrite because of this. Seeing people like you ignore it really pisses me off. -- Revolutions do not require corporate support. |
From: Gerrit H. <ge...@us...> - 2001-11-16 21:49:26
|
Hi Matthew, Rick was not planning on rewriting the subsystem - he merely asked for advice on how to begin on this one. This particular problem showed up from a Netbench run on a 4-way. About 24% of the kernel time was from spinning on kernel_flag, with 10% of the total time specifically originating in posix_lock_file. I know you aren't looking for corporate support or anything ;-) but we __are__ trying to get Linux ready to sell on 4-cpu, 8-cpu, 16-cpu, 128-cpu, etc. machines sometime before the end of the next millennium. I know 2.5 will be ready to ship in 6 months, right? In the meantime, we have people who want Linux on their desktop, on the mail server, and on the corporate big systems. We are trying to run their workloads, simulators of their workloads, etc. to find bottlenecks and provide patches. Some of these are destined for 2.5/2.6 (e.g. 1-3 years from being included on a corporate server) but we are trying to find some smaller changes that give us a big bang for the buck. As an example, the LSE group has written & scavenged a set of about 10-15 patches thus far that improve the throughput on a large SMP machine by up to 6X for some workloads, and increasing idle time by nearly 8X. This use of kernel_flag was identified as yet another place where we can squeeze some more numbers, often in conjunction with other patches that we have in place. One thing that is becoming crystal clear is that while some patches may introduce only a moderate improvement on some workloads (e.g. the current MQ scheduler), when combined with other patches, the combined improvement is much greater than the sum of the parts. We are looking for generally small patches for 2.4 when possible (with the expectation that 2.5 will HAVE to be better ;-), hence Rick's request. Do you have any insight on how to approach this particular problem with an eye towards a solution that might easily be backported to 2.4 (and in the meantime, tested on 2.4)? gerrit > Er, hello? 
2.4 is suppoed to be being STABILISED, not being REWRITTEN. > If more people were concerned with this, we might be almost finished > with 2.5 by now. > > I've been delaying my rewrite because of this. Seeing people like you > ignore it really pisses me off. > > -- > Revolutions do not require corporate support. |
From: Matthew W. <wi...@de...> - 2001-11-16 22:06:37
|
On Fri, Nov 16, 2001 at 01:48:59PM -0800, Gerrit Huizenga wrote: > This particular problem showed up from a Netbench run on a 4-way. > About 24% of the kernel time was from spinning on kernel_flag, with > 10% of the total time specifically originating in posix_lock_file. That's great. Now, what application or class of applications does Netbench represent? I seriously want to fix file locking, but I want something to point at to justify changes. And a benchmark doesn't represent that unless it's a good simulator of a real world situation. > I know you aren't looking for corporate support or anything ;-) but > we __are__ trying to get Linux ready to sell on 4-cpu, 8-cpu, 16-cpu, > 128-cpu, etc. machines sometime before the end of the next millenia. I know, and I think you're going about it in the wrong way. I do support removing use of lock_kernel from locks.c and I'm still annoyed akpm added it back. > We are looking for generally small patches for 2.4 when possible (with > the expectation that 2.5 will HAVE to be better ;-), hence Rick's > request. > > Do you have any insight on how to approach this particular problem > with an eye towards a solution that might easily be backported to 2.4 > (and in the meantime, tested on 2.4)? I have done an analysis of where we can use a spinlock instead of the BKL; that's not on a system I can get access to for the next week, but I'll dig it up when I have a chance. It's a patch I would have submitted if I didn't feel so strongly about stabilising 2.4, and I didn't have largescale changes planned for 2.5. -- Revolutions do not require corporate support. |
From: Gerrit H. <ge...@us...> - 2001-11-16 22:55:02
|
> On Fri, Nov 16, 2001 at 01:48:59PM -0800, Gerrit Huizenga wrote: > > This particular problem showed up from a Netbench run on a 4-way. > > About 24% of the kernel time was from spinning on kernel_flag, with > > 10% of the total time specifically originating in posix_lock_file. > > That's great. Now, what application or class of applications does > Netbench represent? I seriously want to fix file locking, but I > want something to point at to justify changes. And a benchmark doesn't > represent that unless it's a good simulator of a real world situation. This was the basis for a set of filesystem throughput and performance tests. It turns out that journaling filesystems hit this harder. There are entire classes of custom high end applications that rely on user level files and filesystem throughput for their overall performance. I believe, for instance, BAAN was one major corporate app that did this, there are tons of others at the high end. Unfortunately, I don't know if we can disclose to you all of the individual proprietary or custom workloads that we hear about; the best we can do is try to roll up behaviors and analyse subsystems to improve those that are hottest. And no benchmark in the world is a good simulator of a real world situation. So are you presenting me with a no-win game here? Or would you like to tell me what applications the Fortune 1000 are using that involve Linux on N-way systems? We deal with customer situations where they won't even tell us, but they'll point us at some NDA numbers that point out a bad filesystem. We are trying to aggregate that into a smaller set of benchmarks when possible. > > I know you aren't looking for corporate support or anything ;-) but > > we __are__ trying to get Linux ready to sell on 4-cpu, 8-cpu, 16-cpu, > > 128-cpu, etc. machines sometime before the end of the next millenia. > > I know, and I think you're going about it in the wrong way. 
I do support > removing use of lock_kernel from locks.c and I'm still annoyed akpm > added it back. Okay, spell out for me what the right way is. I (and IBM and the LSE group in general) are more than willing to take community input (heck we've already taken an enormous amount of input, flak, support, reviews, flames, etc. ;-) but share your thoughts about how we should go about it the right way. I would love to know. Usually I ask that question and I get silence or some good comments suggesting that we *are* doing it the right way. > > We are looking for generally small patches for 2.4 when possible (with > > the expectation that 2.5 will HAVE to be better ;-), hence Rick's > > request. > > > > Do you have any insight on how to approach this particular problem > > with an eye towards a solution that might easily be backported to 2.4 > > (and in the meantime, tested on 2.4)? > > I have done an analysis of where we can use a spinlock instead of the BKL; > that's not on a system I can get access to for the next week, but I'll > dig it up when I have a chance. It's a patch I would have submitted > if I didn't feel so strongly about stabilising 2.4, and I didn't have > largescale changes planned for 2.5. If you make it available, we can include it with internal testing, which includes Cerberus testing, STP testing, it can be included in the LSE roll-up patch, it can be tested with Netbench on 2-4-8-way systems, etc. We are trying to bring a few resources to bear to help out in both making Linux more stable and making it scale better. > Revolutions do not require corporate support. The revolution is nearly over. We are about to become victims of our own success. gerrit |
From: Matthew W. <wi...@de...> - 2001-11-16 23:22:07
|
On Fri, Nov 16, 2001 at 02:54:37PM -0800, Gerrit Huizenga wrote: > And no benchmark in the world is a good simulator of a real world > situation. So are you presenting me with a no-win game here? Or > would you like to tell me what applications the Fortune 1000 are > using that involve Linux on N-way systems? We deal with customer > situations where they won't even tell us, but they'll point us at > some NDA numbers that point out a bad filesystem. We are trying > to aggregate that into a smaller set of benchmarks when possible. I'm not trying to present you with a no-win game. I want to know there's a real-world application whose performance on an N-cpu system is going to be measurably affected by the choices we make. > > I know, and I think you're going about it in the wrong way. I do support > > removing use of lock_kernel from locks.c and I'm still annoyed akpm > > added it back. > > Okay, spell out for me what the right way is. I (and IBM and the LSE > gorup in general) are more than willing to take community input (heck > we've already taken an enormous amount of input, flak, support, reviews, > flames, etc. ;-) but share your thoughts about how we should go about > it the right way. I would love to know. Usually I ask that question > and I get silence or some good comments suggesting that we *are* doing > it the right way. I think that scaling the kernel above 8 CPUs is the wrong approach. I prefer the ccCluster approach of providing a single system image running on top of tightly coupled 4-CPU nodes. Unfortunately, I have no time to work on this. > > Revolutions do not require corporate support. > > The revolution is nearly over. We are about to become victims > of our own success. I always find it interesting when people tell me in which of the 4 possible senses they take my sig :-) -- Revolutions do not require corporate support. |
From: Gerrit H. <ge...@us...> - 2001-11-16 23:45:32
|
Aha, the clusters argument. Good for some solutions, terrible for most real-world databases. You can argue for them but over the past 15-20 years, most attempts at doing clustered databases tend to fail. The intercommunication/sharing costs are way, way too high for shared database uses. Oracle Parallel Server once upon a time tried to be cluster aware. Customers found that it reeked as a solution. Throughput, failover costs, manageability, etc. just wouldn't work for those types of applications. Informix went heavily towards an MPP software model but they still rely on shared memory - and I haven't yet seen a message passing interface come close to what a major hardware vendor can do with a good hardware design for SMP. However, that's not to say that clustering is out and out wrong. IBM, for instance, is offering clustering packages (what was it, up to 10,000 nodes?) running Linux. Workloads need to be partitionable and not to have unreasonable update/synchronization requirements. Works well for replicated servers where all data can fit on a single server or be reasonably partitioned. Also works well for most scientific workloads. Doesn't work well for the corporate database that contains all of the inventory for the company, updated as sales happen and as new inventory is released, especially when that is a large store (Nordstrom's, Mervyn's, Burlington Coat) or a large chain of stores. People like Boeing need to keep track of all parts in real time - they were once using 32 processor systems in two node clusters with roughly a Terabyte of storage per node and they were hoping that would last for a couple of years, as of 3-4 years ago. However, for terabyte-per-second I/O to a shared database with thousands of users reading/updating simultaneously, with replication and data integrity (locking), a large, shared SMP tends to do much better. 
Also, common clustering solutions often share most of memory and would prefer a hardware/OS locking protocol to keep that memory synchronized; a large SMP machine does this much better than a cluster. > I'm not trying to present you with a no-win game. I want to know there's > a real-world application whose performance on an N-cpu system is going > to be measurably affected by the choices we make. Yes, there are. Lots of them. How do you want me to convince you of that? ;-) > > > Revolutions do not require corporate support. > > > > The revolution is nearly over. We are about to become victims > > of our own success. > > I always find it interesting when people tell me in which of the 4 possible > senses they take my sig :-) So have you ever shared *your* interpretation? <grin> gerrit |
From: Matthew W. <wi...@de...> - 2001-11-17 03:12:57
|
On Fri, Nov 16, 2001 at 03:45:11PM -0800, Gerrit Huizenga wrote: > > Aha, the clusters argument. Good for some solutions, terrible for > most real-world databases. You can argue for them but over the past You're talking about traditional clusters, not ccClusters. See Larry McVoy's papers on the subject. -- Revolutions do not require corporate support. |
From: Gerrit H. <ge...@us...> - 2001-11-17 19:09:59
|
Gotcha. What do you think the chances are of that being in 2.5, e.g. available to commercial usage within the next 2-3 years through a major distro? gerrit > On Fri, Nov 16, 2001 at 03:45:11PM -0800, Gerrit Huizenga wrote: > > > > Aha, the clusters argument. Good for some solutions, terrible for > > most real-world databases. You can argue for them but over the past > > You're talking about traditional clusters, not ccClusters. See > Larry McVoy's papers on the subject. > > -- > Revolutions do not require corporate support. > > |
From: Matthew W. <wi...@de...> - 2001-11-17 23:03:05
|
On Sat, Nov 17, 2001 at 11:07:36AM -0800, Gerrit Huizenga wrote: > > Gotcha. What do you think the chances are of that being in 2.5, > e.g. available to commercial usage within the next 2-3 years through > a major distro? Larry estimated 2 years to completion, but the plan he outlined produces a usable system for some workloads (ie the easy ones) in about 6 months. The genius of it is that it requires no controversial changes to the kernel -- it's almost all isolated to device drivers & a filesystem. -- Revolutions do not require corporate support. |
From: Anton B. <an...@sa...> - 2001-11-16 23:39:12
|
> This particular problem showed up from a Netbench run on a 4-way. > About 24% of the kernel time was from spinning on kernel_flag, with > 10% of the total time specifically originating in posix_lock_file. Since samba has to run on many platforms, by default it has to sacrifice some performance. For SMP locking tdb uses fcntl locks, which work on most OS's but stress the locking subsystem quite heavily. The real solution to your problem is not to rearchitect the locking system but to compile samba with the --spinlocks option, which replaces the fcntl locks with userspace spinlocks. Anton |
From: Gerrit H. <ge...@us...> - 2001-11-16 23:52:31
|
Good input, Anton - thanks. We'll see if we can recompile and test that way. Of course, that will move the bottleneck somewhere else, but that is fine. <grin> gerrit > > This particular problem showed up from a Netbench run on a 4-way. > > About 24% of the kernel time was from spinning on kernel_flag, with > > 10% of the total time specifically originating in posix_lock_file. > > Since samba has to run on many platforms by default it has to sacrifice > some performance. For SMP locking tdb uses fcntl locks which work on > most OS's but stress the locking subsystem quite heavily. > > The real solution to your problem is not to rearchitect the locking > system but to compile samba with the --spinlocks which replace the > fcntl locks with userspace spinlocks. > > Anton |
From: Anton B. <an...@sa...> - 2001-11-17 00:16:30
|
Hi Gerrit, > Good input, Anton - thanks. We'll see if we can recompile and > test that way. Of course, that will move the bottleneck somewhere > else, but that is fine. <grin> I forgot to mention, if this isn't sparc, ppc, intel or mips then you will need to write a small spinlock stub in source/tdb/spinlock.c. I'd be interested in the benchmark results you come up with. Are you interested only in kernel performance under load or the benchmark result as well? I have some old samba performance patches I never merged that I can pass on. Anton |
From: Andrew T. <hab...@us...> - 2001-11-18 23:29:43
|
> > Good input, Anton - thanks. We'll see if we can recompile and > > test that way. Of course, that will move the bottleneck somewhere > > else, but that is fine. <grin> > > I forgot to mention, if this isnt sparc, ppc, intel or mips then you > will need to write a small spinlock stub in source/tdb/spinlock.c. Anton, it's on an Intel 4-way in Austin; similar config you saw last summer. I am aware of the smb spinlock option, and I intend on running with it on Monday. I did run it before, and the results were maybe 1-2% better (ext2). I did not revisit this until recently, where lockmeter showed at least 2x more spin (than ext2) on kernel_flag from posix_lock_file() on ext3/jfs/reiserfs/xfs. If I can get the RAID back in order, I'll run on ext2 and ext3 with/without lockmeter. > I'd be interested in the benchmark results you come up with. Are you > interested only in kernel performance under load or the benchmark result > as well? I have some old samba performance patches I never merged that > I can pass on. Of course I'm always interested in the results :) I am running your sendfile patch, but didn't know you had others. I should be able to test anything you have... -Andrew |
From: Matthew W. <wi...@de...> - 2001-11-17 02:32:42
|
On Sat, Nov 17, 2001 at 10:32:41AM +1100, Anton Blanchard wrote: > Since samba has to run on many platforms by default it has to sacrifice > some performance. For SMP locking tdb uses fcntl locks which work on > most OS's but stress the locking subsystem quite heavily. > > The real solution to your problem is not to rearchitect the locking > system but to compile samba with the --spinlocks which replace the > fcntl locks with userspace spinlocks. And I believe (please correct me if I'm wrong) that this would not benefit from a per-inode spinlock rather than a global spinlock, since all the locking is on a single file. -- Revolutions do not require corporate support. |