From: Jes S. <je...@sg...> - 2006-05-02 09:45:20
Matt Helsley wrote:
> On Mon, 2006-05-01 at 12:07 +0200, Jes Sorensen wrote:
> I think 2.i is the most reasonable. Scalability concerns should be
> measured to ensure we're not prematurely optimizing/complicating and, if
> measurements indicate the necessity, addressed as outlined above.
> Lastly, 2.i has better overhead characteristics. There are no
> differences in notification time overhead, while 2.ii has a significant
> difference in space overhead. If only one or the other kind of chain is
> in use 2.i uses less space, whereas 2.ii always uses more space. As I
> pointed out earlier, this difference in space could be the difference
> between fitting in cache (2.i) or not (2.ii) -- a definite performance
> impact, and one that's likely more important to those running small to
> medium numbers of processors.

Obviously 2.i has better overhead for your specific case because you are
so focused on memory consumption. However, the real case is that you are
likely to have maybe 2-3 notifiers per task, since the per-task notifier
chain also gets rid of a ton of rarely used per-task callouts that are
currently hard coded. If we are looking at performance, the cost of
bouncing additional semaphores around is much more interesting than
using an extra 50KB of memory on a real system.

In addition, the extra memory one needs to spend will be lost on hash
tables tracking which notification group a task belongs to when called
from the global callout mechanism. So when it comes to that, the memory
consumption is going to be the same or possibly worse using the global
mechanism.

So using that argument, 2.i is clearly not the most reasonable solution.
Or in other words, I think the real solution here is to provide both, as
it will lead to the lowest overhead and minimum memory consumption for
real, since we won't need to apply shoehorn measures to get from one
method to the other.

Anyway, smiley intended.
It's just proof that we are trying to solve two different problems.

> If it's an rwsem and you don't allow unregistration within the handler
> it would rarely, if ever, block on that semaphore because notification
> could be protected by the "read" side and register/unregister could be
> protected by the "write" side.

An rwsem is still going to result in nasty atomic bus operations. For
per-task notification it isn't going to be too heavy, but for a global
notification chain this means you are going to have a global semaphore
taken on each schedule call. That may not be too noticeable on a 2-way
box, but try it on a 16-way, or worse, a 512-way, and you'll see some
very unpleasant effects.

In short, if we add a global callout chain, then I think it's even more
important than for the per-task chain that the chain is RCU'd rather
than using a semaphore. The latter will just nuke the system.

Cheers,
Jes