Andrew, generally I think a good patch. The problem you are
describing has been happening occasionally in some of our
benchmarking.
In a few cases, we've upgraded a lock from spin_lock to spin_lock_irq
to avoid the problem. We've also considered replacing the spin_lock
body with the code from spin_lock_irq and making spin_lock_irq be
a macro defined to spin_lock (make all spin locks ignore interrupts).
However, this means that when a device raises an interrupt, [on Intel] the
IO APIC will select a CPU to deliver that interrupt to. With interrupts
disabled at the CPU, the device interrupt will remain pending until
the CPU returns from the lock, delaying the delivery of the interrupt
(increasing the latency of interrupt delivery).
One solution that we used in a past OS was to actually mark the CPUs
IO APIC as "unwilling to handle an interrupt" so that the IO APIC
would select a different CPU to handle the device interrupt.
We have that patch ported to 2.4.X (6, 9, 14?) and it turns out that
it is going to be very important for Foster [on the Pentium III and
lower IO APICs, the CPU is chosen in a "round robin" fashion. In
Foster/Pentium 4, it appears that all interrupts are deliverd only
to CPU 0 by the IO APIC]. This would *seem* to be a better solution
to the general problem (it goes even further than your current
proposed solution).
However, in some benchmarking, we haven't found as much of an increase
as expected. And, it turns out that IRQ binding and CPU binding have
a greater effect because of cache effects, then simply reducing the
latency. However, in principle, it seems that this approach would be
a better general approach.
Are you at all interested in evaluating the patch yourself? We are
currently wondering if we've had some oversight in the patch that
keeps the results from being as good as we expected - the goal should
be reduced interrupt delivery latency, which should improve network
and disk IO throughput/responsiveness.
If so, Dave Olien @ IBM can supply the patch. I may also consider
including your current patch in the LSE roll up patch that we are
working on as I would expect that it has some measurable benefit.
I'll also see if our performance team has the bandwidth to do some
isolated testing on this. Would you be able to recommend any particular
benchmark you'd like to see this tested with?
gerrit
> If a CPU take an interrupt while holding a contended spinlock,
> the cost of the interrupt is borne not only by the CPU which takes
> it, but also by all the CPUs which are waiting for the lock.
>
> So here's an x86 patch which disables cpu-local interrupts
> whenever that CPU is holding a spinlock.
>
> I don't have enough CPUs to measure its effects, but perhaps
> someone who has an 8-way may be interested in trying it with
> a lock- and interrupt-intensive workload, see what happens?
>
>
> --- linux-2.4.15-pre5/include/asm-i386/spinlock.h Mon Nov 5 21:01:12 2001
> +++ linux-akpm/include/asm-i386/spinlock.h Fri Nov 16 18:58:49 2001
> @@ -25,6 +25,7 @@ extern int printk(const char * fmt, ...)
>
> typedef struct {
> volatile unsigned int lock;
> + unsigned long flags;
> #if SPINLOCK_DEBUG
> unsigned magic;
> #endif
> @@ -79,15 +80,18 @@ typedef struct {
>
> static inline void spin_unlock(spinlock_t *lock)
> {
> + unsigned long flags;
> #if SPINLOCK_DEBUG
> if (lock->magic != SPINLOCK_MAGIC)
> BUG();
> if (!spin_is_locked(lock))
> BUG();
> #endif
> + flags = lock->flags;
> __asm__ __volatile__(
> spin_unlock_string
> );
> + local_irq_restore(flags);
> }
>
> #else
> @@ -99,6 +103,7 @@ static inline void spin_unlock(spinlock_
>
> static inline void spin_unlock(spinlock_t *lock)
> {
> + unsigned long flags;
> char oldval = 1;
> #if SPINLOCK_DEBUG
> if (lock->magic != SPINLOCK_MAGIC)
> @@ -106,9 +111,11 @@ static inline void spin_unlock(spinlock_
> if (!spin_is_locked(lock))
> BUG();
> #endif
> + flags = lock->flags;
> __asm__ __volatile__(
> spin_unlock_string
> );
> + local_irq_restore(flags);
> }
>
> #endif
> @@ -116,11 +123,15 @@ static inline void spin_unlock(spinlock_
> static inline int spin_trylock(spinlock_t *lock)
> {
> char oldval;
> + int ret;
> __asm__ __volatile__(
> "xchgb %b0,%1"
> :"=q" (oldval), "=m" (lock->lock)
> :"0" (0) : "memory");
> - return oldval > 0;
> + ret = (oldval > 0);
> + if (ret)
> + local_irq_save(lock->flags);
> + return ret;
> }
>
> static inline void spin_lock(spinlock_t *lock)
> @@ -136,6 +147,7 @@ printk("eip: %p\n", &&here);
> __asm__ __volatile__(
> spin_lock_string
> :"=m" (lock->lock) : : "memory");
> + local_irq_save(lock->flags);
> }
>
> _______________________________________________
> Lse-tech mailing list
> Lse-tech@...
> https://lists.sourceforge.net/lists/listinfo/lse-tech
>
>
|