On Tuesday 22 August 2006 01:04, Jeff Dike wrote:
> On Sat, Aug 19, 2006 at 05:52:37PM +0200, Blaisorblade wrote:
> > Have you found a tool for this or just done by hand? (I've seen
> > Understanding the Linux VMM talking about tools for call graphs).
> I spent an afternoon doing it by hand.
> > After the networking locking patch, I'll recheck, but uml_net_open should
> > now be covered by a simple mutex so there's no problem.
> > I also think that most of them should become mutexes or disappear.
> > > Every procedure can call activate_fd under a spinlock is so marked.
> > >
> > > activate_fd
> > > um_request_irq
> > > line_setup_irq
> > > enable_chan
> > > line_open - SPINLOCK
> > This should be a mutex, but it's not very simple to do this since the tty
> > layer is not so nice as char/block/network layer, to my current
> > knowledge.
> > So, the other road is that you simply reduce spinlock holding time, or
> > that you merge this patch when I resend (I've done some changes).
> > Everything however applies to SMP only, luckily, for now; IRQ disabling
> > can cause hangs/races on UP kernels.
> > And, sadly, sigio_lock must become a irq disabling spinlock; so once, at
> > boot time, we'll have a call to sleeping kmalloc with disabled irqs, if
> > you don't merge this, and this call can crash.
> > > uml_net_open - struct net_device.open, SPINLOCK
> > Same stuff, converted in my patch about network (which I'll resend).
> > > write_sigio_irq
> > > write_sigio_workaround - SPINLOCK
> > > maybe_sigio_broken
> > > activate_fd
> > You deleted (in [PATCH 1/4] UML - SIGIO cleanups) a comment about turning
> > that spinlock into a semaphore - the comment talked about reactivate_fd,
> > however it is still a fact that maybe_sigio_broken() should be an
> > initcall.
> For reference, here is that comment.
> - /* Critical section - locked by a spinlock because this stuff can
> - * be changed from interrupt handlers. The stuff above is done
> - * outside the lock because it allocates memory.
> - */
> - /* Actually, it only looks like it can be called from interrupt
> - * context. The culprit is reactivate_fd, which calls
> - * maybe_sigio_broken, which calls write_sigio_workaround,
> - * which calls activate_fd. However, write_sigio_workaround should
> - * only be called once, at boot time. That would make it clear
> that - * this is called only from process context, and can be locked
> with - * a semaphore.
> - */
> Perhaps I was overzealous about removing that last sentence, but the rest
> of it deserves to go.
> I would just as soon leave maybe_sigio_broken as-is. It starts a new
> thread, which is kind of expensive if it's never going to be used.
I don't know, does it happen? I.e. is is possible that this thread is never
> On the other hand, maybe the code will be noticably cleaner (i.e. no
> spinlock protecting the have-I-been-called flag).
That spinlock is probably overkill - it is correct but something simpler
should be possible (not an atomic_t however if it is to be used on another
host thread - see my other mail "UniProcessor UML running on SMP host:
> > It should not - when the variable is first set in an initcall, you can
> > simply add a memory barrier to ensure visibility; maybe you do not even
> > need the barrier.
> There's no need for a memory barrier in that case - initcalls are
I know that. But if an initcall sets up an interrupt handler which is executed
on another CPU, how do you guarantee coherency?
> > I hope many of the following notes are wrong and that you can document it
> > and that I can understand my errors. However I cannot ignore my feelings
> > about code & data structures, which are truely redundant.
> > current_poll is maybe distinct since all SIGIO handling also
> > triggers a single IRQ (requested by write_sigio_irq), but that IRQ
> > simply consumes the wakeup from the fd and reactivates itself, so:
> > * I don't see its purpose: statistics? Triggering IRQs processing?
> Triggering IRQ processing - when one interrupt comes in, poll is used
> to check all active interrupt sources, so this suffices to handle
> whatever happened on any descriptor.
> > * This is somehow strange - and leads to the above recursion.
> Which recursion? The activate_fd -> activate-fd recursion? That's
> caused by the on-demand activation of the sigio thread, and would be
> eliminated by your proposal to make it an initcall. However, this
> doesn't have anything to do with the property that simply receiving
> this IRQ causes all pending I/O to be handled.
> > * I'd like you to write up _why_ you _actively_ build a model where
> > write_sigio_thread(), after there is activity on a single fd,
> > removes it from current_poll, while the irq handler almost always
> > calls reactivate_fd() to undo this; most drivers do it for each IRQ,
> > except for TELNETD_IRQ and XTERM_IRQ, which must be one-shot; but
> > they call free_irq, so this is not very useful for them, and anyway
> > they should request this behaviour with a flag if they need it (or
> > they can ignore subsequent irqs until irq freeing by setting their
> > own flags).
> The idea is to avoid subsequent IRQs on the same descriptor while
> input is being handled. The drivers will drain pending input anyway,
> so continuing to receive interrupts is just a waste of time. It might
> be a useful cleanup to move the reactivation out of the drivers to a
> common place.
Ok, new interrupts on that descriptor should not be generated while the IRQ is
active. Nice point (and something on real hardware is hard to do - NetPoll on
Gigabit eth NICs is all about this).
> > In particular (and a comment patch is preferred to a mail) why
> > arch/um/os-Linux/sigio.c has a list of fds to poll, i.e. current_poll,
> > and arch/um/os-Linux/irq.c has another one, i.e. pollfds, and there is
> > finally active_fds used for irq too; also if current_poll excludes just
> > the "SIGIO write" irq it is also redundant.
> I'll send mail anyway, and work up a patch later :-)
> current_poll is used by the sigio thread, and is the set of
> descriptors for which SIGIO doesn't work. It's a subset of
> pollfds is forced on us by the host kernel, but there is not enough
> information in it, i.e. no dev_id that can be passed to the IRQ
> So, active_fds is the full set of interrupt source information, from
> which pollfds is calculated.
> > arch/um/os-Linux/irq.c:os_free_irq_by_cb, line 94 reads as:
> > printk("os_free_irq_by_cb - mismatch between "
> > "active_fds and pollfds, fd %d vs %d\n",
> > which can be probably considered as a sign of non-normalization (in DB
> > parlance) of data structure.
> Normalization means something different to me - when there are
> multiple representation of something (like 1/2, 2/4, etc), you pick
> one, and that's the normalized representation. I don't if there's an
> official term for what the printk is talking about - I would say
> they're out of sync, or inconsistent, or incoherent.
I'm talking (informally and imprecisely) about what's (formally) called
Boyce-Codd normal form of a database scheme - there are further properties
but avoiding duplication is one.
> In any case, that's why there is a complaint if is a mismatch between
> > active_fds could probably be a set of pointers to pollfds array -
> > there should theoretically be a single array holding everything and
> > active_fds pointing to it.
> Well, you could do that. There's a choice between
> two separate arrays holding different sets of information
> about the descriptors - the pollfds holding most of it, and something
> else holding the dev_ids
> a unified structure from which the pollfds are constructed -
> and thus the pollfds are redundant.
> I don't see a strong reason to prefer one over the other. What we have
> now seems a bit cleaner to me.
The flow isn't strictly "unified array"->pollfds at once, it's "initial
calculation" and then "propagate updates". And you do message passing (with
synchronization on pipes) to synchronize them. At least some "shared memory
communication" scheme can be found - I'm thinking that instead of sending
messages one thread prepares the updated structure. This cannot be done until
the sigio thread clears entries in the array.
> > Failing that (since the pollfds array must be such to be used by poll), I
> > do not see why active_fds must contain fds instead of relying on ones in
> > pollfds.
> I think this is a personal preference - I like a unified
> representation where I can find everything I want, at the expense of
> some memory. There's no reason it can't be done the way you say - I
> just find it a bit clumsier.
It is possible that in this case this is the best choice, but I'm worried
about possible incoherence in data structures, and about the continue
recalculation (not about the overhead, about the thing per se).
Talking in general (I must still check this in the code), if you cause
yourself headaches, add locks, additional checks to maintain coherence, while
you can unify data structures, then you can change your data structures with
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade
Chiacchiera con i tuoi amici in tempo reale!