Re: [PATCH -v2] use per cpu data for single cpu ipi calls

From: Ingo Molnar
Date: Thu Jan 29 2009 - 13:50:42 EST



* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> And are you really sure it cannot be called from within interrupts? I'm
> finding a lot of callers of smp_call_function_single(), and while I
> couldn't find any that look like interrupts, I also couldn't find any
> indication that it never happens.

smp_call_function_single() must not be called from irqs-off code - and
that implicitly includes most IRQ handlers as well. (most of which run
with hardirqs off)

We check for that via:

/* Can deadlock when called with interrupts disabled */
WARN_ON(irqs_disabled());

[ although it should probably be extended to '&& !in_interrupt()' as well,
to cover the subset of IRQ handlers running with irqs enabled - which is
small but nonzero. It would also cover softirqs. ]

There are 3 reasons why sending IPIs from an irqs off critical section is
a bad idea currently:

1) the implementation is deadlockable as you noted. That is self-inflicted
stupidity and fixable.

2) the _concept_ of two CPUs sending IPIs to each other at once and
expecting them to be handled and then waiting for them is a deadlock.
We could allow async IPIs from irqs-off sections, but isnt that a too
narrow special-case?

3) the third reason is nasty: the act of sending an IPI on x86 hardware
needs to happen with interrupts enabled. In the lowlevel IPI sending
code we first poll the hardware whether it's ready to accept new IPIs:
via apic_wait_icr_idle().

There've been incidents in the past where the ICR-ready (Interrupt
Command Register - ready) bit would deadlock the box in certain
situations (usually with wackier APIC implementations and if there's
too many IPIs 'in flight' and the only way to make progress is to allow
IPIs to happen locally - i.e. to have irqs on), if the IPI was sent
from an irqs-off section.

So i'm not sure we could relax things here without expecting too much of
the hardware.

What i like least about kernel/smp.c is the kmalloc(GFP_ATOMIC) - and we
had to add that in a late -rc for an essential bugfix.

IMO that kmalloc() has "layering violation" written all over it. It turns
the pure act of 'sending messages' into a conceptually heavy and wide
facility. Is the buffering of async messages really such a big deal in
practice? Does anyone have any numbers?

Btw., there _is_ a relatively atomic, safe and instantaneous way of
sending IPIs on x86: sending NMIs. But that has its own host of issues
both on the hardware and on the software side ...

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/