Re: smp_call_function_single lockups

From: Ingo Molnar
Date: Fri Feb 20 2015 - 14:41:24 EST



* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Fri, Feb 20, 2015 at 1:30 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
> >
> > So if my memory serves me right, I think it was for
> > local APICs, and even there mostly it was a performance
> > issue: if an IO-APIC sent more than 2 IRQs per 'level'
> > to a local APIC then the IO-APIC might be forced to
> > resend those IRQs, leading to excessive message traffic
> > on the relevant hardware bus.
>
> Hmm. I have a distinct memory of interrupts actually
> being lost, but I really can't find anything to support
> that memory, so it's probably some drug-induced confusion
> of mine. I don't find *anything* about interrupt "levels"
> any more in modern Intel documentation on the APIC, but
> maybe I missed something. But it might all have been an
> IO-APIC thing.

So I just found an older discussion of it:

http://www.gossamer-threads.com/lists/linux/kernel/1554815?do=post_view_threaded#1554815

While it's not a comprehensive description, it matches what
I remember from it: with 3 vectors within a level of 16
vectors we'd get excessive "retries" sent by the IO-APIC
through the (then rather slow) APIC bus.

( It was possible for the same phenomenon to occur with
IPIs as well, when a CPU sent an APIC message to another
CPU, if the affected vectors were equal modulo 16 - but
this was rare IIRC because most systems were dual CPU so
only two IPIs could have occurred. )
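
To make the "level" arithmetic concrete, here is a rough user-space
sketch (not kernel code). It assumes a level is the block of 16
consecutive vectors sharing the same upper four bits, i.e. vector / 16,
and it takes the limit of 2 vectors per level from the description
above. The names vector_level() and count_crowded_levels() are made up
for illustration only.

```c
#include <assert.h>
#include <stddef.h>

#define VECTORS_PER_LEVEL 16  /* a "level" spans 16 consecutive vectors */
#define NR_LEVELS         16  /* 256 vectors / 16 per level */
#define MAX_PER_LEVEL      2  /* assumption from the text: a 3rd vector causes retries */

/* Hypothetical helper: the level a vector belongs to (its upper four bits). */
static unsigned int vector_level(unsigned char vector)
{
	return vector / VECTORS_PER_LEVEL;
}

/* Count the levels that hold more than MAX_PER_LEVEL of the given vectors. */
static int count_crowded_levels(const unsigned char *vectors, size_t n)
{
	unsigned int per_level[NR_LEVELS] = { 0 };
	int crowded = 0;

	for (size_t i = 0; i < n; i++) {
		unsigned int level = vector_level(vectors[i]);

		assert(level < NR_LEVELS);
		/* Count each level once, at the moment it first exceeds the limit. */
		if (++per_level[level] == MAX_PER_LEVEL + 1)
			crowded++;
	}
	return crowded;
}
```

With this model, vectors 0x31, 0x32 and 0x33 all land in level 3 and
would be the problematic case, while 0x31, 0x41 and 0x51 are spread
over three levels and would not.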

> Well, the attached patch for that seems pretty trivial.
> And seems to work for me (my machine also defaults to
> x2apic clustered mode), and allows the APIC code to start
> doing a "send to specific cpu" thing one by one, since it
> falls back to the send_IPI_mask() function if no
> individual CPU IPI function exists.
>
> NOTE! There's a few cases in
> arch/x86/kernel/apic/vector.c that also do that
> "apic->send_IPI_mask(cpumask_of(i), .." thing, but they
> aren't that important, so I didn't bother with them.
>
> NOTE2! I've tested this, and it seems to work, but maybe
> there is something seriously wrong. I skipped the
> "disable interrupts" part when doing the "send_IPI", for
> example, because I think it's entirely unnecessary for
> that case. But this has certainly *not* gotten any real
> stress-testing.

I'm not so sure about that aspect: I think disabling IRQs
might be necessary with some APICs (if lower levels don't
disable IRQs), to make sure the 'local APIC busy' bit isn't
set:

we typically do a wait_icr_idle() call before sending an
IPI - and if IRQs are not off then the idleness of the APIC
might be gone. (A hardirq that arrives after the
wait_icr_idle() but before the actual IPI is sent could itself
send out an IPI, leaving the queue full.)

So the IPI sending should be atomic in that sense.
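
A toy user-space model of that window, purely as a sketch: nothing
here is real APIC or kernel code, all toy_* names are invented, and
the single 'icr_busy' flag is a simplifying assumption standing in for
the delivery-status bit. It only shows why the wait and the write want
to happen with IRQs off.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU's local APIC state (illustrative only). */
struct toy_apic {
	bool icr_busy;      /* stands in for the 'local APIC busy' bit */
	bool irqs_enabled;  /* stands in for the CPU's interrupt flag */
	bool irq_pending;   /* a hardirq is waiting to be taken */
	int  clobbered;     /* ICR writes done while the busy bit was set */
};

static void toy_wait_icr_idle(struct toy_apic *a)
{
	a->icr_busy = false;  /* the real code spins until the bit clears */
}

static void toy_write_icr(struct toy_apic *a)
{
	if (a->icr_busy)
		a->clobbered++;   /* we wrote although the APIC was not idle */
	a->icr_busy = true;
}

/* A hardirq handler that itself sends an IPI, done correctly on its own. */
static void toy_hardirq(struct toy_apic *a)
{
	assert(a->irqs_enabled);  /* handlers only run with IRQs enabled */
	a->irq_pending = false;
	toy_wait_icr_idle(a);
	toy_write_icr(a);
}

/* A point in the instruction stream where a pending hardirq can be taken. */
static void toy_irq_window(struct toy_apic *a)
{
	if (a->irqs_enabled && a->irq_pending)
		toy_hardirq(a);
}

static void toy_send_ipi(struct toy_apic *a, bool disable_irqs)
{
	bool saved = a->irqs_enabled;

	if (disable_irqs)
		a->irqs_enabled = false;  /* analogous to local_irq_save() */
	toy_wait_icr_idle(a);
	toy_irq_window(a);            /* the window between wait and write */
	toy_write_icr(a);
	a->irqs_enabled = saved;      /* analogous to local_irq_restore() */
	toy_irq_window(a);            /* a deferred hardirq runs only now */
}

/* Run one send with a hardirq pending; return the number of bad writes. */
static int toy_run(bool disable_irqs)
{
	struct toy_apic a = { false, true, true, 0 };

	toy_send_ipi(&a, disable_irqs);
	return a.clobbered;
}
```

In this model toy_run(false) ends with one write to a busy ICR, while
toy_run(true) ends with none, because the pending hardirq is only taken
after the sender's own write has gone out.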

Thanks,

Ingo