Re: WARNING: at arch/x86_64/kernel/smp.c:397 smp_call_function_mask()

From: Laurent Vivier
Date: Fri Sep 28 2007 - 08:08:24 EST


Andrew Morton wrote:
> On Fri, 28 Sep 2007 11:18:45 +0200 Laurent Vivier <Laurent.Vivier@xxxxxxxx> wrote:
>
>> Andrew Morton wrote:
>>> On Fri, 28 Sep 2007 10:52:08 +0200 Laurent Vivier <Laurent.Vivier@xxxxxxxx> wrote:
>>>> Andi, is this correct ?
>>>> Andrew, should I send a patch implementing this change ?
>>> umm, I think all the smp_call_function fucntions are deadlocky if called
>>> with local interrupts disabled, regardless of whether the calling CPU is in
>>> the mask.
>>>
>>> If CPU A is sending a cross-cpu call to CPU B and CPU B is sending a
>>> cross-cpu call to CPU A, and they both have local interrupts disabled...
>> OK, so there are two errors:
>>
>> 1- one I introduce myself (without any help from anyone) where
>> smp_call_function() calls all online CPUs instead of calling all CPUs except itself.
>
> I'd be pretty surprised if one was able to write a bug like that. You mean
> the CPU sends an IPI to itself and then loops around until it has serviced
> that IPI? And this works? Wow.

In fact, it works because __smp_call_function_mask() makes :

cpus_and(mask, mask, allbutself);

So it removes current cpu from the mask. BTW, I don't have to modify
smp_call_function(): it is correct as it is written (except from a semantic
point of view... do we care about semantic ?).

> And on_each_cpu() can call the handler function twice?
>
> hm
>
>> 2- one in global_flush_tlb() which calls smp_call_function() with irqs disabled.
>
> That would be a big bug, and surely we would already have picked it up.
>
> <looks>
>
> argh, mainline's x86_64 smp_call_function() doesn't do the check. We've
> had *heaps* of bugs in i386 where people were running smp_call_foo() under
> local_irq_disable(). I wonder how many there are in x86_64?
>
>> I think I should at least correct #1 ?
>
> I think we should correct all bugs ;)
>

We don't live in a perfect world... ;-)

Moreover, I'm not able to reproduce #2 ...

Fengguang could you send me your .config and your qemu command line parameters ?

Laurent

PS: if semantic is important, you can apply attached patch...
--
------------- Laurent.Vivier@xxxxxxxx --------------
"Software is hard" - Donald Knuth

Attachment: 0001-smp_call_function-call-function-on-all-CPUs.patch
Description: application/mbox

Attachment: signature.asc
Description: OpenPGP digital signature