Re: linux-next: Tree for December 11

From: Alexey Zaytsev
Date: Wed Jan 07 2009 - 14:10:34 EST


On Wed, Jan 7, 2009 at 21:47, Ingo Molnar <mingo@xxxxxxx> wrote:
>
> * Alexey Zaytsev <alexey.zaytsev@xxxxxxxxx> wrote:
>
>> And last time I bisected, it pointed to:
>>
>> commit 7317d7b87edb41a9135e30be1ec3f7ef817c53dd
>> Author: Nick Piggin <nickpiggin@xxxxxxxxxxxx>
>> Date: Tue Sep 30 20:50:27 2008 +1000
>>
>> sched: improve preempt debugging
>>
>>
>> This patch helped me out with a problem I recently had....
>>
>> Basically, when the kernel lock is held, then preempt_count underflow
>> does not get detected until it is released which may be a long time
>> (and arbitrarily, eg at different points it may be rescheduled). If the
>> bkl is released at schedule, the resulting output is actually fairly
>> cryptic...
>>
>> With any other lock that elevates preempt_count, it is illegal to schedule
>> under it (which would get found pretty quickly). bkl allows scheduling with
>> preempt_count elevated, which makes underflows hard to debug.
>>
>> Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
>>
>> so at least a dumb bisection won't do here.
>
> ah, sorry for being a slow starter, i missed that bit - merge window
> attention span troubles ...
>
> I think the kernel_locked() check added here is plain buggy against IRQ
> contexts: we drop the BKL spinlock and reduce current->kernel_depth
> non-atomically.
>
> So kernel_locked() can become detached from the preempt_count().
>
> Nick, can you think of any better way of still saving this debug check, or
> should we revert it?
>
> Although it seems a bit weird how consistently you seem to be able to
> trigger it - as this seems to be a narrow race. Is there an IRQ storm
> there perhaps, or something widens things up for Qemu to inject an IRQ
> right there?

I'm not sure about the qemu case, but at least on my laptop it happens
somewhere along

arch/x86/kernel/cpu/bugs.c:
	printk(KERN_INFO "Checking 'hlt' instruction... ");
	if (!boot_cpu_data.hlt_works_ok) {
		printk("disabled\n");
		return;
	}
	halt();
	halt();
	halt();
	halt();
	printk("OK.\n");

where an interrupt has to arrive to wake the CPU from hlt, so it is
no surprise that I'm seeing this on every single boot. ;)