Re: [Linux v4.2] workqueue: llvmlinux: acpid: BUG: sleeping function called from invalid context at kernel/workqueue.c:2680

From: Jiri Kosina
Date: Thu Sep 24 2015 - 04:21:47 EST


On Thu, 24 Sep 2015, Sedat Dilek wrote:

> >> > >> [ 24.705767] [<ffffffff8149287d>] dump_stack+0x7d/0xa0
> >> > >> [ 24.705774] [<ffffffff810cbf7a>] ___might_sleep+0x28a/0x2a0
> >> > >> [ 24.705779] [<ffffffff810cbc7f>] __might_sleep+0x4f/0xc0
> >> > >> [ 24.705784] [<ffffffff810ae8ff>] start_flush_work+0x2f/0x290
> >> > >> [ 24.705789] [<ffffffff810ae8ac>] flush_work+0x5c/0x80
> >> > >> [ 24.705792] [<ffffffff810ae86a>] ? flush_work+0x1a/0x80
> >> > >> [ 24.705799] [<ffffffff810eddcd>] ? trace_hardirqs_off+0xd/0x10
> >> > >> [ 24.705804] [<ffffffff810ad938>] ? try_to_grab_pending+0x48/0x360
> >> > >> [ 24.705810] [<ffffffff81917e13>] ? _raw_spin_lock_irqsave+0x73/0x80
> >> > >> [ 24.705814] [<ffffffff810aecf9>] __cancel_work_timer+0x179/0x260
> >>
> >> This one is even more strange. It says that flush_work() is being called
> >> from __cancel_work_timer() with IRQs disabled, but flags are explicitly
> >> restored just one statement before that, and usbhid_close() explicitly
> >> calls cancel_work_sync() after unconditionally enabling interrupts.
> >>
> >> So I am not able to make any sense of either of the traces really.
> >>
> >> Are you seeing this with the same .config with GCC-compiled kernel as
> >> well?
> >
> > Actually could you please provide disassembly of your
> > __cancel_work_timer()?
> >
>
> Disassembly of which file - corresponding workqueue or hid file?

make kernel/workqueue.o
objdump -Dr kernel/workqueue.o

and copy/paste the output for the __cancel_work_timer function.

> > One explanation would be LLVM not considering local_irq_restore() a
> > compiler memory barrier, but I am pretty sure it'll expose much more
> > breakage if that'd be the case.
>
> Can you point me where I can find more informations about "compiler
> memory barrier" or explain in a few words if possible?

If the compiler did not treat the "memory" clobber (while disabling IRQs)
as a reordering barrier, it would not see any data dependency between
local_irq_restore(flags) and flush_work(data) and could reorder them,
resulting in flush_work() being called with IRQs disabled.
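
To illustrate the pattern, here is a rough userspace sketch. It is not the
kernel's actual code: fake_irqs_disabled, fake_local_irq_save(),
fake_local_irq_restore(), fake_flush_work() and run_demo() are hypothetical
stand-ins invented for this example, and it assumes GCC/Clang-style inline
asm. In the kernel the interrupt state lives in a CPU flags register that
the compiler cannot see, so the "memory" clobber on the asm statement is what
keeps surrounding code from being moved across it. Here a plain variable
plays that role, so the compiler would keep the order anyway; the barrier is
shown only to mark where it sits.

```c
#include <assert.h>

/* Compiler-only barrier: emits no instructions, but the "memory" clobber
 * tells the compiler not to move memory accesses across this point. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-in for the CPU interrupt flag: 1 = "IRQs disabled". */
static int fake_irqs_disabled;

static void fake_local_irq_save(void)
{
	fake_irqs_disabled = 1;
	barrier();		/* critical section must not leak above this */
}

static void fake_local_irq_restore(void)
{
	assert(fake_irqs_disabled);	/* must pair with a prior save */
	barrier();		/* critical section must not leak below this */
	fake_irqs_disabled = 0;
	barrier();		/* later code must not be hoisted above the restore */
}

/* Hypothetical stand-in for flush_work(): reports whether it was entered
 * with "IRQs disabled", which is what might_sleep() complains about. */
static int fake_flush_work(void)
{
	return fake_irqs_disabled;
}

/* Returns 0 if the flush ran with "IRQs enabled", 1 otherwise. */
int run_demo(void)
{
	fake_local_irq_save();
	/* ... critical section ... */
	fake_local_irq_restore();
	return fake_flush_work();
}
```

With the ordering preserved, run_demo() returns 0. If the flush were hoisted
above the restore, it would observe the flag still set, which is the
situation the trace seems to describe.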

--
Jiri Kosina
SUSE Labs
