Re: bio linked list corruption.

From: Vegard Nossum
Date: Mon Dec 05 2016 - 15:11:06 EST


On 5 December 2016 at 20:11, Vegard Nossum <vegard.nossum@xxxxxxxxx> wrote:
> On 5 December 2016 at 18:55, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>> On Mon, Dec 5, 2016 at 9:09 AM, Vegard Nossum <vegard.nossum@xxxxxxxxx> wrote:
>> Since you apparently can recreate this fairly easily, how about trying
>> this stupid patch?
>>
>> NOTE! This is entirely untested. I may have screwed this up entirely.
>> You get the idea, though - just remove the wait queue head from the
>> list - the list entries stay around, but nothing points to the stack
>> entry (that we're going to free) any more.
>>
>> And add the warning to see if this actually ever triggers (and because
>> I'd like to see the callchain when it does, to see if it's another
>> waitqueue somewhere or what..)
>
> ------------[ cut here ]------------
> WARNING: CPU: 22 PID: 14012 at mm/shmem.c:2668 shmem_fallocate+0x9a7/0xac0
> Kernel panic - not syncing: panic_on_warn set ...

So I noticed that panic_on_warn just after sending the email and I've
been waiting for it it to trigger again.

The warning has triggered twice more without panic_on_warn set and I
haven't seen any crash yet.


Vegard