Re: bio linked list corruption.

From: Vegard Nossum
Date: Tue Dec 06 2016 - 03:42:16 EST


On 5 December 2016 at 22:33, Vegard Nossum <vegard.nossum@xxxxxxxxx> wrote:
> On 5 December 2016 at 21:35, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>> Note for Ingo and Peter: this patch has not been tested at all. But
>> Vegard did test an earlier patch of mine that just verified that yes,
>> the issue really was that wait queue entries remained on the wait
>> queue head just as we were about to return and free it.
>
> The second patch has been running for 1h+ without any problems of any
> kind. I should typically have seen 2 crashes by now. I'll let it run
> overnight to be sure.

Alright, so nearly 12 hours later I don't see either the new warning
or the original crash at all, so feel free to add:

Tested-by: Vegard Nossum <vegard.nossum@xxxxxxxxxx>

That said, all 8 of my VMs had panicked in some way due to OOMs (which
is new since v4.8), although some got page allocation stalls for >20s
and died because "khugepaged blocked for more than 120 seconds", while
others got "Out of memory and no killable processes".


Vegard