Re: BUG: corrupted list in freeary

From: Manfred Spraul
Date: Sat Dec 01 2018 - 15:35:44 EST


Hi Dmitry,

On 11/30/18 6:58 PM, Dmitry Vyukov wrote:
On Thu, Nov 29, 2018 at 9:13 AM, Manfred Spraul
<manfred@xxxxxxxxxxxxxxxx> wrote:
Hello together,

On 11/27/18 4:52 PM, syzbot wrote:

Hello,

syzbot found the following crash on:

HEAD commit: e195ca6cb6f2 Merge branch 'for-linus' of git://git.kernel...
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=10d3e6a3400000
[...]
Isn't this a kernel stack overrun?

RSP: 0x..83e008. Assuming 8 kB kernel stack, and 8 kB alignment, we have
used up everything.
I don't have an exact answer, that's just the kernel output that we captured
from console.

FWIW with KASAN stacks are 16K:
https://elixir.bootlin.com/linux/latest/source/arch/x86/include/asm/page_64_types.h#L10
Ok, thanks. And stack overrun detection is enabled as well -> a real stack overrun is unlikely.
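
For reference, the arithmetic behind the estimate can be sketched as below. This is only an illustration: it assumes the stack is aligned to its own size and grows downward, and it uses just the visible low bits of RSP (0x83e008), since the upper bits are elided in the report and do not affect the masking. The helper name bytes_left is made up for this sketch.

```python
# Sketch: how far is RSP from the lowest address of its stack?
# Assumption: the stack is aligned to its own (power-of-two) size and
# grows downward, so the low bits of RSP give the bytes still unused.

def bytes_left(rsp_low_bits: int, stack_size: int) -> int:
    """Return the offset of RSP above the stack base (lowest address)."""
    if stack_size & (stack_size - 1) != 0:
        raise ValueError("stack size must be a power of two")
    return rsp_low_bits & (stack_size - 1)

RSP_LOW = 0x83E008  # visible low bits only; upper bits are elided in the log

for size in (8 * 1024, 16 * 1024):
    left = bytes_left(RSP_LOW, size)
    print(f"{size // 1024} kB stack: {left} bytes left, {size - left} bytes used")
```

With an 8 kB stack only 8 bytes would be left, which is what prompted the question; with the 16 kB KASAN stack roughly half of the stack is still free.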
Well, generally everything except for kernel crashes is expected.

We actually sandbox it with memcg quite aggressively:
https://github.com/google/syzkaller/blob/master/executor/common_linux.h#L2159
But it seems to manage to either break the limits, or cause some
massive memory leaks. The nature of that is yet unknown.

Is it possible to start from that side?

Are there other syzkaller runs where the OOM killer triggers that much?


- Which stress tests are enabled? By chance, I found:

[ 433.304586] FAULT_INJECTION: forcing a failure.
[ 433.304586] name fail_page_alloc, interval 1, probability 0, space 0, times 0
[ 433.316471] CPU: 1 PID: 19653 Comm: syz-executor4 Not tainted 4.20.0-rc3+ #348
[ 433.323841] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
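
To see which fault-injection points fire in a captured console log, something like the following could be used. It is a rough sketch, not part of syzkaller: the function name and the regular expression are assumptions based only on the banner format quoted above, and other kernel versions may print it differently.

```python
import re
from collections import Counter

# Matches the parameter line that follows "FAULT_INJECTION: forcing a failure."
# \s+ before "times" tolerates the line wrap seen in captured console output.
BANNER = re.compile(
    r"\[\s*\d+\.\d+\]\s+name (?P<name>\w+), interval \d+, "
    r"probability \d+, space \d+,\s+times -?\d+"
)

def count_fault_injections(log: str) -> Counter:
    """Count forced failures per fault-injection name in a console log."""
    return Counter(m.group("name") for m in BANNER.finditer(log))

sample = (
    "[ 433.304586] FAULT_INJECTION: forcing a failure.\n"
    "[ 433.304586] name fail_page_alloc, interval 1, probability 0, space 0,\n"
    "times 0\n"
)

for name, hits in count_fault_injections(sample).items():
    print(f"{name}: {hits}")
```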

I need some more background, then I can review the code.
What exactly do you mean by "Which stress tests"?
Fault injection is enabled. Also random workload from userspace.


Right now, I would put it into my "unknown syzkaller finding" folder.

One more idea: Are there further syzkaller runs that end up with 0x010000 in a pointer?

From what I see, the sysv sem code that is used is trivial; I don't see how it could cause the observed behavior.


--

    Manfred