Re: BUG: corrupted list in cpu_stop_queue_work

From: Matthew Wilcox
Date: Mon Jul 09 2018 - 09:32:20 EST


On Mon, Jul 09, 2018 at 09:55:17PM +0900, Tetsuo Handa wrote:
> Hello Matthew,
>
> It seems to me that there are other locations which do not check xas_store()
> failure. Is that really OK? If they are OK, I think we want a comment like
> /* This never fails. */ or /* Failure is OK because ... */
> for each call without failure check.

Good grief, no, I'm not adding a comment to all 50 calls to
xas_store(). Here are some rules:

 - xas_store(NULL) cannot fail.
 - xas_store(p) cannot fail if we know something was already in
   that slot beforehand (i.e. a replace operation).
 - xas_store(p) cannot fail if xas_create_range() was previously
   successful.
 - xas_store(p) can fail, but it's OK if the only things after that are
   other xas_*() calls, because every xas_*() call checks xas_error().
   So this is fine:

	do {
		xas_store(&xas, p);
		xas_set_tag(&xas, XA_TAG_0);
	} while (xas_nomem(&xas, GFP_KERNEL));
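
To illustrate why the loop above is safe, here is a minimal userspace
sketch of the "sticky error" pattern. It is a toy model, not the kernel
XArray code: struct toy_state and every toy_*() function are
hypothetical names invented for this example, and the allocation
failure is simulated with a flag. Each operation returns early when an
error is already recorded, so a failed store turns the following calls
into no-ops until the retry helper clears the error.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; these names are hypothetical, not the kernel API. */
struct toy_state {
	int err;	/* sticky error, 0 while healthy */
	bool have_mem;	/* simulated preallocated node */
	void *slot;	/* the single entry being stored */
	bool tag;	/* the single tag bit */
};

/* Store p; needs "memory" only when filling a previously empty slot. */
static void toy_store(struct toy_state *s, void *p)
{
	if (s->err)
		return;		/* earlier failure: do nothing */
	if (p && !s->slot && !s->have_mem) {
		s->err = -ENOMEM;	/* simulated allocation failure */
		return;
	}
	s->slot = p;
}

/* Tag the entry; also a no-op once an error is recorded. */
static void toy_set_tag(struct toy_state *s)
{
	if (s->err)
		return;
	if (s->slot)
		s->tag = true;
}

/* On -ENOMEM, pretend to allocate, clear the error, ask for a retry. */
static bool toy_nomem(struct toy_state *s)
{
	if (s->err != -ENOMEM)
		return false;
	s->have_mem = true;
	s->err = 0;
	return true;
}

/* Same shape as the loop above; returns the number of passes taken. */
static int toy_store_tagged(struct toy_state *s, void *p)
{
	int passes = 0;

	assert(s != NULL);
	do {
		passes++;
		toy_store(s, p);
		toy_set_tag(s);
	} while (toy_nomem(s));
	return passes;
}
```

In this model the first pass fails in toy_store(), toy_set_tag() sees
the recorded error and leaves the tag alone, and the second pass
completes both steps. Storing NULL and replacing an occupied slot never
reach the simulated allocation, which mirrors the first two rules.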

> From d6f24d6eecd79836502527624f8086f4e3e4c331 Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> Date: Mon, 9 Jul 2018 15:58:44 +0900
> Subject: [PATCH] shmem: Fix crash upon xas_store() failure.
>
> syzbot is reporting list corruption [1]. This is because xas_store() from
> shmem_add_to_page_cache() is not handling memory allocation failure. Fix
> this by checking xas_error() after xas_store().

I have no idea why you wrote this patch on Monday when I already said
I knew what the problem was on Friday, fixed the problem and pushed it
out to my git tree on Saturday.