Re: amusing SLUB compaction bug when CC_OPTIMIZE_FOR_SIZE

From: Vlastimil Babka
Date: Fri Nov 04 2022 - 11:58:04 EST


On 10/24/22 16:35, Vlastimil Babka wrote:
On 10/3/22 19:00, Matthew Wilcox wrote:
On Sun, Oct 02, 2022 at 02:48:02PM +0900, Hyeonggon Yoo wrote:
Just one more thing, rcu_leak_callback too. RCU seem to use it
internally to catch double call_rcu().

And some suggestions:
- what about adding runtime WARN() on slab init code to catch
unexpected arch/toolchain issues?
- instead of 4, we may use macro definition? like (PAGE_MAPPING_FLAGS + 1)?

I think the real problem here is that isolate_movable_page() is
insufficiently paranoid. Looking at the gyrations that GUP and the
page cache do to convince themselves that the page they got really is
the page they wanted, there are a few missing pieces (eg checking that
you actually got a refcount on _this_ page and not some random other
page you were temporarily part of a compound page with).

This patch does three things:

- Turns one of the comments into English. There are some others
which I'm still scratching my head over.
- Uses a folio to help distinguish which operations are being done
to the head vs the specific page (this is somewhat an abuse of the
folio concept, but it's acceptable)
- Add the aforementioned check that we're actually operating on the
page that we think we want to be.
- Add a check that the folio isn't secretly a slab.

We could put the slab check in PageMapping and call it after taking
the folio lock, but that seems pointless. It's the acquisition of
the refcount which stabilises the slab flag, not holding the lock.


I would like to have a working safe version in -next, even if we are able
simplify it later thanks to frozen refcounts. I've made a formal patch of
yours, but I'm still convinced the slab check needs to be more paranoid so
it can't observe a false positive __folio_test_movable() while missing the
folio_test_slab(), hence I added the barriers as in my previous attempt [1].
Does that work for you and can I add your S-o-b?

[1] https://lore.kernel.org/all/aec59f53-0e53-1736-5932-25407125d4d4@xxxxxxx/

To move on, I pushed a branch based on a new version of [1] above. It lacks Matthew's folio parts, which are not IMHO that critical right now, so can be added later.

It's here: https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git/log/?h=slab/for-6.2/fit_rcu_head

Will also send for formal review soon.