Re: [PATCH] fix fault_in_multipages_...() on architectures with no-op access_ok()

From: Al Viro
Date: Tue Sep 20 2016 - 20:39:38 EST


On Tue, Sep 20, 2016 at 04:43:18PM -0700, Linus Torvalds wrote:
> [ Sorry, AFK there for a while ]
>
> On Tue, Sep 20, 2016 at 2:03 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > What proper mappings? If you manage to mmap something at (void*)-PAGE_SIZE,
> > you are very deep in trouble regardless. We use IS_ERR() on userland
> > pointers and any architecture where that would be possible would be confused
> > as hell.
>
> Ack, you've convinced me. Will apply the patch.

BTW, folks, there's an interesting part of that story in commit message
of af2e84:
-----------------------------------------------------------------------
pagemap.h: fix warning about possibly used before init var

Commit f56f821feb7b ("mm: extend prefault helpers to fault in more than
PAGE_SIZE") added in the new functions: fault_in_multipages_writeable()
and fault_in_multipages_readable().

However, we currently see:

include/linux/pagemap.h:492: warning: 'ret' may be used uninitialized in this function
include/linux/pagemap.h:492: note: 'ret' was declared here

Unlike a lot of gcc nags, this one appears somewhat legit. i.e. passing
in an invalid negative value of "size" does make it look like all the
conditionals in there would be bypassed and the uninitialized value would
be returned.
-----------------------------------------------------------------------

It wasn't just "somewhat" legit - it was entirely accurate and pointing to
real bug. The real reason why the loop could be bypassed wasn't the negative
'size' (that really couldn't happen), it was the possibility of wraparound.
Which had been masked on most of the architectures by checks that had been
pretty far upstream from that function (entire range passing access_ok())
and architecture-dependent on top of that. Some architectures would always
prevent wraparounds in access_ok() (x86, for example), some would only do
so outside of set_fd(KERNEL_DS) (where the caller is responsible for validity
of addresses and lengths) and some did not prevent wraparounds on access_ok()
level at all. IOW, it was OK on a lot of architectures, but only for rather
subtle reasons. Making ret initialized to 0 did make cc(1) STFU, but didn't
fix the bug...

It was used only in intel-specific drivers (i915 is x86-only), arm-specific
one (armada) and in NTFS; the last one isn't arch-specific, but I doubt that
anyone has really mounted NTFS on e.g. sparc, let alone hit it with LTP or
fuzzing there. Multipage path in BTRFS write would benefit from it, and
BTRFS probably did get tested on sparc and/or s390, but BTRFS kept using
plain fault-in (and paid for that with more frequent fallbacks to singlepage
codepath). So it went unnoticed, until it did show up on a path from
generic_perform_write()...