Re: Linux 4.18-rc7

From: Amit Pundir
Date: Tue Jul 31 2018 - 02:40:46 EST


On Tue, 31 Jul 2018 at 09:55, John Stultz <john.stultz@xxxxxxxxxx> wrote:
>
> On Mon, Jul 30, 2018 at 8:26 PM, Hugh Dickins <hughd@xxxxxxxxxx> wrote:
> > On Mon, 30 Jul 2018, Linus Torvalds wrote:
> >> On Mon, Jul 30, 2018 at 2:53 PM Hugh Dickins <hughd@xxxxxxxxxx> wrote:
> >> >
> >> > I have no problem with reverting -rc7's vma_is_anonymous() series.
> >>
> >> I don't think we need to revert the whole series: I think the rest are
> >> all fairly obvious cleanups, and shouldn't really have any semantic
> >> changes.
> >
> > Okay.
> >
> >>
> >> It's literally only that last patch in the series that then changes
> >> that meaning of "vm_ops". And I don't really _mind_ that last step
> >> either, but since we don't know exactly what it was that it broke, and
> >> we're past rc7, I don't think we really have any option but the revert
> >> it.
> >
> > It took me a long time to grasp what was happening, that that last
> > patch bfd40eaff5ab was fixing. Not quite explained in the commit.
> >
> > I think it was that by mistakenly passing the vma_is_anonymous() test,
> > create_huge_pmd() gave the MAP_PRIVATE kcov mapping a THP (instead of
> > COWing pages from kcov); which the truncate then had to split, and in
> > going to do so, again hit the mistaken vma_is_anonymous() test, BUG.
> >
> >>
> >> And if we revert it, I think we need to just remove the
> >> VM_BUG_ON_VMA() that it was supposed to fix. Because I do think that
> >> it is quite likely that the real bug is that overzealous BUG_ON(),
> >> since I can't see any reason why anonymous mappings should be special
> >> there.
> >
> > Yes, that probably has to go: but it's not clear what state it leaves
> > us in, with an anon THP being split by a truncate, without the expected
> > locking; I don't remember offhand, probably a subtler bug than that BUG,
> > which you may or may not consider an improvement.
> >
> > I fear that Kirill has not missed inserting a vma_set_anonymous() from
> > somewhere that it should be, but rather that zygote is working with some
> > special mapping which used to satisfy vma_is_anonymous(), faults supplying
> > backing pages, but now comes out as !vma_is_anonymous(), so do_fault()
> > finds !dummy_vm_ops.fault hence SIGBUS.
>
> I've been only casually following this thread (mostly just glad Amit
> caught it and I could avoid having to bisect the issue in my own
> Android testing), but this bit starting to shake a few old cobwebs
> loose in my brain.
>
> I'm wondering if Zygote is utilizing ashmem here, and we're somehow
> traversing ashmem purged memory, or due to some setup issue the
> initial traverse isn't being zero-filled as expected?
>
> ashmem ranges are created using: shmem_file_setup() and shmem_zero_setup()
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/android/ashmem.c#n377
>
>
> If we purge pages, it punches it out with:
> vfs_fallocate(range->asma->file,
> FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
> start, end - start);
> here:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/android/ashmem.c#n447
>
> But in ashmem_pin(), we don't do anything other then returning if we
> purged any page in the range.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/android/ashmem.c#n577
>
> And I believe the future assumption is the if we traverse those pages
> they will be zero filled (if purged or even during the initial
> traversal after mmap)
>
> Its been a long time, and I've not looked at the code in question but
> it sounds from Hugh's comments above that we might instead get a
> SIGBUS here.
>
> Looking more at the problematic patch..
> Amit: Does adding something like (whitespace damaged, apologies):
>
> index a1a0025..1af6915 100644
> --- a/drivers/staging/android/ashmem.c
> +++ b/drivers/staging/android/ashmem.c
> @@ -402,7 +402,8 @@ static int ashmem_mmap(struct file *file, struct
> vm_area_struct *vma)
> fput(asma->file);
> goto out;
> }
> - }
> + } else
> + vma_set_anonymous(vma);
>
> if (vma->vm_file)
> fput(vma->vm_file);
>

This ashmem change ^^ worked too.

Regards,
Amit Pundir

>
> Seem to resolve it? (Sorry, I'd test it myself, but I'm away from my
> desk for the night).
> thanks
> -john