Re: [PATCH] x86: fix PAE pmd_bad bootup warning

From: Nishanth Aravamudan
Date: Thu May 08 2008 - 12:07:48 EST


On 06.05.2008 [21:30:44 +0100], Hugh Dickins wrote:
> On Tue, 6 May 2008, Linus Torvalds wrote:
> > On Tue, 6 May 2008, Hugh Dickins wrote:
> > >
> > > Fix Hans' good observation that follow_page() will never find
> > > pmd_huge() because that would have already failed the pmd_bad
> > > test: test pmd_huge in between the pmd_none and pmd_bad tests.
> > > Tighten x86's pmd_huge() check? No, once it's a hugepage entry,
> > > it can get quite far from a good pmd: for example, PROT_NONE
> > > leaves it with only ACCESSED of the KERN_PGTABLE bits.
> >
> > I'd much rather have pmd_bad() etc fixed up instead, so that they do
> > a more proper test (not thinking that a PSE page is bad, since it
> > clearly isn't). And then, make them dependent on DEBUG_VM, because
> > doing the proper test will be more expensive.
>
> But everywhere we use pmd_bad() etc (most are hidden inside
> pmd_none_or_clear_bad() etc) we are expecting never to encounter
> a pmd_huge, unless there's corruption. follow_page() is the one
> exception, and even in its case I can't find a current user that
> actually could meet a hugepage. I'd rather tighten up pmd_bad
> (in the PAE and x86_64 cases), than weaken it so far as to let
> hugepages slip through. Not that pmd_bad often catches anything:
> just coincidentally that 90909090 one today.

There is one case that seems to be the source of Hans' problem, as Dave
has figured out: /proc/pid/pagemap, where we walk the pagetables fairly
straightforwardly.

In there, we unconditionally call pmd_none_or_clear_bad(pmd). And any
userspace process that maps hugepages and then reads in
/proc/pid/pagemap will invoke that path, I think (at least with 2M
pages).

So I agree, you're fixing a potential issue in follow_page() [might
deserve a comment, so someone doesn't go and combine them back later?],
but Hans' issue is most likely related to the pagemap code?
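
To make the two paths concrete, here is a rough user-space model (not
kernel code, just a sketch). The flag values are the usual x86 ones, but
FLAG_MASK, classify_huge_first() and model_none_or_clear_bad() are names
I made up for illustration, and the model ignores the NX bit and anything
else above the low 12 bits. It shows why a hugepage pmd trips pmd_bad(),
why follow_page() wants the pmd_huge() test first, and what an
unconditional pmd_none_or_clear_bad() would do to such an entry:

```c
/* User-space model of the pmd tests under discussion; not kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t pmdval_t;

/* x86 page-table flag bits. */
#define _PAGE_PRESENT  0x001ULL
#define _PAGE_RW       0x002ULL
#define _PAGE_USER     0x004ULL
#define _PAGE_ACCESSED 0x020ULL
#define _PAGE_DIRTY    0x040ULL
#define _PAGE_PSE      0x080ULL
#define _KERNPG_TABLE  (_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED | _PAGE_DIRTY)

/* Assumption for this model: only the low 12 bits carry flags. */
#define FLAG_MASK      0xfffULL

static bool pmd_none(pmdval_t v) { return v == 0; }
static bool pmd_huge(pmdval_t v) { return (v & _PAGE_PSE) != 0; }

/* "Bad" means the flags (ignoring USER) are not those of a page-table pointer. */
static bool pmd_bad(pmdval_t v)
{
	return (v & FLAG_MASK & ~_PAGE_USER) != _KERNPG_TABLE;
}

enum pmd_kind { PMD_NONE, PMD_HUGE, PMD_BAD, PMD_TABLE };

/* Hypothetical helper: the none -> huge -> bad order proposed for follow_page(). */
static enum pmd_kind classify_huge_first(pmdval_t v)
{
	if (pmd_none(v))
		return PMD_NONE;
	if (pmd_huge(v))
		return PMD_HUGE;	/* must come before pmd_bad() to be reachable */
	if (pmd_bad(v))
		return PMD_BAD;
	return PMD_TABLE;
}

/*
 * Hypothetical model of pmd_none_or_clear_bad(): returns true when the
 * caller should skip this entry, clearing it first if it looks bad.
 */
static bool model_none_or_clear_bad(pmdval_t *pmd)
{
	assert(pmd != NULL);
	if (pmd_none(*pmd))
		return true;
	if (pmd_bad(*pmd)) {
		*pmd = 0;	/* a hugepage entry would be wiped here */
		return true;
	}
	return false;
}
```

In this model a 2M entry with PSE set is "bad" by the flag test alone, so
the huge check is only reachable if it comes first, and a walker that
calls the clear_bad variant without checking for hugepages would clear it.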

Thanks,
Nish

--
Nishanth Aravamudan <nacc@xxxxxxxxxx>
IBM Linux Technology Center