Re: [PATCH] mm: Check if PTE is already allocated during page fault

From: Mel Gorman
Date: Mon Apr 18 2011 - 06:23:17 EST


On Fri, Apr 15, 2011 at 05:06:06PM +0200, Andrea Arcangeli wrote:
> On Fri, Apr 15, 2011 at 04:39:16PM +0200, Andrea Arcangeli wrote:
> > On Fri, Apr 15, 2011 at 11:12:48AM +0100, Mel Gorman wrote:
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 5823698..1659574 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -3322,7 +3322,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> > > * run pte_offset_map on the pmd, if an huge pmd could
> > > * materialize from under us from a different thread.
> > > */
> > > - if (unlikely(__pte_alloc(mm, vma, pmd, address)))
> > > + if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
>
> I started hacking on this and I noticed it'd be better to extend the
> unlikely through the end. At first review I didn't notice the
> parenthesis closure stops after pte_none and __pte_alloc is now
> uncovered. I'd prefer this:
>
> if (unlikely(pmd_none(*pmd) && __pte_alloc(mm, vma, pmd, address)))
>

I had this at one point and then decided to match what we do for
pte_alloc_map(). My reasoning was that the most important part of this
check is pmd_none(). It's relatively unlikely we even call __pte_alloc
which is why I didn't think it belonged in the unlikely block. I also
preferred being consistent with pte_alloc_map.
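
To illustrate that the three placements discussed in this thread differ
only in the branch hint and not in behaviour, here is a minimal userspace
sketch. fake_pmd_none() and fake_pte_alloc() are hypothetical stand-ins
for pmd_none() and __pte_alloc(), and unlikely() is written the way
GCC-built kernels usually define it; none of this is the kernel code
itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Usual GCC/Clang definition of the hint; assumed here, not copied from a tree. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-ins: the real helpers take kernel types. */
static int alloc_calls;

static bool fake_pmd_none(bool none) { return none; }

/* Returns nonzero on allocation failure, like __pte_alloc(). */
static int fake_pte_alloc(bool fail) { alloc_calls++; return fail ? 1 : 0; }

/* Placement from the patch: the hint covers only the "none" test. */
static bool oom_v1(bool none, bool fail)
{
	return unlikely(fake_pmd_none(none)) && fake_pte_alloc(fail);
}

/* Whole condition inside a single unlikely(). */
static bool oom_v2(bool none, bool fail)
{
	return unlikely(fake_pmd_none(none) && fake_pte_alloc(fail));
}

/* Separate hint on each half of the condition. */
static bool oom_v3(bool none, bool fail)
{
	return unlikely(fake_pmd_none(none)) && unlikely(fake_pte_alloc(fail));
}

/* Checks every input combination; returns the number of cases checked. */
int check_all_cases(void)
{
	int cases = 0;

	for (int none = 0; none <= 1; none++) {
		for (int fail = 0; fail <= 1; fail++) {
			bool expect = none && fail;

			alloc_calls = 0;
			assert(oom_v1(none, fail) == expect);
			assert(oom_v2(none, fail) == expect);
			assert(oom_v3(none, fail) == expect);
			/* Short-circuit: the allocator runs only when the entry is none. */
			assert(alloc_calls == (none ? 3 : 0));
			cases++;
		}
	}
	return cases;
}
```

All three forms skip the allocation call when the entry is already
populated, which is the actual bug fix. What remains is only a question of
code layout chosen by the compiler.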

> I mean the real unlikely thing is that we return VM_FAULT_OOM, if we
> end up calling __pte_alloc or not, depends on the app. Generally it
> sounds more frequent that the pte is not none, so it's not wrong, but
> it's even less likely that __pte_alloc fails so that can be taken into
> account too, and __pte_alloc runs still quite frequently. So either
> above or:
>
> if (unlikely(pmd_none(*pmd)) && unlikely(__pte_alloc(mm, vma, pmd, address)))
>

I'd prefer this to putting everything inside the same unlikely block.
But if this makes a noticeable difference, why do we not do it for
pte_alloc_map, pmd_alloc and other similar functions?

> I generally prefer unlikely only when it's 100% sure thing it's less
> likely (like the VM_FAULT_OOM), so the first version I guess it's
> enough (I'm afraid unlikely for pte_none too, may make gcc generate a
> far away jump possibly going out of l1 icache for a case that is only
> 512 times less likely at best). My point is that it's certainly hugely
> more unlikely that __pte_alloc fails than the pte is none.
>

For the bug fix, it's best to match what pte_alloc_map, pmd_alloc,
pud_alloc and others do in terms of how they use unlikely. If what we are
currently doing is sub-optimal, a single patch should change all the
helpers.
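
For reference, the helper style being matched looks roughly like the
userspace sketch below. It is only a model under stated assumptions:
struct slot, fake_alloc() and slot_alloc_map() are hypothetical names, and
the macro merely imitates the "hint on the none test, then allocate, else
map" shape described above for pte_alloc_map.

```c
#include <assert.h>
#include <stddef.h>

/* Usual GCC/Clang definition of the hint; assumed here. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical one-slot "page table": entry is NULL while unpopulated. */
struct slot { int *entry; };

static int backing;    /* storage handed out by the fake allocator */
static int fail_next;  /* nonzero simulates an allocation failure */

/* Same contract as __pte_alloc(): 0 on success, nonzero on failure. */
static int fake_alloc(struct slot *s)
{
	if (fail_next)
		return 1;
	s->entry = &backing;
	return 0;
}

/* Hint on the "none" test only; NULL on failure, else the mapped entry. */
#define slot_alloc_map(s) \
	((unlikely((s)->entry == NULL) && fake_alloc(s)) ? NULL : (s)->entry)

/* Walks through failure, success, and the already-populated case. */
int run_slot_demo(void)
{
	struct slot s = { NULL };

	fail_next = 1;
	assert(slot_alloc_map(&s) == NULL);      /* empty and allocation fails */

	fail_next = 0;
	assert(slot_alloc_map(&s) == &backing);  /* empty and allocation succeeds */

	fail_next = 1;
	assert(slot_alloc_map(&s) == &backing);  /* populated: allocator not called */

	return 0;
}
```

If the hint placement were ever changed, it would be changed in this one
pattern across every such helper rather than in handle_mm_fault alone.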

--
Mel Gorman
SUSE Labs