Re: Transparent Hugepage Support #30

From: Andrea Arcangeli
Date: Wed Sep 15 2010 - 09:43:55 EST


Hello,

On Mon, Sep 13, 2010 at 03:04:09PM +0530, Balbir Singh wrote:
> OK, when the code is touched next and from now on, we'll stop making
> that assumption.

Great, thanks!

> Thanks, is there an overhead of the compound_lock that will show up?

The compound lock is a per-page bit spinlock, so it should scale
well. There is a locked op overhead associated with it, but that cost
is only paid for hugepages, not for normal pages.
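
To make "per-page bit spinlock" concrete, here is a rough userspace
sketch using C11 atomics. It is not the kernel implementation: struct
fake_page, FAKE_PG_COMPOUND_LOCK and the fake_compound_* helpers are
made-up names for illustration, and the bit number is an assumption.
It only shows that the lock lives in one bit of the page's own flags
word, and that taking it costs one locked read-modify-write.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: just a flags word. */
struct fake_page {
    atomic_ulong flags;
};

/* Hypothetical lock bit; the real kernel bit number is not assumed here. */
#define FAKE_PG_COMPOUND_LOCK 0UL
#define FAKE_LOCK_MASK (1UL << FAKE_PG_COMPOUND_LOCK)

/* One atomic (locked) OR on the flags word; true if we set the bit first. */
static bool fake_compound_trylock(struct fake_page *page)
{
    unsigned long old = atomic_fetch_or_explicit(&page->flags, FAKE_LOCK_MASK,
                                                 memory_order_acquire);
    return !(old & FAKE_LOCK_MASK);
}

/* Spin until the bit is ours; waiters only contend on this one page. */
static void fake_compound_lock(struct fake_page *page)
{
    while (!fake_compound_trylock(page)) {
        /* Wait on plain loads so we don't hammer the line with locked ops. */
        while (atomic_load_explicit(&page->flags, memory_order_relaxed) &
               FAKE_LOCK_MASK)
            ;
    }
}

/* Clear the lock bit, leaving every other flag bit untouched. */
static void fake_compound_unlock(struct fake_page *page)
{
    unsigned long old = atomic_fetch_and_explicit(&page->flags, ~FAKE_LOCK_MASK,
                                                  memory_order_release);
    assert(old & FAKE_LOCK_MASK); /* unlocking a page we never locked is a bug */
    (void)old;
}
```

Since the lock is embedded in each page, two CPUs only contend when
they work on the same hugepage, which is why it should scale.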

Hugepages can't be collapsed in place, and they can only be collapsed
with the mmap_sem held in write mode (so holding the mmap_sem in read
or write mode is enough to protect against a collapse). The same can't
be said for the split of a hugepage: hugepages can be split under the
mmap_sem just fine. The only ways to protect against a split are the
compound_lock or the anon_vma_lock. Yet another way to keep the page
from being split under us is to call local_irq_disable and then
__get_user_pages_fast, like futex.c does: the page can't be split
until local_irq_enable is called, which is the same guarantee as in
gup_fast. pmd_splitting_flush_notify will wait; the tlb flush for the
splitting isn't needed as a flush, it's just there to send an IPI and
wait for any gup_fast to finish. It's not entirely clear right now
what kind of protection we need in memcg.

> Please do look at it, most of the churn is not controllable since it
> is bug fixes and feature enhancements for newer subsystems and
> performance. We'll try not to break anything fundamental.

Looking at it right now!