Re: [RFC PATCH] mm/thp: Always allocate transparent hugepages on local node

From: Kirill A. Shutemov
Date: Tue Nov 25 2014 - 09:17:18 EST


On Mon, Nov 24, 2014 at 01:33:42PM -0800, David Rientjes wrote:
> On Mon, 24 Nov 2014, Kirill A. Shutemov wrote:
>
> > > This makes sure that we try to allocate hugepages from the local node.
> > > If we can't, we fall back to small page allocation based on
> > > mempolicy. This is based on the observation that allocating pages
> > > on the local node is more beneficial than allocating hugepages on a
> > > remote node.
> >
> > The local node at allocation time is not necessarily the local node at
> > use time. If the policy says to use specific node[s], we should follow it.
> >
>
> True, and the interaction between thp and mempolicies is fragile: if a
> process has a MPOL_BIND mempolicy over a set of nodes, that does not
> necessarily mean that we want to allocate thp remotely if it will always
> be accessed remotely. It's simple to benchmark and show that remote
> access latency of a hugepage can exceed that of local pages. MPOL_BIND
> itself is a policy of exclusion, not inclusion, and it's difficult to
> define when local pages, with their cost of allocation, are better than
> remote thp.
>
> For MPOL_BIND, if the local node is allowed then thp should be forced from
> that node; if the local node is disallowed then allocate from any node in
> the nodemask. For MPOL_INTERLEAVE, I think we should only allocate thp
> from the next node in order, otherwise fail the allocation and fall back to
> small pages. Is this what you meant as well?

Correct.
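
To spell out the rules quoted above, here is a rough userspace sketch of
the node selection. It is not kernel code: every name in it (sketch_mode,
sketch_policy, thp_target, thp_pick_target) is made up for illustration,
the nodemask is assumed to fit in 64 bits, and the default-policy branch
simply assumes the local-node behaviour from the patch subject.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the policy modes; not the kernel's enum. */
enum sketch_mode { SK_DEFAULT, SK_BIND, SK_INTERLEAVE };

#define SK_NO_NODE (-1)	/* no single node is forced */

struct sketch_policy {
	enum sketch_mode mode;
	uint64_t nodemask;	/* bit n set => node n allowed (assumes < 64 nodes) */
	int next_interleave;	/* node the interleave counter currently selects */
};

struct thp_target {
	int node;		/* node the hugepage must come from, or SK_NO_NODE */
	uint64_t allowed;	/* nodes the hugepage allocation may use */
	bool exact;		/* true: on failure, fall back to small pages
				 * instead of trying another node */
};

static uint64_t node_bit(int node)
{
	return (uint64_t)1 << node;
}

static struct thp_target thp_pick_target(const struct sketch_policy *pol,
					 int local_node)
{
	struct thp_target t;

	switch (pol->mode) {
	case SK_BIND:
		if (pol->nodemask & node_bit(local_node)) {
			/* Local node is allowed: force thp from it. */
			t.node = local_node;
			t.allowed = node_bit(local_node);
			t.exact = true;
		} else {
			/* Local node is disallowed: any node in the mask. */
			t.node = SK_NO_NODE;
			t.allowed = pol->nodemask;
			t.exact = false;
		}
		break;
	case SK_INTERLEAVE:
		/* Only the next node in order, never the local node as such. */
		t.node = pol->next_interleave;
		t.allowed = node_bit(pol->next_interleave);
		t.exact = true;
		break;
	default:
		/* Assumption: no policy means local node only, as in the patch. */
		t.node = local_node;
		t.allowed = node_bit(local_node);
		t.exact = true;
		break;
	}
	return t;
}
```

With a bind mask of nodes 1-2, a fault on node 1 gives an exact target of
node 1, while a fault on node 0 leaves the whole mask open. Interleave
ignores the faulting node entirely, which is what keeps it from
degenerating into local allocation.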

> > I think it makes sense to force local allocation if policy is interleave
> > or if current node is in preferred or bind set.
> >
>
> If local allocation were forced for MPOL_INTERLEAVE and all memory is
> initially faulted by cpus on a single node, then the policy has
> effectively become MPOL_DEFAULT: there's no interleave.

You're right. I don't have much experience with mempolicy code.

--
Kirill A. Shutemov