Re: [patch] rfc: introduce /dev/hugetlb

From: Andrew Morton
Date: Sat Mar 24 2007 - 03:39:55 EST


On Sat, 24 Mar 2007 00:11:32 -0700 "Ken Chen" <kenchen@xxxxxxxxxx> wrote:

> On 3/23/07, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > a) Ken observes that obtaining private hugetlb memory via hugetlbfs
> > involves "fuss".
> >
> > b) the libhugetlbfs maintainers then go off and implement a no-fuss way of
> > doing this.
>
> Hmm, what started this thread was libhugetlbfs maintainer complained
> how "fuss" it was to create private hugetlb mapping and suggested an
> even bigger kernel change with pagetable_operations API.

OK. I wasn't paying particularly close attention. But my rant still
stands ;)

> The new API
> was designed with an end goal of introduce /dev/hugetlb (as one of the
> feature, they might be thinking more). What motivated me here is to
> point out that we can achieve the same goal of having a /dev/hugetlb
> with existing hugetlbfs infrastructure and the implementation is
> relatively straightforward. What it also buys us is a bit more
> flexibility to the end user who wants to use the interface directly.

OK.

Why is it a "fuss" to do this with hugetlbfs files, btw?

Having read back through the thread, the only substantiation I can really
see is

The pagetable_operations API opens up possibilities to do some
additional (and completely sane) things. For example, I have a patch
that alters the character device code below to make use of a hugetlb
ZERO_PAGE. This eliminates almost all the up-front fault time, allowing
pages to be COW'ed only when first written to. We cannot do things like
this with hugetlbfs anymore because we have a set of complex semantics to
preserve.


Why is this actually a useful feature?

What does "complex semantics to preserve" mean?


I dunno. I see a lot of code flying around, but comparatively little
effort to describe the actual problems which we're trying to solve.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/