Re: [PATCH 0/3] HTLB mapping for drivers (take 2)

From: Mel Gorman
Date: Wed Aug 26 2009 - 06:05:27 EST

On Wed, Aug 26, 2009 at 07:58:05PM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2009-08-25 at 12:10 +0100, Mel Gorman wrote:
> > On Tue, Aug 25, 2009 at 09:00:54PM +1000, Benjamin Herrenschmidt wrote:
> > > On Tue, 2009-08-25 at 11:47 +0100, Mel Gorman wrote:
> > >
> > > > Why? One hugepage of default size will be one TLB entry. Each hugepage
> > > > after that will be additional TLB entries so there is no savings on
> > > > translation overhead.
> > > >
> > > > Getting contiguous pages beyond the hugepage boundary is not a matter
> > > > for GFP flags.
> > >
> > > Note: This patch reminds me of something else I had on the backburner
> > > for a while and never got a chance to actually implement...
> > >
> > > There's various cases of drivers that could have good uses of hugetlb
> > > mappings of device memory. For example, framebuffers.
> > >
> >
> > Where is the buffer located? If it's in kernel space, then any contiguous
> > allocation will be automatically backed by huge PTEs. As framebuffer allocation
> > is probably happening early in boot, just calling alloc_pages() might do?
> It's not a memory buffer, it's MMIO space (device memory, off your PCI
> bus for example).

Ah right, so you just want to set up huge PTEs within the MMIO space?

> > Adam Litke at one point posted a pagetable-abstraction that would have
> > been the first step on a path like this. It hurt the normal fastpath
> > though and was ultimately put aside.
> Which is why I think we should stick to just splitting hugetlb which
> will not affect the normal path at all. Normal path for normal page,
> HUGETLB VMAs for other sizes, whether they are backed with memory or by
> anything else.

Yeah, in this case I see why you want a hugetlbfs VMA, a huge-pte-backed VMA
and everything else to be treated differently. I don't think that is exactly
what this thread requires, though, because here there is a RAM-backed
buffer. For that, hugetlbfs still makes sense just to ensure the reservations
exist so that faults do not spuriously fail. MMIO doesn't care, because the
physical backing always exists, which is vaguely similar to MAP_SHARED.

> > It's the sort of thing that has been resisted in the past, largely
> > because the only user at the time was about transparent hugepage
> > promotion/demotion. It would need to be a really strong incentive to
> > revive the effort.
> Why? I'm not proposing to hack the normal path. Just splitting
> hugetlbfs in two which is reasonably easy to do, to allow drivers who
> map large chunks of MMIO space to use larger page sizes.

That is a bit more reasonable. It would help the case of MMIO for sure.

> This is the case of pretty much any discrete video card, a chunk of
> RDMA-style devices, and possibly more.
> It's a reasonably simple change that has 0 effect on the non-hugetlb
> path. I think I'll just have to bite the bullet and send a demo patch
> when I'm no longer bogged down :-)

Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab