Re: vma_merge issue

From: Hugh Dickins
Date: Thu Aug 13 2009 - 13:33:31 EST


On Wed, 12 Aug 2009, William R Speirs wrote:
> Hugh Dickins wrote:
> >
> > MADV_DONTNEED: brilliant idea, what a shame it doesn't work for you.
> > I'd been on the point of volunteering a bugfix to it to do what you
> > want, it would make sense; but there's a big but... we have sold
> > MADV_DONTNEED as an madvise that only needs non-exclusive access
> > to the mmap_sem, which means it can be used concurrently with faulting,
> > which has made it much more useful to glibc (I believe). If we were
> > to fiddle with vmas and accounting and merging in there, it would go
> > back to needing exclusive mmap_sem, which would hurt important users.
>
> For my own edification, hurt these users how? Performance? Serializing access
> during a MADV_DONTNEED? I wonder how big the "hurt" would be?

Performance, yes: serializing, yes.

I forget the details, others will have paid closer attention, I may
be making this up! But it was something like garbage collection
when freeing mallocs: it pays off if faults elsewhere in the address
space can occur concurrently, but it is bad news if exclusive mmap_sem
locks out those faults. Big enough hurt to show up very badly in
some real-life multithreaded apps, and in benchmarks hitting the issue.
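
For illustration only, here is a minimal sketch of the kind of call an
allocator might make when handing freed memory back: MADV_DONTNEED on part
of a private anonymous mapping. The helper name dontneed_demo() is
hypothetical, and the sketch assumes Linux semantics for private anonymous
memory (the dropped page reads back as zeroes on the next touch). It shows
only the visible effect of the call, not the mmap_sem locking itself.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map npages of private anonymous memory, dirty them,
 * then drop only the first page with MADV_DONTNEED.
 * Returns 0 if the first page reads back as zero while the second page is
 * untouched, 1 if not, and -1 if mmap() or madvise() fails.
 */
int dontneed_demo(size_t npages)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = npages * page;
    int rc = -1;

    assert(npages >= 2);

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0x5A, len);           /* fault every page in and dirty it */

    /* The mapping itself stays in place; only the page contents go away. */
    if (madvise(p, page, MADV_DONTNEED) == 0)
        rc = (p[0] == 0 && p[page] == 0x5A) ? 0 : 1;

    munmap(p, len);
    return rc;
}
```

The mapping stays exactly as it was after the call, which matches the
point above that MADV_DONTNEED does not fiddle with vmas, accounting or
merging.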

> > A "refinement" to that suggestion is to put the file on tmpfs:
> > you will then get charged for RAM+swap as you use it, but you can
> > use madvise MADV_REMOVE to unmap pages, punching holes in the file,
> > freeing up those charges. A little baroque, but I think it does
> > amount to a way of doing exactly what you wanted in the first place.
>
> I like this (the refined) idea a lot. I coded it up and it works as expected,
> and the way I initially wanted.
>
> Thanks for taking the time and providing the solution... I appreciate it.

I'm very glad to hear that worked out: thanks for reporting back.
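
For anyone finding this thread later, a rough sketch of the tmpfs plus
MADV_REMOVE approach follows. It is not William's code: the helper name
punch_hole(), its parameters, and the fill pattern are all hypothetical,
and the caller is assumed to pass a path on a tmpfs mount (often /dev/shm,
but that depends on the system). The block counts come from fstat() and
are only a rough stand-in for the RAM+swap charge.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create `path` (assumed to live on tmpfs), size it to
 * npages pages, dirty them all, then punch out pages [first, first + count)
 * with MADV_REMOVE. The file is unlinked straight away to keep the demo tidy;
 * a real user would keep it.
 * Returns 0 if the hole reads back as zeroes and the rest is intact,
 * 1 if not, and -1 if a system call fails.
 */
int punch_hole(const char *path, size_t npages, size_t first, size_t count,
               long long *blocks_before, long long *blocks_after)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = npages * page;
    struct stat st;
    int rc = -1;

    assert(first + count <= npages);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    unlink(path);

    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    /* MADV_REMOVE needs a shared, writable mapping of the file. */
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memset(p, 0xAB, len);           /* every page is now allocated */

    if (fstat(fd, &st) != 0)
        goto out;
    *blocks_before = (long long)st.st_blocks;

    if (madvise(p + first * page, count * page, MADV_REMOVE) != 0)
        goto out;

    /* Sample the block count before reading the hole back, since touching
     * the hole through the mapping may allocate fresh zero pages. */
    if (fstat(fd, &st) != 0)
        goto out;
    *blocks_after = (long long)st.st_blocks;

    rc = 0;
    for (size_t i = 0; i < len; i++) {
        int in_hole = i >= first * page && i < (first + count) * page;
        if (p[i] != (in_hole ? 0x00 : 0xAB)) {
            rc = 1;
            break;
        }
    }
out:
    munmap(p, len);
    close(fd);
    return rc;
}
```

Calling, say, punch_hole("/dev/shm/hole-demo", 4, 1, 2, &before, &after)
should leave `after` smaller than `before` on tmpfs, reflecting the two
pages that were freed.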

Hugh