Re: [RFC v4 0/3] Support volatile for anonymous range

From: Minchan Kim
Date: Wed Dec 19 2012 - 20:34:44 EST


On Tue, Dec 18, 2012 at 10:27:46AM -0800, Arun Sharma wrote:
> On 12/17/12 10:47 PM, Minchan Kim wrote:
>
> >I hope for more input from user-space allocator people, and for tests
> >of the patch with their allocators, because getting real value from it
> >might need a design change in arena management.
>
> jemalloc knows how to handle MADV_FREE on platforms that support it.
> This looks similar (we'll need a SIGBUS handler that does the right
> thing = zero the page + mark it as non-volatile in the common case).

That doesn't work for malloc, because by the time the signal handler
runs it is too late to mark the range as non-volatile.

For example,
free(P1-P4) -> mvolatile(P1-P4) -> VM discard(P3) -> alloc(P1-P4) ->
use P1 -> VM discard(P1) -> use P3 -> SIGBUS -> mark nonvolatile ->
lost P1.

So we should call mnovolatile before handing the free space back to
the user.
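
To make the sequence above concrete, here is a toy user-space model in
Python. It is only a sketch: the Arena class, its method names and the
vm_discard() hook are hypothetical and merely mimic the proposed
mvolatile/mnovolatile semantics (a discarded page faults with SIGBUS
until it is marked non-volatile, and then comes back zero-filled).
Nothing here calls into the kernel.

```python
class SigBus(Exception):
    """Stands in for the SIGBUS raised on touching a discarded page."""


class Arena:
    """Toy model of pages P1..P4; not the real mvolatile interface."""

    def __init__(self, pages):
        self.data = {p: None for p in pages}  # page contents
        self.volatile = set()                 # pages the VM may discard
        self.discarded = set()                # pages the VM has discarded

    def mvolatile(self, pages):
        self.volatile.update(pages)

    def mnovolatile(self, pages):
        # Assumption: pages discarded while volatile come back zero-filled.
        for p in pages:
            if p in self.discarded:
                self.data[p] = 0
        self.volatile.difference_update(pages)
        self.discarded.difference_update(pages)

    def vm_discard(self, page):
        # The VM may only drop pages that are still marked volatile.
        if page in self.volatile:
            self.data[page] = None
            self.discarded.add(page)
            return True
        return False

    def write(self, page, value):
        if page in self.discarded:
            raise SigBus(page)
        self.data[page] = value

    def read(self, page):
        if page in self.discarded:
            raise SigBus(page)
        return self.data[page]


PAGES = ["P1", "P2", "P3", "P4"]


def late_marking():
    """Mark non-volatile only from the SIGBUS handler."""
    a = Arena(PAGES)
    a.mvolatile(PAGES)            # free(P1-P4)
    a.vm_discard("P3")
    a.write("P1", "user data")    # alloc(P1-P4), range still volatile
    a.vm_discard("P1")            # live data is silently dropped
    try:
        a.write("P3", "more")
    except SigBus:
        a.mnovolatile(PAGES)      # handler runs too late to save P1
        a.write("P3", "more")
    return a.read("P1")


def early_marking():
    """Call mnovolatile before handing the range back to the user."""
    a = Arena(PAGES)
    a.mvolatile(PAGES)            # free(P1-P4)
    a.vm_discard("P3")
    a.mnovolatile(PAGES)          # done inside the allocator's alloc path
    a.write("P1", "user data")
    a.vm_discard("P1")            # refused: P1 is no longer volatile
    a.write("P3", "more")
    return a.read("P1")
```

In this model late_marking() ends with P1 zero-filled, i.e. the user's
data is gone, while early_marking() still reads back what was written.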

>
> All of this of course assumes that apps madvise the kernel through
> APIs exposed by the malloc implementation - not via a raw syscall.
>
> In other words, some new user space code needs to be written to test

Agreed. I might design a new allocator around these system calls if
existing allocators cannot use them efficiently, because using them well
might require a change in the allocator's design. MADV_FREE/MADV_DONTNEED
isn't cheap, since it has to walk the ptes/page descriptors in the range
to mark them, so I guess allocators avoid calling such advice system calls
frequently, and when they do call them, they want to cover a big range in
one batch. That is just my guess.

But mvolatile/mnovolatile is cheaper, so you can call it more frequently
with smaller ranges, which gives the VM a ready supply of easily
reclaimable pages.
Another benefit of mvolatile is that it can change its behavior when
memory pressure is severe: it can then zap all the pages like DONTNEED,
so it can work very flexibly.
The downside of that approach is that calling it with small ranges can
increase the number of VMAs, so we might need a tuning knob for VMA size.
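
The VMA growth can be illustrated with a small interval model in Python.
This is a sketch under the assumption that volatility is tracked per
VMA, so marking a sub-range splits the VMA and neighbours with the same
flag merge again. The mark() and count_after() helpers are hypothetical
and far simpler than the kernel's real split/merge logic.

```python
def mark(vmas, start, end, flag):
    """Set `flag` on pages [start, end) and return the new VMA list.

    `vmas` is a non-empty, sorted list of (start, end, volatile) tuples
    covering one contiguous range. Adjacent entries with equal flags are
    merged, loosely mimicking VMA merging.
    """
    out = []
    for s, e, v in vmas:
        if e <= start or s >= end:        # no overlap: keep as is
            out.append((s, e, v))
            continue
        if s < start:                     # piece before the marked range
            out.append((s, start, v))
        out.append((max(s, start), min(e, end), flag))
        if e > end:                       # piece after the marked range
            out.append((end, e, v))
    merged = [out[0]]
    for s, e, v in out[1:]:
        ps, pe, pv = merged[-1]
        if pv == v and pe == s:           # same flag and contiguous: merge
            merged[-1] = (ps, e, v)
        else:
            merged.append((s, e, v))
    return merged


def count_after(ranges, npages=64):
    """Count VMAs after marking each (start, end) range volatile."""
    vmas = [(0, npages, False)]
    for start, end in ranges:
        vmas = mark(vmas, start, end, True)
    return len(vmas)


# One batched call over half of a 64-page arena.
big = count_after([(0, 32)])
# The same number of pages, marked one page at a time on every other page.
small = count_after([(p, p + 1) for p in range(0, 64, 2)])
```

With these inputs the batched call leaves 2 VMAs, while the scattered
single-page calls leave 64, which is the kind of growth a VMA size
tunable would have to bound.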

> this out fully. Sounds feasible though.

Thanks!

>
> -Arun

--
Kind regards,
Minchan Kim