Re: [PATCH 05/14] vrange: Add new vrange(2) system call

From: H. Peter Anvin
Date: Mon Oct 07 2013 - 18:57:26 EST


On 10/02/2013 05:51 PM, John Stultz wrote:
> From: Minchan Kim <minchan@xxxxxxxxxx>
>
> This patch adds new system call sys_vrange.
>
> NAME
> vrange - Mark or unmark range of memory as volatile
>

vrange() is about as nondescriptive as one can get -- there is exactly
one letter that has any connection with what this does.

> SYNOPSIS
> int vrange(unsigned_long start, size_t length, int mode,
> int *purged);
>
> DESCRIPTION
> Applications can use vrange(2) to advise the kernel how it should
> handle paging I/O in this VM area. The idea is to help the kernel
> discard pages of vrange instead of reclaiming when memory pressure
> happens. It means kernel doesn't discard any pages of vrange if
> there is no memory pressure.
>
> mode:
> VRANGE_VOLATILE
> hint to kernel so VM can discard in vrange pages when
> memory pressure happens.
> VRANGE_NONVOLATILE
> hint to kernel so VM doesn't discard vrange pages
> any more.
>
> If the user tries to access purged memory without a VRANGE_NONVOLATILE
> call, he can encounter SIGBUS if the page was discarded by the kernel.
>
> purged: Pointer to an integer which will return 1 if
> mode == VRANGE_NONVOLATILE and any page in the affected range
> was purged. If purged returns zero during a mode ==
> VRANGE_NONVOLATILE call, it means all of the pages in the range
> are intact.

I'm a bit confused about the "purged"

From an earlier version of the patch:

> - What's different with madvise(DONTNEED)?
>
> System call semantic
>
> DONTNEED makes sure user always can see zero-fill pages after
> he calls madvise while vrange can see data or encounter SIGBUS.

This difference doesn't seem to be a huge one. The other one seems to
be the blocking status of MADV_DONTNEED, which might be better handled
by adding an option (MADV_LAZY)?

That way we would have lazy vs. immediate, and zero versus SIGBUS.
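
For reference, the "immediate, zero" corner of that matrix is what
madvise(MADV_DONTNEED) already gives on Linux for a private anonymous
mapping. A minimal sketch of that behaviour, assuming Linux with glibc;
the helper name dontneed_reads_zero() is made up for this example, and
MADV_LAZY is only a proposal, so it does not appear in the code:

```c
#define _DEFAULT_SOURCE         /* expose MAP_ANONYMOUS and MADV_DONTNEED */
#include <assert.h>             /* assert() for callers checking the result */
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, named only for this illustration.
 * Returns 1 if a private anonymous page reads back as all zeroes after
 * madvise(MADV_DONTNEED), 0 if old data is still visible, and -1 if
 * the setup itself failed.
 */
int dontneed_reads_zero(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    size_t len = (size_t)page;
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAA, len);       /* dirty the page so there is data to lose */

    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    /* The next access faults in a fresh zero-filled page; no SIGBUS. */
    int all_zero = 1;
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            all_zero = 0;
            break;
        }
    }

    munmap(p, len);
    return all_zero;
}
```

A caller would expect dontneed_reads_zero() to return 1 here. The
volatile-range behaviour described in the patch differs in that the
data may survive if there was no memory pressure, and a purged page
raises SIGBUS rather than reading as zero.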

I see from the change history of the patch that this was an madvise() at
some point, but was later changed into a separate system call. Does
anyone remember why that was? A quick look through my LKML archives
doesn't really make it clear.

-hpa
