Re: [PATCH v3 1/1] process_madvise.2: Add process_madvise man page

From: Suren Baghdasaryan
Date: Tue Feb 16 2021 - 12:48:46 EST


Hi Michael,

On Sat, Feb 13, 2021 at 2:04 PM Michael Kerrisk (man-pages)
<mtk.manpages@xxxxxxxxx> wrote:
>
> Hello Suren,
>
> On 2/2/21 11:12 PM, Suren Baghdasaryan wrote:
> > Hi Michael,
> >
> > On Tue, Feb 2, 2021 at 2:45 AM Michael Kerrisk (man-pages)
> > <mtk.manpages@xxxxxxxxx> wrote:
> >>
> >> Hello Suren (and Minchan and Michal)
> >>
> >> Thank you for the revisions!
> >>
> >> I've applied this patch, and done a few light edits.
> >
> > Thanks!
> >
> >>
> >> However, I have a question about undocumented pieces in *madvise(2)*,
> >> as well as one other question. See below.
> >>
> >> On 2/2/21 6:30 AM, Suren Baghdasaryan wrote:
> >>> Initial version of process_madvise(2) manual page. Initial text was
> >>> extracted from [1], amended after fix [2] and more details added using
> >>> man pages of madvise(2) and process_vm_read(2) as examples. It also
> >>> includes the changes to required permission proposed in [3].
> >>>
> >>> [1] https://lore.kernel.org/patchwork/patch/1297933/
> >>> [2] https://lkml.org/lkml/2020/12/8/1282
> >>> [3] https://patchwork.kernel.org/project/selinux/patch/20210111170622.2613577-1-surenb@xxxxxxxxxx/#23888311
> >>>
> >>> Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>
> >>> Reviewed-by: Michal Hocko <mhocko@xxxxxxxx>
> >>> ---
> >>> changes in v2:
> >>> - Changed description of MADV_COLD per Michal Hocko's suggestion
> >>> - Applied fixes suggested by Michael Kerrisk
> >>> changes in v3:
> >>> - Added Michal's Reviewed-by
> >>> - Applied additional fixes suggested by Michael Kerrisk
> >>>
> >>> NAME
> >>> process_madvise - give advice about use of memory to a process
> >>>
> >>> SYNOPSIS
> >>> #include <sys/uio.h>
> >>>
> >>> ssize_t process_madvise(int pidfd,
> >>> const struct iovec *iovec,
> >>> unsigned long vlen,
> >>> int advice,
> >>> unsigned int flags);
> >>>
> >>> DESCRIPTION
> >>> The process_madvise() system call is used to give advice or directions
> >>> to the kernel about the address ranges of another process or the calling
> >>> process. It provides the advice to the address ranges described by iovec
> >>> and vlen. The goal of such advice is to improve system or application
> >>> performance.
> >>>
> >>> The pidfd argument is a PID file descriptor (see pidfd_open(2)) that
> >>> specifies the process to which the advice is to be applied.
> >>>
> >>> The pointer iovec points to an array of iovec structures, defined in
> >>> <sys/uio.h> as:
> >>>
> >>> struct iovec {
> >>> void *iov_base; /* Starting address */
> >>> size_t iov_len; /* Number of bytes to transfer */
> >>> };
> >>>
> >>> The iovec structure describes address ranges beginning at iov_base address
> >>> and with the size of iov_len bytes.
> >>>
> >>> The vlen represents the number of elements in the iovec structure.
> >>>
> >>> The advice argument is one of the values listed below.
> >>>
> >>> Linux-specific advice values
> >>> The following Linux-specific advice values have no counterparts in the
> >>> POSIX-specified posix_madvise(3), and may or may not have counterparts
> >>> in the madvise(2) interface available on other implementations.
> >>>
> >>> MADV_COLD (since Linux 5.4.1)
> >>
> >> I just noticed these version numbers now, and thought: they can't be
> >> right (because the system call appeared only in v5.11). So I removed
> >> them. But, of course in another sense the version numbers are (nearly)
> >> right, since these advice values were added for madvise(2) in Linux 5.4.
> >> However, they are not documented in the madvise(2) manual page. Is it
> >> correct to assume that MADV_COLD and MADV_PAGEOUT have exactly the same
> >> meaning in madvise(2) (but just for the calling process, of course)?
> >
> > Correct. They should be added in the madvise(2) man page as well IMHO.
>
> So, I decided to move the description of MADV_COLD and MADV_PAGEOUT
> to madvise(2) and refer to that page from the process_madvise(2)
> page. This avoids repeating the same information in two places.

Sounds good.

>
> >>> Deactive a given range of pages which will make them a more probable
> >>
> >> I changed: s/Deactive/Deactivate/
> >
> > thanks!
> >
> >>
> >>> reclaim target should there be a memory pressure. This is a
> >>> nondestructive operation. The advice might be ignored for some pages
> >>> in the range when it is not applicable.
> >>>
> >>> MADV_PAGEOUT (since Linux 5.4.1)
> >>> Reclaim a given range of pages. This is done to free up memory occupied
> >>> by these pages. If a page is anonymous it will be swapped out. If a
> >>> page is file-backed and dirty it will be written back to the backing
> >>> storage. The advice might be ignored for some pages in the range when
> >>> it is not applicable.
> >>
> >> [...]
> >>
> >>> The hint might be applied to a part of iovec if one of its elements points
> >>> to an invalid memory region in the remote process. No further elements will
> >>> be processed beyond that point.
> >>
> >> Is the above scenario the one that leads to the partial advice case described in
> >> RETURN VALUE? If yes, perhaps I should add some words to make that clearer.
> >
> > Correct. This describes the case when partial advice happens.
>
> Thanks. I added a few words to clarify this.

Any link where I can see the final version?

>
>
> >> You can see the light edits that I made in
> >> https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=e3ce016472a1b3ec5dffdeb23c98b9fef618a97b
> >> and following that I restructured DESCRIPTION a little in
> >> https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=3aac0708a9acee5283e091461de6a8410bc921a6
> >
> > The edits LGTM.
>
> Thanks for checking them.
>
> Cheers,
>
> Michael
>

Thanks,
Suren.

>
> --
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/
>
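
For readers following the thread, here is a minimal sketch of how the
interface discussed above might be called, and how the "partial advice"
return value could be interpreted. It is an illustration, not text from
the patch. The helper names process_madvise_raw() and first_unadvised()
are hypothetical. The fallback syscall number 440 (correct for x86-64
and arm64 as far as I know) and the fallback MADV_COLD value 20 are
assumptions used only when the installed headers lack them. The helper
relies on the statement quoted above that no further elements are
processed after the first invalid one.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Assumption: fallback values for older headers (see lead-in). */
#ifndef SYS_process_madvise
#define SYS_process_madvise 440
#endif
#ifndef MADV_COLD
#define MADV_COLD 20
#endif

/*
 * Hypothetical thin wrapper around the raw system call, with the same
 * argument order as the SYNOPSIS quoted in the thread. Returns the
 * number of bytes advised, or -1 with errno set.
 */
static ssize_t process_madvise_raw(int pidfd, const struct iovec *iov,
                                   unsigned long vlen, int advice,
                                   unsigned int flags)
{
    return syscall(SYS_process_madvise, pidfd, iov, vlen, advice, flags);
}

/*
 * Hypothetical helper: given the byte count returned by a successful
 * call, return the index of the first iovec element that was not fully
 * advised. A result equal to vlen means every element was covered.
 */
static size_t first_unadvised(const struct iovec *iov, size_t vlen,
                              size_t advised)
{
    size_t i = 0;

    assert(iov != NULL || vlen == 0);

    /* Consume whole elements while the advised byte count covers them. */
    while (i < vlen && advised >= iov[i].iov_len) {
        advised -= iov[i].iov_len;
        i++;
    }
    return i;
}
```

A caller would typically obtain pidfd from pidfd_open(2), call
process_madvise_raw(pidfd, iov, vlen, MADV_COLD, 0), and, if the result
is non-negative but smaller than the sum of the iov_len fields, use
first_unadvised() to find where processing stopped.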