Re: [PATCH v5 4/5] fs, xfs: introduce MAP_DIRECT for creating block-map-atomic file ranges

From: Dan Williams
Date: Wed Aug 16 2017 - 13:28:27 EST


On Wed, Aug 16, 2017 at 12:44 AM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> MAP_DIRECT is an mmap(2) flag with the following semantics:
>
> MAP_DIRECT
> When specified with MAP_SHARED a successful fault in this range
> indicates that the kernel is maintaining the block map (user linear
> address to file offset to physical address relationship) in a manner
> that no external agent can observe any inconsistent changes. In other
> words, the block map of the mapping is effectively pinned, or the kernel
> is otherwise able to exchange a new physical extent atomically with
> respect to any hardware / software agent. As implied by this definition
> a successful fault in a MAP_DIRECT range bypasses kernel indirections
> like the page-cache, and all updates are carried directly through to the
> underlying file physical blocks (modulo cpu cache effects).
>
> ETXTBSY may be returned to any third party operation on the file that
> attempts to update the block map (allocate blocks / convert unwritten
> extents / break shared extents). However, whether a filesystem returns
> ETXTBSY for a certain state of the block relative to a MAP_DIRECT mapping
> is filesystem and kernel version dependent.
>
> Some filesystems may extend these operation restrictions outside the
> mapped range and return ETXTBSY to any file operations that might mutate
> the block map. MAP_DIRECT faults may fail with a SIGBUS if the
> filesystem needs to write the block map to satisfy the fault, for
> example if the mapping was established over a hole in a sparse file.
>
> ERRORS
> EACCES A MAP_DIRECT mapping was requested and PROT_WRITE was not set,
> or the requesting process is missing CAP_LINUX_IMMUTABLE.
>
> EINVAL MAP_ANONYMOUS or MAP_PRIVATE was specified with MAP_DIRECT.
>
> EOPNOTSUPP The filesystem explicitly does not support the flag.
>
> SIGBUS Attempted to write a MAP_DIRECT mapping at a file offset that
> might require block-map updates.
>
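
For illustration, a caller might act on the errors listed above roughly as
follows. This is only a sketch: try_map_direct() and map_direct_reason()
are hypothetical helpers, not part of the patch, and the MAP_DIRECT value
is passed in as a parameter because its numeric definition is not shown in
this mail. SIGBUS is delivered at fault time rather than returned by
mmap(2), so it is not covered here.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper. map_direct_flag is assumed to be the MAP_DIRECT
 * value taken from the patched uapi headers.
 */
static void *try_map_direct(int fd, size_t len, int map_direct_flag, int *err)
{
	/* Per the description, MAP_DIRECT needs PROT_WRITE and MAP_SHARED. */
	void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
			  MAP_SHARED | map_direct_flag, fd, 0);

	*err = (addr == MAP_FAILED) ? errno : 0;
	return addr;
}

/* Map the errno values listed in the description to a short reason. */
static const char *map_direct_reason(int err)
{
	switch (err) {
	case 0:
		return "mapped";
	case EACCES:
		return "PROT_WRITE missing or no CAP_LINUX_IMMUTABLE";
	case EINVAL:
		return "e.g. combined with MAP_ANONYMOUS or MAP_PRIVATE";
	case EOPNOTSUPP:
		return "filesystem does not support MAP_DIRECT";
	default:
		return "other mmap failure";
	}
}
```

A caller that gets EOPNOTSUPP could fall back to a plain MAP_SHARED
mapping, at the cost of losing the block-map guarantee.
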
> Cc: Jan Kara <jack@xxxxxxx>
> Cc: Jeff Moyer <jmoyer@xxxxxxxxxx>
> Cc: Christoph Hellwig <hch@xxxxxx>
> Cc: Dave Chinner <david@xxxxxxxxxxxxx>
> Cc: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx>
> Cc: "Darrick J. Wong" <darrick.wong@xxxxxxxxxx>
> Cc: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
> Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> ---
[..]
> diff --git a/include/linux/mman.h b/include/linux/mman.h
> index 0e1de42c836f..7c9e3d11027f 100644
> --- a/include/linux/mman.h
> +++ b/include/linux/mman.h
> @@ -7,16 +7,6 @@
> #include <linux/atomic.h>
> #include <uapi/linux/mman.h>
>
> -#ifndef MAP_32BIT
> -#define MAP_32BIT 0
> -#endif
> -#ifndef MAP_HUGE_2MB
> -#define MAP_HUGE_2MB 0
> -#endif
> -#ifndef MAP_HUGE_1GB
> -#define MAP_HUGE_1GB 0
> -#endif

This removal was inadvertent. These fallback definitions are needed to
build on non-x86 archs; will fix.
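
For context, the reason the fallbacks in the removed hunk matter can be
sketched in userspace as below. The arch_optional_map_flags() helper is
hypothetical and only stands in for arch-independent code that references
these flags; with the 0 fallbacks it compiles everywhere and the flags
that an arch lacks drop out of the mask.

```c
#include <sys/mman.h>

/* Fallbacks: headers that do not provide a flag get a harmless 0. */
#ifndef MAP_32BIT
#define MAP_32BIT 0
#endif
#ifndef MAP_HUGE_2MB
#define MAP_HUGE_2MB 0
#endif
#ifndef MAP_HUGE_1GB
#define MAP_HUGE_1GB 0
#endif

/*
 * Hypothetical arch-independent helper: builds a mask of the optional
 * flags. Without the fallbacks above, this would fail to compile on any
 * arch whose headers omit one of the three names.
 */
static unsigned long arch_optional_map_flags(void)
{
	return MAP_32BIT | MAP_HUGE_2MB | MAP_HUGE_1GB;
}

/* A flag that fell back to 0 is treated as unsupported. */
static int map_flag_supported(unsigned long flag)
{
	return flag != 0;
}
```
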