Re: [PATCH v3] libnvdimm, region: sysfs trigger for nvdimm_flush()

From: Jeff Moyer
Date: Wed Apr 26 2017 - 16:39:17 EST


Dan Williams <dan.j.williams@xxxxxxxxx> writes:

> The nvdimm_flush() mechanism helps to reduce the impact of an ADR
> (asynchronous-dimm-refresh) failure. The ADR mechanism handles flushing
> platform WPQ (write-pending-queue) buffers when power is removed. The
> nvdimm_flush() mechanism performs that same function on-demand.
>
> When a pmem namespace is associated with a block device, an
> nvdimm_flush() is triggered with every block-layer REQ_FUA, or REQ_FLUSH
> request. These requests are typically associated with filesystem
> metadata updates. However, when a namespace is in device-dax mode,
> userspace (think database metadata) needs another path to perform the
> same flushing. In other words this is not required to make data
> persistent, but in the case of metadata it allows for a smaller failure
> domain in the unlikely event of an ADR failure.
>
> The new 'flush' attribute is visible when the individual DIMMs backing a
> given interleave-set are described by platform firmware. In ACPI terms
> this is "NVDIMM Region Mapping Structures" and associated "Flush Hint
> Address Structures". Reads return "1" if the region supports triggering
> WPQ flushes on all DIMMs. Reads return "0" the flush operation is a
> platform nop, and in that case the attribute is read-only.

I can make peace with exposing this to userspace, though I am mostly
against its use. However, sysfs feels like the wrong interface.
Believe it or not, I'd rather see this implemented as an ioctl.

This isn't a NACK, it's me giving my opinion. Do with it what you will.

Cheers,
Jeff