Re: [PATCH v3] libnvdimm, region: sysfs trigger for nvdimm_flush()

From: Jeff Moyer
Date: Thu Apr 27 2017 - 09:45:40 EST


Dan Williams <dan.j.williams@xxxxxxxxx> writes:

> On Wed, Apr 26, 2017 at 1:38 PM, Jeff Moyer <jmoyer@xxxxxxxxxx> wrote:
>> Dan Williams <dan.j.williams@xxxxxxxxx> writes:
>>
>>> The nvdimm_flush() mechanism helps to reduce the impact of an ADR
>>> (asynchronous-dimm-refresh) failure. The ADR mechanism handles flushing
>>> platform WPQ (write-pending-queue) buffers when power is removed. The
>>> nvdimm_flush() mechanism performs that same function on-demand.
>>>
>>> When a pmem namespace is associated with a block device, an
>>> nvdimm_flush() is triggered with every block-layer REQ_FUA, or REQ_FLUSH
>>> request. These requests are typically associated with filesystem
>>> metadata updates. However, when a namespace is in device-dax mode,
>>> userspace (think database metadata) needs another path to perform the
>>> same flushing. In other words this is not required to make data
>>> persistent, but in the case of metadata it allows for a smaller failure
>>> domain in the unlikely event of an ADR failure.
>>>
>>> The new 'flush' attribute is visible when the individual DIMMs backing a
>>> given interleave-set are described by platform firmware. In ACPI terms
>>> this is "NVDIMM Region Mapping Structures" and associated "Flush Hint
>>> Address Structures". Reads return "1" if the region supports triggering
>>> WPQ flushes on all DIMMs. Reads return "0" the flush operation is a
>>> platform nop, and in that case the attribute is read-only.
>>
>> I can make peace with exposing this to userspace, though I am mostly
>> against its use. However, sysfs feels like the wrong interface.
>> Believe it or not, I'd rather see this implemented as an ioctl.
>>
>> This isn't a NACK, it's me giving my opinion. Do with it what you will.
>
> I hate ioctls with a burning passion so I can't get on board with that
> change, but perhaps the sentiment behind it is that this is too
> visible and too attractive being called "flush" in sysfs? Would a name
> more specific to the mechanism make it more palatable? Like
> "flush_hint_trigger" or "wpq_drain"?

The sentiment is that programs shouldn't have to grovel around in sysfs
to do stuff related to an open file descriptor or mapping. I don't take
issue with the name. I do worry that something like 'wpq_drain' may be
too platform specific, though. The NVM Programming Model specification
is going to call this "deep flush", so maybe that will give you
some inspiration if you do want to change the name.

Cheers,
Jeff