Re: [PATCH v2] nd_blk: add support for "read flush" DSM flag

From: Dan Williams
Date: Thu Aug 20 2015 - 13:59:51 EST


On Thu, Aug 20, 2015 at 9:44 AM, Ross Zwisler
<ross.zwisler@xxxxxxxxxxxxxxx> wrote:
> On Wed, 2015-08-19 at 16:06 -0700, Dan Williams wrote:
>> On Wed, Aug 19, 2015 at 3:48 PM, Ross Zwisler
>> <ross.zwisler@xxxxxxxxxxxxxxx> wrote:
>> > Add support for the "read flush" _DSM flag, as outlined in the DSM spec:
>> >
>> > http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
>> >
>> > This flag tells the ND BLK driver that it needs to flush the cache lines
>> > associated with the aperture after the aperture is moved but before any
>> > new data is read. This ensures that any stale cache lines from the
>> > previous contents of the aperture will be discarded from the processor
>> > cache, and the new data will be read properly from the DIMM. We know
>> > that the cache lines are clean and will be discarded without any
>> > writeback because either a) the previous aperture operation was a read,
>> > and we never modified the contents of the aperture, or b) the previous
>> > aperture operation was a write and we must have written back the dirtied
>> > contents of the aperture to the DIMM before the I/O was completed.
>> >
>> > By supporting the "read flush" flag we can also change the ND BLK
>> > aperture mapping from write-combining to write-back via memremap().
>> >
>> > In order to add support for the "read flush" flag I needed to add a
>> > generic routine to invalidate cache lines, mmio_flush_range(). This is
>> > protected by the ARCH_HAS_MMIO_FLUSH Kconfig variable, and is currently
>> > only supported on x86.
>> >
>> > Signed-off-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
>> > Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
>> [..]
>> > diff --git a/drivers/acpi/nfit.c b/drivers/acpi/nfit.c
>> > index 7c2638f..56fff01 100644
>> > --- a/drivers/acpi/nfit.c
>> > +++ b/drivers/acpi/nfit.c
>> [..]
>> > static int acpi_nfit_blk_single_io(struct nfit_blk *nfit_blk,
>> > @@ -1078,11 +1078,16 @@ static int acpi_nfit_blk_single_io(struct nfit_blk *nfit_blk,
>> > }
>> >
>> > if (rw)
>> > - memcpy_to_pmem(mmio->aperture + offset,
>> > + memcpy_to_pmem(mmio->addr.aperture + offset,
>> > iobuf + copied, c);
>> > - else
>> > + else {
>> > + if (nfit_blk->dimm_flags & ND_BLK_READ_FLUSH)
>> > + mmio_flush_range((void __force *)
>> > + mmio->addr.aperture + offset, c);
>> > +
>> > memcpy_from_pmem(iobuf + copied,
>> > - mmio->aperture + offset, c);
>> > + mmio->addr.aperture + offset, c);
>> > + }
>>
>> Why is the flush inside the "while (len)" loop? I think it should be
>> done immediately after the call to write_blk_ctl() since that is the
>> point at which the aperture becomes invalidated, and not prior to each
>> read within a given aperture position. Taking it a bit further, we
>> may be writing the same address into the control register as was there
>> previously so we wouldn't need to flush in that case.
>
> The reason I was doing it in the "while (len)" loop is that you have to walk
> through the interleave tables, reading each segment until you have read 'len'
> bytes. If we were to invalidate right after the write_blk_ctl(), we would
> essentially have to re-create the "while (len)" loop, hop through all the
> segments doing the invalidation, then run through the segments again doing the
> actual I/O.
>
> It seemed a lot cleaner to just run through the segments once, invalidating
> and reading each segment individually.

I agree it's cleaner if the flush has to follow the de-interleave, but
why consider interleave at all? In other words, just flush the entire
aperture unconditionally. Whether or not the read covers all of the
aperture, the whole aperture is invalid because it has moved. I'm not
seeing the benefit of carefully letting stale data stay in the cache a
bit longer.
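
To make the suggestion concrete, here is a rough user-space sketch, not
kernel code and not taken from the patch. Everything in it is a
hypothetical stand-in: struct mock_blk, mock_flush_range() (in place of
mmio_flush_range()), mock_move_aperture() (in place of the
write_blk_ctl() step), and the APERTURE_SIZE and SEGMENT_SIZE values.
It only shows the control flow: one flush of the whole aperture when it
moves, and no flush inside the per-segment read loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define APERTURE_SIZE 8192u /* hypothetical aperture size */
#define SEGMENT_SIZE  256u  /* hypothetical interleave segment size */

struct mock_blk {
	uint8_t aperture[APERTURE_SIZE]; /* stand-in for the mapped aperture */
	uint64_t ctl;                    /* last value written to the control register */
	unsigned int flush_calls;        /* number of times the flush stub ran */
	size_t flushed_bytes;            /* total bytes handed to the flush stub */
};

/* Stand-in for mmio_flush_range(): records what would be invalidated. */
static void mock_flush_range(struct mock_blk *blk, size_t offset, size_t len)
{
	(void)offset;
	blk->flush_calls++;
	blk->flushed_bytes += len;
}

/*
 * Stand-in for the control register write. The aperture contents become
 * stale at this point, so the whole aperture is flushed once, regardless
 * of how much of it the following read will touch.
 */
static void mock_move_aperture(struct mock_blk *blk, uint64_t ctl,
			       int read_flush)
{
	blk->ctl = ctl;
	if (read_flush)
		mock_flush_range(blk, 0, APERTURE_SIZE);
}

/* Walk the segments once, copying data out; no flush inside the loop. */
static void mock_read(struct mock_blk *blk, uint8_t *dst, size_t offset,
		      size_t len)
{
	size_t copied = 0;

	assert(offset <= APERTURE_SIZE && len <= APERTURE_SIZE - offset);

	while (len) {
		size_t in_seg = SEGMENT_SIZE - (offset % SEGMENT_SIZE);
		size_t c = len < in_seg ? len : in_seg;

		memcpy(dst + copied, blk->aperture + offset, c);
		copied += c;
		offset += c;
		len -= c;
	}
}
```

With this shape, a read that spans several segments costs one flush
call per aperture move instead of one per segment. The sketch does not
model the real de-interleave math, and it does not try to skip the
flush when the same control value is rewritten.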