Re: regression introduced by "block: Add support for DAX reads/writes to block devices"

From: Boaz Harrosh
Date: Sun Aug 09 2015 - 04:53:05 EST


On 08/06/2015 11:34 PM, Dave Chinner wrote:
> On Thu, Aug 06, 2015 at 10:52:47AM +0300, Boaz Harrosh wrote:
>> On 08/06/2015 06:24 AM, Dave Chinner wrote:
>>> On Wed, Aug 05, 2015 at 09:42:54PM -0400, Linda Knippers wrote:
>>>> On 08/05/2015 06:01 PM, Dave Chinner wrote:
>>>>> On Wed, Aug 05, 2015 at 04:19:08PM -0400, Jeff Moyer wrote:
>> <>
>>>>>>
>>>>>> I sat down with Linda to look into it, and the problem is that mkfs.xfs
>>>>>> sets the blocksize of the device to 512 (via BLKBSZSET), and then reads
>>>>>> from the last sector of the device. This results in dax_io trying to do
>>>>>> a page-sized I/O at 512 bytes from the end of the device.
>>>>>
>>
>> This part I do not understand. How is mkfs.xfs reading the sector?
>> Is it through open(/dev/pmem0,...)? O_DIRECT?
>
> mkfs.xfs uses O_DIRECT. Only if open(O_DIRECT) fails or mkfs.xfs is
> told that it is working on an image file does it fall back to
> buffered IO. All of the XFS userspace tools work this way to prevent
> page cache pollution issues with read-once or write-once data during
> operation.
>

Thanks, yes, that makes sense. This is a bug in the DAX implementation of
bdev. Since, as you know, with DAX there is no difference between
O_DIRECT and buffered IO, we must support any aligned IO. I bet it
has something to do with bdev not giving 4K buffer-heads to dax.c.
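
For reference, the failing case Jeff describes comes down to simple
arithmetic. Below is a rough userspace sketch, not kernel code: the helper
names are made up for illustration, and the 512-byte sector and 4K page
sizes are assumptions taken from this thread.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE  512u   /* block size mkfs.xfs sets via BLKBSZSET */
#define PAGE_SIZE_4K 4096u  /* assumed page size */

/* Hypothetical helper: byte offset of the last sector of a device. */
uint64_t last_sector_offset(uint64_t dev_bytes)
{
	assert(dev_bytes >= SECTOR_SIZE && dev_bytes % SECTOR_SIZE == 0);
	return dev_bytes - SECTOR_SIZE;
}

/*
 * Hypothetical helper: number of bytes by which an I/O of `len` bytes
 * starting at `offset` runs past the end of a device of `dev_bytes`.
 * Returns 0 when the I/O fits.
 */
uint64_t io_overrun(uint64_t dev_bytes, uint64_t offset, uint64_t len)
{
	assert(offset <= dev_bytes);
	uint64_t room = dev_bytes - offset;

	return len > room ? len - room : 0;
}
```

With a 1GiB device, io_overrun(dev, last_sector_offset(dev), SECTOR_SIZE)
is 0, while the same call with PAGE_SIZE_4K gives 3584: a page-sized I/O
at the last sector runs 3584 bytes past the end of the device.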

Or ... it might just be the infamous bug where the partition they
used did not start on a 4K-aligned sector, so the last-sector IO came
out wrong after partition translation. In that case the bug should be
fixed by Vishal Verma's patch:
https://lists.01.org/pipermail/linux-nvdimm/2015-July/001555.html
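
A rough sketch of the alignment problem, in userspace C rather than kernel
code. The helper names are hypothetical, 512-byte sectors and 4K pages are
assumed, and the start sectors 63 and 2048 are example values only, not
taken from the reported setup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTORS_PER_PAGE 8u /* 4096 / 512, assuming 4K pages */

/* Hypothetical helper: partition-relative sector to whole-device sector. */
uint64_t abs_sector(uint64_t part_start, uint64_t rel_sector)
{
	assert(rel_sector <= UINT64_MAX - part_start); /* no wraparound */
	return part_start + rel_sector;
}

/* Hypothetical helper: does this sector sit on a 4K page boundary? */
bool sector_is_page_aligned(uint64_t sector)
{
	return sector % SECTORS_PER_PAGE == 0;
}
```

A partition starting at sector 2048 keeps partition-relative sector 8 page
aligned (absolute sector 2056), while a start at sector 63 moves it to
absolute sector 71, which is not page aligned.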

Vishal, I think we should add CC: stable@xxxxxxxxxxxxxxx to your patch
because of these fdisk bugs.

> Cheers,
> Dave.

Thanks
Boaz
