Re: open(2) says O_DIRECT works on 512 byte boundaries?

From: KOSAKI Motohiro
Date: Thu Jan 29 2009 - 02:10:55 EST


(CC to andrea)

> On Wed, 28 Jan 2009 13:33:22 -0800
> Greg KH <greg@xxxxxxxxx> wrote:
>
> > In looking at open(2), it says that O_DIRECT works on 512 byte boundaries
> > with the 2.6 kernel release:
> > Under Linux 2.4, transfer sizes, and the alignment of the user
> > buffer and the file offset must all be multiples of the logical
> > block size of the file system. Under Linux 2.6, alignment to
> > 512-byte boundaries suffices.
> >
> > However if you try to access an O_DIRECT opened file with a buffer that
> > is PAGE_SIZE aligned + 512 bytes, it fails in a bad way (wrong data is
> > read.)
> >
>
> IIUC, this is not related to the 512-byte boundary. It is just a race between
> direct I/O and copy-on-write. A copy-on-write on a page while DIO is reading
> into it is the problem.

Yes.
Greg's reproducer is a bit misleading.

> for (j = 0; j < workers; j++) {
> 	worker[j].offset = offset + j * PAGE_SIZE;
> 	worker[j].buffer = buffer + align + j * PAGE_SIZE;
> 	worker[j].length = PAGE_SIZE;
> }

This code means:
- if align == 0, each reader thread touches only one page,
  and each page is touched by only one thread.
- if align != 0, each reader thread touches two pages,
  and each page is touched by two threads.

So the race happens only if align != 0.
We discussed this issue with andrea last month.
("Corruption with O_DIRECT and unaligned user buffers" thread)

As far as I know, he is working on fixing this issue now.
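
The page-sharing argument above can be checked with a small userspace
sketch. This is not part of Greg's reproducer: the helper names and the
fixed 4096-byte SKETCH_PAGE_SIZE are assumptions for illustration, and the
buffer base is assumed to be page aligned with align smaller than one page.

```c
#include <assert.h>

/* Assumed page size for this sketch; the real value is PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Worker j reads SKETCH_PAGE_SIZE bytes into buffer + align + j * PAGE_SIZE.
 * With a page-aligned buffer base, these helpers return the index of the
 * first and last page (relative to the base) that the worker's buffer covers.
 */
static unsigned long first_page(unsigned long align, unsigned long j)
{
	assert(align < SKETCH_PAGE_SIZE);	/* assumption: sub-page offset */
	return (align + j * SKETCH_PAGE_SIZE) / SKETCH_PAGE_SIZE;
}

static unsigned long last_page(unsigned long align, unsigned long j)
{
	assert(align < SKETCH_PAGE_SIZE);
	return (align + j * SKETCH_PAGE_SIZE + SKETCH_PAGE_SIZE - 1)
		/ SKETCH_PAGE_SIZE;
}

/* Nonzero if worker j and worker j + 1 write into the same page. */
static int shares_page_with_next(unsigned long align, unsigned long j)
{
	return last_page(align, j) >= first_page(align, j + 1);
}
```

With align == 0 every worker stays inside its own page, so
shares_page_with_next() is 0. With align == 512 each worker's buffer spills
into the following page, which is also the first page of the next worker.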


>
> It may be true that if the buffer is aligned to the page size, no copy-on-write
> will happen in a typical program. But with HugeTLB pages, which also do
> copy-on-write, data corruption will happen again. A HugeTLB-aligned buffer is
> nonsense.




