Re: [PATCH] Describe race of direct read and fork for unaligned buffers

From: Jan Kara
Date: Wed May 02 2012 - 05:18:36 EST


On Wed 02-05-12 19:09:54, Nick Piggin wrote:
> On 2 May 2012 18:17, Jan Kara <jack@xxxxxxx> wrote:
> > On Wed 02-05-12 01:50:46, Nick Piggin wrote:
>
> >> KOSAKI-san is correct, I think.
> >>
> >> The race is something like this:
> >>
> >> DIO-read
> >>     page = get_user_pages()
> >>                                                         fork()
> >>                                                             COW(page)
> >>                                                          touch(page)
> >>     DMA(page)
> >>     page_cache_release(page);
> >>
> >> So whether parent or child touches the page, determines who gets the
> >> actual DMA target, and who gets the copy.
> > >  OK, this is roughly what I understood from the original threads as well. So
> > if our buffer is page aligned and its size is page aligned, you would hit
> > the corruption only if you do modify the buffer while IO to / from that buffer
> > is in progress. And that would seem like a really bad programming practice
> > anyway. So I still believe that having everything page size aligned will
> > effectively remove the problem although I agree it does not aim at the core
> > of it.
>
> I see what you mean.
>
> I'm not sure, though. For most apps it's bad practice I think. If you get into the
> realm of sophisticated, performance-critical IO/storage managers, it would
> not surprise me if such concurrent buffer modifications could be allowed.
> We allow exactly such a thing in our pagecache layer. Although probably
> those would be using shared mmaps for their buffer cache.
>
> I think it is safest to make a default policy of asking for IOs against private
> cow-able mappings to be quiesced before fork, so there are no surprises
> or reliance on COW details in the mm. What do you think?
Yes, I agree that (and MADV_DONTFORK) is probably the best thing to have
in documentation. Otherwise it's a bit too hairy...
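
For illustration, the MADV_DONTFORK variant could look roughly like the
sketch below. This is only a minimal sketch, not tested against the race
itself: dio_buf_alloc() and dio_buf_free() are hypothetical helper names,
and it assumes the buffer comes from its own anonymous private mapping so
that excluding it from the child does not take unrelated heap data with it.

```c
#define _GNU_SOURCE             /* MAP_ANONYMOUS, MADV_DONTFORK */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: allocate a page-aligned buffer for direct IO whose
 * length is rounded up to a whole number of pages, and mark it
 * MADV_DONTFORK so a child created by fork() does not inherit the range
 * (and so no COW happens on pages that may be DMA targets).
 * Returns NULL on failure; *alloc_len receives the rounded length.
 */
static void *dio_buf_alloc(size_t len, size_t *alloc_len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len == 0)
        return NULL;

    size_t psz = (size_t)page;
    if (len > SIZE_MAX - psz)           /* rounding would overflow */
        return NULL;
    size_t rounded = (len + psz - 1) / psz * psz;

    /* A dedicated private anonymous mapping is always page aligned. */
    void *buf = mmap(NULL, rounded, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return NULL;

    /* Keep the range out of any child address space created by fork(). */
    if (madvise(buf, rounded, MADV_DONTFORK) != 0) {
        munmap(buf, rounded);
        return NULL;
    }

    if (alloc_len)
        *alloc_len = rounded;
    return buf;
}

/* Hypothetical counterpart: release a buffer from dio_buf_alloc(). */
static void dio_buf_free(void *buf, size_t alloc_len)
{
    if (buf)
        munmap(buf, alloc_len);
}
```

The child must not touch such a buffer, since the range simply does not
exist in its address space. Applications that cannot use MADV_DONTFORK
would instead have to wait for outstanding direct IO to complete before
calling fork().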

Honza
--
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR