Re: [PATCH linux-2.6.1-rc1-mm1] filemap_fdatawait.patch

From: Andrew Morton
Date: Thu Jan 01 2004 - 23:47:02 EST


Suparna Bhattacharya <suparna@xxxxxxxxxx> wrote:
>
> On Wed, Dec 31, 2003 at 03:42:39PM -0800, Andrew Morton wrote:
> > Daniel McNeil <daniel@xxxxxxxx> wrote:
> > >
> > > The other potential race in filemap_fdatawait() was that it
> > > removed the page from the locked list while waiting for a writeback
> > > and if there was a 2nd filemap_fdatawait() running on another cpu,
> > > it would not wait for the page being written since it would never see
> > > it on the list.
> >
> > That would only happen if one thread or the other was not running under
> > i_sem. The only path I see doing that is in generic_file_direct_IO()?
>
> Yes, and we should simply fix generic_file_direct_IO to avoid doing so.
> We anyway issue filemap_fdatawait later with i_sem held.
>
> The race that we need to worry about is between background writeouts
> (which don't take i_sem) and filemap_fdatawrite/filemap_fdatawait - i.e
> the first one discussed.

Well Daniel has raised a second race here, betwen filemap_fdatawait() and
filemap_fdatwait(). The background writeback code does not execute
filemap_datawait() anyway, so no prob.

Yes, extending i_sem coverage in the case O_DIRECT reads should suit. In
the (vastly) common case mapping->nrpages is zero anyway, so we shouldn't
even enter that code.

> > > + /*
> > > + * If the page is locked, it might be in process of being
> > > + * setup for writeback but without PG_writeback set
> > > + * and with PG_dirty cleared.
> > > + * (PG_dirty is cleared BEFORE PG_writeback is set)
> > > + * So, wait for the PG_locked to clear, then start over.
> > > + */
> > > + if (PageLocked(page)) {
> > > + page_cache_get(page);
> > > + spin_unlock(&mapping->page_lock);
> > > + wait_on_page_locked(page);
> > > + page_cache_release(page);
> > > + goto restart;
> > > + }
> >
> > Why is this a problem which needs addressing here? If some other thread is
> > in the process of starting I/O against this page then the page must have
> > been clean when this thread ran filemap_fdatawrite()?
>
> This is the same race that we have been discussing (background writer
> pulled this page off io_pages, put it on locked pages but hasn't set
> PG_writeback as yet). To me it seemed that Daniel's solution was just an
> alternative to what you proposed - i.e. adding lock_page() to filemap_fdatawait.
> I have to think a little about the fix -- AFAICS but we are all talking
> about the same (real) problem here.

Yup. Realish, anyway. Unless we can demonstrate that this is the cause of
the O_DIRECT data-exposure problems, this race isn't really very
interesting. It should be plugged though I guess.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/