Re: [PATCH v2 06/17] mm: doc comment for scary spot in write_one_page

From: NeilBrown
Date: Wed Apr 12 2017 - 17:37:06 EST


On Wed, Apr 12 2017, Jeff Layton wrote:

> On Wed, 2017-04-12 at 07:38 -0700, Matthew Wilcox wrote:
>> On Wed, Apr 12, 2017 at 09:01:34AM -0400, Jeff Layton wrote:
>> > On Wed, 2017-04-12 at 08:06 -0400, Jeff Layton wrote:
>> > > Not sure what to do here just yet.
>> > >
>> > > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
>> > > ---
>> > > mm/page-writeback.c | 6 ++++++
>> > > 1 file changed, 6 insertions(+)
>> > >
>> > > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
>> > > index de0dbf12e2c1..3ac8399dc984 100644
>> > > --- a/mm/page-writeback.c
>> > > +++ b/mm/page-writeback.c
>> > > @@ -2388,6 +2388,12 @@ int write_one_page(struct page *page)
>> > > ret = mapping->a_ops->writepage(page, &wbc);
>> > > if (ret == 0) {
>> > > wait_on_page_writeback(page);
>> > > + /*
>> > > + * FIXME: is this racy? What guarantees that PG_error
>> > > + * will still be set once we get around to checking it?
>> > > + * What if writeback fails, but then a read is issued
>> > > + * before we check this, and that calls ClearPageError?
>> > > + */
>> > > if (PageError(page))
>> > > ret = -EIO;
>> > > }
>> >
>> > Ahh, we are always under the page lock here, and this is generally used
>> > for writing out directory pages anyway. I'm fine with dropping this
>> > patch unless someone else sees a problem here.
>>
>> ->writepage drops the page lock. We're still holding a refcount on this
>> page, but that's not going to prevent read being called. But maybe the
>> filesystem won't call read on a page that's marked as PageError?
>
> Hard to be sure there. I really wonder if that check is needed at all,
> the more I look at it. After all, we are calling writepage with
> WB_SYNC_ALL so we should get an error there.

WB_SYNC_ALL doesn't cause writepage to wait. It might case it to ask
for REQ_SYNC, so the write requests gets priority in the block layer.
WB_SYNC_ALL does cause writepages (with an 's') to wait.
(At least, that is how I read the code).

>
> Is it also possible these pages could be written back before that point
> (due to memory pressure or something) and that fail?

Probably, in which case clear_page_dirty_for_io() will fail and
write_one_page() will just unlock the page.

>
> Maybe we should just have a call to filemap_check_errors on exiting
> this function?

I'm leaning in that direction.

>
> With the the wb_err_t based stuff, we could change it to sample the
> wb_err early, and then use that to see if an error has occurred since
> then. Maybe we should even allow callers to pass a wb_err_t in here, so
> we can report errors that have occurred since a known point?

That feels to me like over-engineering. We would need to
unconditionally call writepage() for that to work.

We seem to be agreed that write errors for buffered writes are reported
per-address-space. To get per-page errors you have to use direct IO.
Let's focus on that policy and make it work.

Thanks,
NeilBrown

Attachment: signature.asc
Description: PGP signature