Re: NFS infinite loop in filemap_fault()

From: Josef Bacik
Date: Thu May 08 2008 - 09:22:18 EST


On Thu, May 08, 2008 at 08:47:09AM +0200, Miklos Szeredi wrote:
> > Page fault on NFS apparently goes into an infinite loop if the read on
> > the server fails.
> >
> > I don't understand the NFS readpage code, but the filemap_fault() code
> > looks somewhat suspicious:
> >
> > /*
> > * Umm, take care of errors if the page isn't up-to-date.
> > * Try to re-read it _once_. We do this synchronously,
> > * because there really aren't any performance issues here
> > * and we need to check for errors.
> > */
> > ClearPageError(page);
> > error = mapping->a_ops->readpage(file, page);
> > page_cache_release(page);
> >
> > if (!error || error == AOP_TRUNCATED_PAGE)
> > goto retry_find;
> >
> > The comment doesn't seem to match what the it actually does: if
> > ->readpage() is asynchronous, then this will just repeat everything,
> > without any guarantee that it will re-read once.
>
> This patch fixes it. It's probably wrong in some subtle way though...
>
> Miklos
>
>
> ---
> mm/filemap.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> Index: linux.git/mm/filemap.c
> ===================================================================
> --- linux.git.orig/mm/filemap.c 2008-05-08 08:17:22.000000000 +0200
> +++ linux.git/mm/filemap.c 2008-05-08 08:19:59.000000000 +0200
> @@ -1461,6 +1461,12 @@ page_not_uptodate:
> */
> ClearPageError(page);
> error = mapping->a_ops->readpage(file, page);
> + if (!error && !PageUptodate(page)) {

Shouldn't you have (!error || error != AOP_TRUNCATED_PAGE), since the fs can
return AOP_TRUNCATED_PAGE if it needs vfs to try the readpage again? Things
like OCFS2/GFS2 do this, they send of a lock request and return
AOP_TRUNCATED_PAGE so that when we come back into readpage we are already
holding our lock without blocking while holding the page lock. Thanks,

Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/