Re: [NFS] 2.6.17.8 - do_vfs_lock: VFS is out of sync with lock manager!

From: Jesper Juhl
Date: Tue Nov 21 2006 - 07:44:04 EST


On 21/08/06, Trond Myklebust <trond.myklebust@xxxxxxxxxx> wrote:
On Mon, 2006-08-21 at 13:34 +1000, Neil Brown wrote:
> Looking in fs/nfs/file.c (at 2.6.18-rc4-mm1 if it matters, but 2.6.17
> is much the same)
>
> - do_vfs_lock is only called when the filesystem was mounted with
> -o nolock EXCEPT
> - If a lock request to the server in interrupted (when mounted with
> -o intr) then do_vfs_lock is called to try to get the lock
> locally. Normally equivalent code will be called inside
> fs/lockd/clntproc.c when the server replies that the lock has been
> gained. In the case of an interrupt though this doesn't happen
> but the lock may still have happened on the server. So we record
> locally that the lock was gained, to ensure that it gets unlocked
> when the process exits.
>
> As you don't have '-o nolocks' you must be hitting the second case.
> The lock call to the server returns -EINTR or -ERESTARTSYS and
> do_vfs_lock is called just-in-case.
> As this is a just-in-case call, it is quite possible that the lock is
> held by some other process, so getting an error is entirely possible.
> So printing the message in this case seems wrong.
>
> On the other hand, printing the message in any other case seems wrong
> too, as server locking is not being used, so there is nothing to get
> out of sync with.
>
> As a further complication, I don't think that in the just-in-case
> situation that it should risk waiting for the lock.
> Now maybe we can be sure there is a pending signal which will break
> out of any wait (though I'm worried about -ERESTARTSYS - that doesn't
> imply a signal does it?), but I would feel more comfortable if
> FL_SLEEP were turned off in that path.
>
> So: Trond: Any obvious errors in the above?
> Is the following patch ok?

Could we instead replace it with a dprintk() that returns the value of
"res"? That will keep it useful for debugging purposes.


How about the below?
(compile tested only)

Neil: I left your Signed-off-by line since I just modified your patch slightly.

Since Gmail will probably mangle the inline patch, it is attached as well.


Signed-off-by: Neil Brown <neilb@xxxxxxx>
Signed-off-by: Jesper Juhl <jesper.juhl@xxxxxxxxx>
--

fs/nfs/file.c | 11 +++++++----
1 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index cc93865..22572af 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -428,8 +428,8 @@ static int do_vfs_lock(struct file *file
BUG();
}
if (res < 0)
- printk(KERN_WARNING "%s: VFS is out of sync with lock
manager!\n",
- __FUNCTION__);
+ dprintk("%s: VFS is out of sync with lock manager (res
= %d)!\n",
+ __FUNCTION__, res);
return res;
}

@@ -479,10 +479,13 @@ static int do_setlk(struct file *filp, i
* we clean up any state on the server. We therefore
* record the lock call as having succeeded in order to
* ensure that locks_remove_posix() cleans it out when
- * the process exits.
+ * the process exits. Make sure not to sleep if
+ * someone else holds the lock.
*/
- if (status == -EINTR || status == -ERESTARTSYS)
+ if (status == -EINTR || status == -ERESTARTSYS) {
+ fl->fl_flags &= ~FL_SLEEP;
do_vfs_lock(filp, fl);
+ }
} else
status = do_vfs_lock(filp, fl);
unlock_kernel();



--
Jesper Juhl <jesper.juhl@xxxxxxxxx>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

Signed-off-by: Neil Brown <neilb@xxxxxxx>
Signed-off-by: Jesper Juhl <jesper.juhl@xxxxxxxxx>
--

fs/nfs/file.c | 11 +++++++----
1 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index cc93865..22572af 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -428,8 +428,8 @@ static int do_vfs_lock(struct file *file
BUG();
}
if (res < 0)
- printk(KERN_WARNING "%s: VFS is out of sync with lock manager!\n",
- __FUNCTION__);
+ dprintk("%s: VFS is out of sync with lock manager (res = %d)!\n",
+ __FUNCTION__, res);
return res;
}

@@ -479,10 +479,13 @@ static int do_setlk(struct file *filp, i
* we clean up any state on the server. We therefore
* record the lock call as having succeeded in order to
* ensure that locks_remove_posix() cleans it out when
- * the process exits.
+ * the process exits. Make sure not to sleep if
+ * someone else holds the lock.
*/
- if (status == -EINTR || status == -ERESTARTSYS)
+ if (status == -EINTR || status == -ERESTARTSYS) {
+ fl->fl_flags &= ~FL_SLEEP;
do_vfs_lock(filp, fl);
+ }
} else
status = do_vfs_lock(filp, fl);
unlock_kernel();