nfs_dentry_iput() BUG and fix?

Claus-Justus Heine (claus@momo.math.rwth-aachen.de)
20 Jan 1998 11:13:26 +0100


Hi,

while trying to adopt my NFS-swap patches to 2.1.79 I stumbled over
the following symptons. I tried to manipulate some file with a
vi-clone (elvis). When trying to save nfs_dentry_iput() would be
activated and the system got in an infinite loop, producing

nfs_dentry_iput: flushing /tmp/elv<something>

messages.

After some struggle to understand what was going on I managed to
figure out what happens.

nfs_dentry_iput() calling
nfs_flush_dirty_pages() calling
nfs_flush_pages()

After that, nfs_flush_dirty_pages() should wait on the pending
write-back request, but only when nfs_flush_pages() doesn't return
NULL. Code fragment from nfs_flush_dirty_pages():

/* Flush all pending writes for the pid and file region */
last = nfs_flush_pages(inode, pid, offset, len, 0);
if (last == NULL)
break;
wait_on_write_request(last);

Now, nfs_flush_pages() may return NULL in two cases:

a) no write request is pending. Fine.

b) the write-back request is already in progress. BAD.

b) means that whenever the write back request already has been
triggered the system might be caught in a loop. This happens because
the nfs-write-back request might sleep while waiting for the
nfs-server to answer. But as nfs_flush_pages() returns NULL in this
case the function nfs_flush_dirty_pages() not wait and return
immediately to its caller nfs_dentry_iput().

Now, nfs_dentry_iput() simply checks out that the write-back request
is still pending and calls again nfs_flush_dirty_pages(). And so on.

Don't know whether this also happens with an SMP box. In that case the
rpciod daemon probably will eventually get some CPU time on the other
processor and complete the write back request. But on an uni-processor
system this just won't work.

The following patch maybe fixes the problem. It changes
nfs_flush_pages() to also return the write-back requests to its caller
if the write back request already is in progress. Maybe this has side
effects?

Cheers

Claus

diff -u --recursive --new-file linux-2.1.79/fs/nfs/write.c linux/fs/nfs/write.c
--- linux-2.1.79/fs/nfs/write.c Tue Jan 13 12:15:29 1998
+++ linux/fs/nfs/write.c Tue Jan 20 10:47:30 1998
@@ -641,21 +641,19 @@
req->wb_task.tk_pid,
req->wb_inode->i_dev, req->wb_inode->i_ino,
req->wb_page->offset, req->wb_flags);
- if (!WB_INPROGRESS(req)) {
- rqoffset = req->wb_page->offset + req->wb_offset;
- rqend = rqoffset + req->wb_bytes;
-
- if (rqoffset < end && offset < rqend
+ rqoffset = req->wb_page->offset + req->wb_offset;
+ rqend = rqoffset + req->wb_bytes;
+
+ if (rqoffset < end && offset < rqend
&& (pid == 0 || req->wb_pid == pid)) {
- if (!WB_HAVELOCK(req)) {
+ if (!WB_INPROGRESS(req) && !WB_HAVELOCK(req)) {
#ifdef NFS_DEBUG_VERBOSE
printk("nfs_flush: flushing inode=%ld, %d @ %lu\n",
req->wb_inode->i_ino, req->wb_bytes, rqoffset);
#endif
- nfs_flush_request(req);
- }
- last = req;
+ nfs_flush_request(req);
}
+ last = req;
}
if (invalidate)
req->wb_flags |= NFS_WRITE_INVALIDATE;