Re: [RFC][PATCH] Fix hang in posix_locks_deadlock()

From: George G. Davis
Date: Thu Oct 18 2007 - 14:58:21 EST


On Wed, Oct 17, 2007 at 02:51:57PM -0400, George G. Davis wrote:
> ---
> Not sure if this is the correct fix but it does resolve the hangs we're
> observing in posix_locks_deadlock().

Please disregard the previous patch, it's not quite right - causes occasional
segfaults and clearly did not retain the posix_same_owner() checks implemented
in the original code. Here's a new version which I believe retains the
intent of the original code:

diff --git a/fs/locks.c b/fs/locks.c
index 7f9a3ea..e012b27 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -702,14 +702,12 @@ static int posix_locks_deadlock(struct file_lock *caller_fl,
{
struct file_lock *fl;

-next_task:
if (posix_same_owner(caller_fl, block_fl))
return 1;
list_for_each_entry(fl, &blocked_list, fl_link) {
if (posix_same_owner(fl, block_fl)) {
- fl = fl->fl_next;
- block_fl = fl;
- goto next_task;
+ if (posix_same_owner(caller_fl, fl))
+ return 1;
}
}
return 0;


I'm not sure about those "fl = fl->fl_next; block_fl = fl;" statements,
first, the order of those statements seems reversed to me. Otherwise,
I think the intent was to advance the "fl" for loop variable to the next
entry in the list but it doesn't work out that way at all - the for
loop restarts from the beginning - this is where we get into an
infinite loop condition. Whether the test case I posted before is
valid or not, I reckon it shouldn't be possible for non-root Joe user
to contrive a test case which can hang the system as we're observing
with that test case. The above patch fixes the hang.

Comments greatly appreciated...

--
Regards,
George
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/