Re: [PATCH] autofs4 deadlock during expire - kernel 2.6
From: Mike Waychison
Date: Wed Sep 24 2003 - 11:00:16 EST
Ian Kent wrote:
On Wed, 24 Sep 2003, Arjan van de Ven wrote:
On Wed, 2003-09-24 at 15:01, Ian Kent wrote:
This is a corrected patch for the autofs4 deadlock problem I posted about
@@ -206,6 +207,11 @@
interruptible_sleep_on(&wq->queue);
+ if (waitqueue_active(&wq->queue) && current != wq->owner) {
+ set_current_state(TASK_INTERRUPTIBLE);
+ schedule_timeout(wq->wait_ctr * (HZ/10));
+ }
+
this really really looks like you're trying to paper over a bug by
changing the timing somewhere instead of fixing it...
Agreed.
also are you sure the deadlock isn't because of the racy use of
interruptible_sleep_on?
I think the deadlock itself needs to be properly identified.
Could you explain where the deadlock is actually occurring? I skimmed
the automount 4 code as well as autofs4 and I don't see the
deadlock. The 'owner' in the case of an expiry will be a child process
of the daemon, within a call to ioctl(EXPIRE_MULTI), correct? Having it
be released from the waitqueue first should not affect the flow of
execution, nor should it break a deadlock.
I don't see how having it wake up before any other racing
processes solves anything.
I think Arjan is right in that the race is due to the nautilus process
entering the sleep_on after the call to wake_up(&wq->queue). I don't
know if a change to using a workqueue is best. How about refactoring
that chunk of code to use wait_event_interruptible on the queue, which
should be clear of any waitqueue/sleep_on races?
OK, so maybe I should offer suggestions instead of comments.
Please elaborate.
How about you try out this quick patch I threw together?
Mike Waychison
===== waitq.c 1.6 vs edited =====
--- 1.6/fs/autofs4/waitq.c Fri Feb 7 12:25:20 2003
+++ edited/waitq.c Wed Sep 24 15:48:30 2003
@@ -204,7 +204,7 @@
recalc_sigpending();
spin_unlock_irqrestore(&current->sighand->siglock, irqflags);
- interruptible_sleep_on(&wq->queue);
+ wait_event_interruptible(wq->queue, wq->name == NULL);
spin_lock_irqsave(&current->sighand->siglock, irqflags);
current->blocked = oldset;
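
For anyone who wants to see why testing a condition beats a bare sleep, here is a rough userspace sketch of the same idea. It is not kernel code and not part of the patch: it uses pthreads, and struct demo_waitq, demo_release(), demo_wait() and run_demo() are all made-up names for illustration. The "done" flag plays the role of the wq->name == NULL test. The waker finishes before the waiter ever looks at the queue, which is the ordering that would leave a bare sleep_on-style wait stuck.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the autofs4 wait queue entry. */
struct demo_waitq {
    pthread_mutex_t lock;
    pthread_cond_t  queue;
    bool            done;   /* plays the role of "wq->name == NULL" */
};

/* Waker side: record completion first, then wake every waiter. */
static void demo_release(struct demo_waitq *w)
{
    pthread_mutex_lock(&w->lock);
    w->done = true;
    pthread_cond_broadcast(&w->queue);
    pthread_mutex_unlock(&w->lock);
}

/*
 * Waiter side: test the condition before sleeping and again after every
 * wake-up. If the wake-up already happened, we never block at all.
 * Returns how many times the caller actually slept.
 */
static int demo_wait(struct demo_waitq *w)
{
    int sleeps = 0;

    pthread_mutex_lock(&w->lock);
    while (!w->done) {
        pthread_cond_wait(&w->queue, &w->lock);
        sleeps++;
    }
    pthread_mutex_unlock(&w->lock);
    return sleeps;
}

static void *demo_waker_thread(void *arg)
{
    demo_release(arg);
    return NULL;
}

/* Force the "wake-up happens before the waiter arrives" ordering. */
int run_demo(void)
{
    struct demo_waitq wq = {
        .lock  = PTHREAD_MUTEX_INITIALIZER,
        .queue = PTHREAD_COND_INITIALIZER,
        .done  = false,
    };
    pthread_t waker;
    int rc;

    rc = pthread_create(&waker, NULL, demo_waker_thread, &wq);
    assert(rc == 0);
    pthread_join(waker, NULL);   /* the wake-up is already over here */

    return demo_wait(&wq);       /* condition is true, so no sleep */
}
```

Built with -pthread, run_demo() returns without blocking because the waiter checks the flag before it sleeps. A waiter that slept unconditionally at that point would have missed the broadcast and waited forever.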