Re: Failure to release lock after CPU hot-unplug canceled

From: Benjamin Gilbert
Date: Tue Jan 09 2007 - 11:35:33 EST


Heiko Carstens wrote:
On Tue, Jan 09, 2007 at 05:57:40PM +0530, Srivatsa Vaddagiri wrote:
On Tue, Jan 09, 2007 at 01:17:38PM +0100, Heiko Carstens wrote:
The workqueue code grabs a lock on CPU_[UP|DOWN]_PREPARE and releases it
again on CPU_DOWN_FAILED/CPU_UP_CANCELED. If something in the callchain
returns NOTIFY_BAD the rest of the entries in the callchain won't be
called anymore. But DOWN_FAILED/UP_CANCELED will be called for every
entry.
So we might even end up with a mutex_unlock(&workqueue_mutex) even if
mutex_lock(&workqueue_mutex) hasn't been called...
>>
This is a known problem. Gautham had sent out patches to address them

http://lkml.org/lkml/2006/11/14/93

Looks like they are in latest mm tree. Perhaps the testcase should be
retried against latest mm.
>
Ah, nice! Wasn't aware of that. But I still think we should have a
CPU_DOWN_FAILED in case CPU_DOWN_PREPARED failed.
Also the slab cache code hasn't been changed to make use of the of the
new CPU_LOCK_[ACQUIRE|RELEASE] stuff. I'm going to send patches in reply
to this mail.

2.6.20-rc3-mm1 plus your patches fixes it for me.

Thanks
--Benjamin Gilbert

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/