Re: Fw: [PATCH -mm] workqueue: debug possible endless loop in cancel_rearming_delayed_work

From: Oleg Nesterov
Date: Wed Apr 25 2007 - 09:42:09 EST


On 04/25, Jarek Poplawski wrote:
>
> On Wed, Apr 25, 2007 at 02:20:38PM +0200, Jarek Poplawski wrote:
> > 2 cents more...
> ...
> > On Tue, Apr 24, 2007 at 10:55:37PM +0400, Oleg Nesterov wrote:
> > > + do {
> > > + retry = 1;
>
> Of course this'll be shorter:
>
> retry = 0;

No, this would be wrong. Note the comment about CPU-hotplug below,
we should retry if cwq was changed.

> > > + spin_lock_irq(&cwq->lock);
> > > + /* CPU_DEAD in progress may change cwq */
> > > + if (likely(cwq == get_wq_data(work))) {
> > > + list_del_init(&work->entry);
> > > + __set_bit(WORK_STRUCT_PENDING, work_data_bits(work));
> > > + retry = try_to_del_timer_sync(&dwork->timer) < 0;
> > > + }
> > > + spin_unlock_irq(&cwq->lock);
> > > + } while (unlikely(retry));

> 1. If delayed_work_timer_fn of this work is fired and is waiting
> on the above spin_lock then, after above spin_unlock, the work
> will be queued.

No, in that case try_to_del_timer_sync() returns -1.

> Probably this is also possible without timer i.e.
> with queue_work.

Yes, thanks. While adding cpu-hotplug check I forgot to add ->current_work
check, which is needed to actually implement this

> > Note that cancel_rearming_delayed_work() now can handle the works
> > which re-arm itself via queue_work(), not only queue_delayed_work().

part. I'll resend after fix.

> 2. If this function is fired after setting _PENDING flag in
> queue_delayed_work_on, but before add_timer, this
> try_to_del_timer_sync loop would miss this, too.

same as above, thanks.

Oleg.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/