Re: 2.4.0test1-ac14: smp deadlock

From: Andrew Morton (andrewm@uow.edu.au)
Date: Mon Jun 12 2000 - 05:30:50 EST


David Woodhouse wrote:
>
> andrewm@uow.edu.au said:
> > Unfortunately schedule_timeout() doesn't actually call del_timer_sync.
> >
>
> Er,... it did in -ac14. It looked 'obviously correct' to me. So I sent the
> patch to Alan with a question mark indicating that I wanted him to
> comment rather than just apply it. In hindsight, perhaps I should have made
> that more explicit.

linux-kernel@vger.rutgers.edu :-)

> Why _doesn't_ it work, though?

It needs a timer_exit(&timer) at the end of the handler:

static void process_timeout(unsigned long __data)
{
        struct task_struct * p = (struct task_struct *) __data;

        wake_up_process(p);
+       timer_exit(&p->process_timeout_timer);
}

The timer_exit() call tells del_timer_sync() that the timer handler has
done all its work, and that del_timer_sync() may now stop spinning.
It's pretty 'orrible...
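
To make the handshake concrete, here is a rough single-threaded userspace
model of it. This is only a sketch, not the kernel source: every model_*
name is made up for illustration, the 'running' flag is an assumption
about how the handshake is tracked, and the spin is bounded so that the
"handler never says it is finished" case returns instead of hanging.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's per-timer bookkeeping. */
struct model_timer {
    bool running;   /* true from handler entry until the handler signals exit */
};

/* Stand-in for timer_exit(): the last thing the handler does. */
static void model_timer_exit(struct model_timer *t)
{
    assert(t != NULL);
    t->running = false;
}

/* Stand-in for the timer core invoking a handler. call_exit selects
 * whether the handler ends with model_timer_exit() or forgets to. */
static void model_run_handler(struct model_timer *t, bool call_exit)
{
    assert(t != NULL);
    t->running = true;      /* marked before the handler body runs */
    /* ... handler body, e.g. wake_up_process(p) ... */
    if (call_exit)
        model_timer_exit(t);
}

/* Bounded stand-in for the spin in del_timer_sync().
 * Returns 0 once the handler has signalled exit, or -1 after giving up;
 * the real thing, as described above, just keeps spinning. */
static int model_del_timer_sync(const struct model_timer *t, unsigned max_spins)
{
    unsigned spins = 0;

    assert(t != NULL);
    while (t->running) {
        if (spins++ >= max_spins)
            return -1;
    }
    return 0;
}
```

Running a handler with call_exit set and then calling
model_del_timer_sync() returns 0 straight away; with call_exit clear it
gives up with -1, which is the model's version of spinning forever.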

I think this is a worthwhile change. With the code as it stands there's
a good chance that the handler will call wake_up_process() on a
currently-running process, or even on one which has just gone back to
sleep and didn't want to wake up a few microseconds later. Has this
been observed in the wild??

So the patch should:

- Add 'struct timer_list process_timeout_timer;' to struct task_struct
- Arrange for 'process_timeout_timer' to be initialised correctly
  via INIT_TASK() and probably in sys_clone() somewhere via init_timer()
- Alter process_timeout() and schedule_timeout() appropriately.
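
As a rough illustration of how those three steps fit together, here is a
userspace sketch. It is not a kernel patch: the model_* types and
functions are hypothetical stand-ins, the 'running' flag is an assumed
way of representing what timer_exit() clears, and the pointer cast
assumes unsigned long is wide enough to hold a pointer, as the handler
above also does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct timer_list. */
struct model_timer {
    void (*function)(unsigned long);
    unsigned long data;
    int running;    /* assumed flag: set while the handler executes */
};

/* Hypothetical stand-in for struct task_struct. */
struct model_task {
    int woken;
    struct model_timer process_timeout_timer;   /* step 1: per-task timer */
};

/* Step 2: initialise the embedded timer when the task is set up. */
static void model_init_task(struct model_task *p)
{
    assert(p != NULL);
    p->woken = 0;
    p->process_timeout_timer.function = NULL;
    p->process_timeout_timer.data = 0;
    p->process_timeout_timer.running = 0;
}

/* Step 3: the handler wakes the task, then signals that it is done. */
static void model_process_timeout(unsigned long data)
{
    struct model_task *p = (struct model_task *) data;

    p->woken = 1;                           /* stands in for wake_up_process(p) */
    p->process_timeout_timer.running = 0;   /* stands in for timer_exit() */
}

/* Step 3: the sleeping side arms the task's own embedded timer. */
static void model_arm_timeout(struct model_task *p)
{
    assert(p != NULL);
    p->process_timeout_timer.function = model_process_timeout;
    p->process_timeout_timer.data = (unsigned long) p;
}

/* Stand-in for the timer core firing an armed timer. */
static void model_fire(struct model_timer *t)
{
    assert(t != NULL && t->function != NULL);
    t->running = 1;
    t->function(t->data);
}
```

After model_init_task(), model_arm_timeout() and model_fire() on the
embedded timer, the task is marked woken and the timer is no longer
marked running, so anything waiting on that flag could proceed.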



