Hi,
OK, I finally spent some time studying the timer code and am starting
to grasp it. I think I finally understand the bug. Andrea's timer->lock
patch will fix it for 2.4 and everybody says it can't happen in 2.6 so
ok.
On Thu, Jul 31, 2003 at 12:17:17AM +0200, Andrea Arcangeli wrote:
> So the best fix would be to nuke the run_all_timers thing from 2.4 too.
Yes.
>and to keep the timer->lock everywhere to make run_all_timers safe.
Or do that instead, yes.
> In short the stack traces I described today were all right but only for
> 2.4, and not for 2.6.
I see the bug in 2.4.
On Wed, Jul 30, 2003 at 12:31:23PM +0200, Ingo Molnar wrote:
>
> On Wed, 30 Jul 2003, Andrea Arcangeli wrote:
>
> >
> > cpu0 cpu1
> > ------------ --------------------
> >
> > do_setitimer
> > it_real_fn
> > del_timer_sync add_timer -> crash
>
> would you mind to elaborate the precise race? I cannot see how the above
> sequence could crash on current 2.6:
I don't know enough about how 2.6 works to say,
but here's more detail on what happened in 2.4:
cpu 0 cpu 3
------- ---------
previously, timer->base = cpu 3 base
(its task-struct->real_timer)
bh_action() {
__run_timers() { sys_setitimer() {
it_real_fn() // called as fn() del_timer_sync() {
add_timer() { spinlock (cpu3 base)
spinlock (cpu0 base)
timer->base = cpu0 base detach_timer (timer)
list_add(&timer->list)
timer->list.next = NULL;
and now timer is vectored in but can't be unlinked.
And so either of Andrea's fixes should fix this race.
--linas
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Thu Jul 31 2003 - 22:00:48 EST