Re: PATCH: Race in 2.6.0-test2 timer code

From: Andrea Arcangeli (andrea@suse.de)
Date: Wed Jul 30 2003 - 17:17:17 EST


On Thu, Jul 31, 2003 at 12:06:04AM +0200, Andrea Arcangeli wrote:
> practice, but still we must be missing something about this code.

ah, finally I see how the timer->lock can have made the kernel
stable again!

run_all_timers can still definitely run on x86 too if the local cpu
timer runs on top of an irq handler:

        if (in_interrupt())
                goto out_mark;
out_mark:
        mark_bh(TIMER_BH);

        init_bh(TIMER_BH, run_all_timers);

(ppc will still be an order of magnitude less stable than x86,
since ppc only ever calls run_all_timers, so you don't need two races to
trigger at the same time to crash)
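
To make the path in the excerpt above easier to follow, here is a minimal
userspace model of it. It is only a sketch: in_interrupt_on(),
run_timers_of(), local_timer_tick(), do_bh() and the ran_by[] bookkeeping
are hypothetical names made up for the illustration, and the body of the
non-nested path is an assumption, not the 2.4 source. The point it tries to
show is that once the local tick nests on top of an irq handler, the timers
are run later by run_all_timers, which walks every cpu's base and not only
the local one.

```c
/* Userspace model only: names mirror the excerpt, bodies are assumptions. */
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2

static int  irq_nesting[NR_CPUS];   /* > 0 models in_interrupt() on that cpu */
static bool timer_bh_pending;       /* models mark_bh(TIMER_BH) */
static int  ran_by[NR_CPUS];        /* cpu that last ran each base, -1 = nobody */

void reset_model(void)
{
    timer_bh_pending = false;
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        irq_nesting[cpu] = 0;
        ran_by[cpu] = -1;
    }
}

bool in_interrupt_on(int cpu)
{
    return irq_nesting[cpu] > 0;
}

/* Stand-in for running the timers queued on one cpu's base. */
void run_timers_of(int base_cpu, int running_cpu)
{
    ran_by[base_cpu] = running_cpu;
}

/* Local cpu timer tick: run the local base unless nested in an irq handler. */
void local_timer_tick(int cpu)
{
    if (in_interrupt_on(cpu))
        goto out_mark;
    run_timers_of(cpu, cpu);
    return;
out_mark:
    timer_bh_pending = true;        /* defer to the TIMER_BH handler */
}

/* TIMER_BH handler: touches every cpu's base, including remote ones. */
void run_all_timers(int running_cpu)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        run_timers_of(cpu, running_cpu);
}

/* Stand-in for bottom-half processing on one cpu. */
void do_bh(int cpu)
{
    if (timer_bh_pending) {
        timer_bh_pending = false;
        run_all_timers(cpu);
    }
}
```

In this model a tick taken with irq_nesting[0] set leaves ran_by[] untouched
and only sets timer_bh_pending; the following do_bh(0) then runs cpu 1's
base from cpu 0, which is the cross-cpu access that the timer->lock has to
cover.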

ok, so my current code is the right one, and it's not needed in 2.6 since
2.6 always runs the timers inside a softirq and never falls back to
run_all_timers.

So the best fix would be to nuke the run_all_timers thing from 2.4 too.

For now I'm only going to take the risk of adding the BUG_ON in mod_timer
and keep the timer->lock everywhere to make run_all_timers safe.

Now you should only make sure that your 2.4.21 gets stable too with the
fix I applied today (and please add the BUG_ON(old_base != timer->base)
in mod_timer too so I won't be the only tester of that code ;)
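
For anyone wondering what that check is meant to catch, here is a rough
single-threaded userspace sketch. Everything in it except the BUG_ON
condition itself is an assumption made for the illustration: the struct
layouts, the toy model_lock()/model_unlock() helpers, the nr_pending counter
and the place where the check sits are hypothetical, and the real mod_timer
is not shown in this mail. The idea is only that, with timer->lock held,
nobody else may move the timer to another base, so the base sampled at the
start must still match.

```c
/* Userspace model only: layout and lock order are assumptions, not the patch. */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#define BUG_ON(cond) \
    do { if (cond) { fprintf(stderr, "BUG: %s\n", #cond); abort(); } } while (0)

/* Toy lock: only tracks ownership so misuse trips BUG_ON. */
struct model_lock { int held; };

void model_lock(struct model_lock *l)   { BUG_ON(l->held);  l->held = 1; }
void model_unlock(struct model_lock *l) { BUG_ON(!l->held); l->held = 0; }

struct timer_base {
    struct model_lock lock;
    int nr_pending;                 /* timers currently queued on this base */
};

struct timer {
    struct model_lock lock;         /* models timer->lock */
    struct timer_base *base;        /* base the timer is queued on, NULL if idle */
    unsigned long expires;
};

/* Re-arm the timer on new_base; returns 1 if it was already pending. */
int mod_timer(struct timer *timer, struct timer_base *new_base,
              unsigned long expires)
{
    struct timer_base *old_base;
    int was_pending;

    model_lock(&timer->lock);
    old_base = timer->base;
    was_pending = (old_base != NULL);

    if (old_base) {
        model_lock(&old_base->lock);
        /* timer->lock is held, so the timer cannot have changed base. */
        BUG_ON(old_base != timer->base);
        old_base->nr_pending--;
        timer->base = NULL;
        model_unlock(&old_base->lock);
    }

    model_lock(&new_base->lock);
    timer->expires = expires;
    timer->base = new_base;
    new_base->nr_pending++;
    model_unlock(&new_base->lock);

    model_unlock(&timer->lock);
    return was_pending;
}
```

If some path moved the timer without taking timer->lock, the sampled
old_base and timer->base could disagree and the check would fire, which is
what makes it useful as a tripwire while the locking is being tested.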

In short the stack traces I described today were all right but only for
2.4, and not for 2.6.

Andrea