Re: [PATCH] Add support for deferrable timers (respun)

From: Venki Pallipadi
Date: Tue Mar 27 2007 - 17:56:50 EST


On Wed, Mar 28, 2007 at 01:11:45AM +0400, Oleg Nesterov wrote:
> On 03/27, Venki Pallipadi wrote:
> >
> > for (;;) {
> > - base = timer->base;
> > + tvec_base_t *prelock_base = timer->base;
> > + base = timer_get_base(timer);
> > if (likely(base != NULL)) {
> > spin_lock_irqsave(&base->lock, *flags);
> > - if (likely(base == timer->base))
> > + if (likely(prelock_base == timer->base))
> > return base;
>
> I don't think this is correct, at least in theory.
>
> Suppose that
>
> tvec_base_t *prelock_base = timer->base;
> base = timer_get_base(timer);
>
> are re-ordered (the second LOAD happens after the first one), and the timer
> changes its base in between. Now, we lock the old base, and return it because
> "prelock_base == timer->base" == true.
>

Great catch. Yes, this is a theoretical possibility, even though most compilers
would load timer->base only once, use that value for prelock_base, and mask
('and') it to get base. At least that is what I see on i386/gcc.

The incremental patch below eliminates this race by deriving base from
prelock_base, so timer->base is read only once before the lock is taken.

Index: new/kernel/timer.c
===================================================================
--- new.orig/kernel/timer.c 2007-03-26 15:19:35.000000000 -0800
+++ new/kernel/timer.c 2007-03-27 13:00:33.000000000 -0800
@@ -96,9 +96,9 @@
return tbase_get_deferrable(timer->base);
}

-static inline struct tvec_t_base_s *timer_get_base(struct timer_list *timer)
+static inline struct tvec_t_base_s *tbase_get_base(struct tvec_t_base_s *base)
{
- return ((struct tvec_t_base_s *)((unsigned long)(timer->base) &
+ return ((struct tvec_t_base_s *)((unsigned long)base &
~TBASE_DEFERRABLE_FLAG));
}

@@ -368,7 +368,7 @@

for (;;) {
tvec_base_t *prelock_base = timer->base;
- base = timer_get_base(timer);
+ base = tbase_get_base(prelock_base);
if (likely(base != NULL)) {
spin_lock_irqsave(&base->lock, *flags);
if (likely(prelock_base == timer->base))
@@ -592,7 +592,7 @@
* don't have to detach them individually.
*/
list_for_each_entry_safe(timer, tmp, &tv_list, entry) {
- BUG_ON(timer_get_base(timer) != base);
+ BUG_ON(tbase_get_base(timer->base) != base);
internal_add_timer(base, timer);
}
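
For illustration only, here is a minimal user-space sketch of the idea behind
the patch: take one snapshot of timer->base, mask the flag bit out of that
snapshot, and compare the same snapshot after locking. The struct layouts, the
lock_held field, tbase_set_deferrable() and lock_timer_base_sketch() are
assumptions made up for this example and are not the kernel's definitions.
The flag value of 0x1 is also assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types touched by the patch. */
struct tvec_base {
	int lock_held;			/* stands in for the real spinlock */
};

struct timer {
	struct tvec_base *base;		/* low bit carries the deferrable flag */
};

#define TBASE_DEFERRABLE_FLAG ((uintptr_t)0x1)	/* assumed value */

/* Extract the flag bit from a (possibly tagged) base pointer. */
static unsigned int tbase_get_deferrable(struct tvec_base *base)
{
	return (unsigned int)((uintptr_t)base & TBASE_DEFERRABLE_FLAG);
}

/* Strip the flag bit, leaving the real base pointer. */
static struct tvec_base *tbase_get_base(struct tvec_base *base)
{
	return (struct tvec_base *)((uintptr_t)base & ~TBASE_DEFERRABLE_FLAG);
}

/* Hypothetical helper: tag a base pointer as deferrable. */
static struct tvec_base *tbase_set_deferrable(struct tvec_base *base)
{
	return (struct tvec_base *)((uintptr_t)base | TBASE_DEFERRABLE_FLAG);
}

/*
 * Sketch of the locking loop after the patch. base is computed from the
 * prelock_base snapshot rather than from a second read of timer->base,
 * so the pointer that gets locked and the pointer that gets compared
 * always come from the same value.
 * Note: like the loop in the patch, this retries while base is NULL.
 */
static struct tvec_base *lock_timer_base_sketch(struct timer *timer)
{
	for (;;) {
		struct tvec_base *prelock_base = timer->base;
		struct tvec_base *base = tbase_get_base(prelock_base);

		if (base != NULL) {
			base->lock_held = 1;	/* stands in for spin_lock_irqsave() */
			if (prelock_base == timer->base)
				return base;
			/* The timer migrated; drop the lock and retry. */
			base->lock_held = 0;	/* stands in for spin_unlock_irqrestore() */
		}
	}
}
```

With the old timer_get_base(timer) helper, the masked pointer came from a
separate read of timer->base, which is what allowed the two values to
disagree in the scenario Oleg described.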
