Re: RT and Cascade interrupts

From: Oleg Nesterov
Date: Wed Jun 01 2005 - 09:23:41 EST


del_timer_sync() is known to be racy. I now think it also has a
(theoretical) problem with singleshot timers.

Suppose we have an rpc timer running on CPU_0, and it is
preempted *after* it has cleared the RPC_TASK_HAS_TIMER bit.

Then __rpc_execute's main loop on CPU_1 calls rpc_delete_timer(),
which returns without doing del_timer_sync() because the bit is
already clear.

Now it calls ->tk_action()->rpc_sleep_on()->__rpc_add_timer(),
so the timer is pending on CPU_1.

Preemption comes, and the task is rescheduled on another (not CPU_1)
processor (this step is not strictly necessary).

On the next loop iteration, __rpc_execute calls rpc_delete_timer() again,
and this time it calls del_timer_sync().

del_timer_sync:

	// __run_timers starts on CPU_1, picks the rpc timer, sets
	// timer->base = 0

	ret += del_timer(timer);	// returns 0, timer is not pending.

	for_each_online_cpu() {
		if (timer running on that cpu) {
			// finds the timer still running on CPU_0,
			// waits for __run_timers on CPU_0 to change
			// ->running_timer,
			// the timer on CPU_0 completes.
			break;
		}
	}

	if (timer_pending())	// NO
		goto del_again;

	// The timer is still running on CPU_1
	return;
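
The interleaving above can be replayed deterministically in a small
userspace model. This is only a sketch, not kernel code: every name
prefixed with model_ and the replay_race() helper are hypothetical
stand-ins, "pending" stands in for timer->base being non-NULL, and the
busy-wait in the real del_timer_sync() is modelled by simply letting the
other CPU's handler finish.

```c
#include <assert.h>
#include <stddef.h>

#define NCPU 2

/* Hypothetical stand-in for struct timer_list; not the kernel type. */
struct model_timer {
	int pending;	/* 1 while queued (timer->base != NULL in the trace) */
};

/* Which timer's handler each CPU is currently executing, if any. */
static struct model_timer *running_timer[NCPU];

/* Models del_timer(): dequeue if pending, report whether it was pending. */
static int model_del_timer(struct model_timer *t)
{
	int was_pending = t->pending;

	t->pending = 0;
	return was_pending;
}

/* Models __run_timers() picking the timer: no longer pending, now running. */
static void model_run_timers_start(struct model_timer *t, int cpu)
{
	t->pending = 0;
	running_timer[cpu] = t;
}

/* Models the handler on a CPU returning. */
static void model_run_timers_finish(int cpu)
{
	running_timer[cpu] = NULL;
}

/* Models the del_timer_sync() loop shown in the trace. */
static int model_del_timer_sync(struct model_timer *t)
{
	int ret = 0;
	int cpu;

del_again:
	ret += model_del_timer(t);
	for (cpu = 0; cpu < NCPU; cpu++) {
		if (running_timer[cpu] == t) {
			/* Real code spins here; the model lets that CPU finish. */
			model_run_timers_finish(cpu);
			break;	/* stops at the first CPU found */
		}
	}
	if (t->pending)
		goto del_again;
	return ret;
}

/*
 * Replays the scenario. Returns 1 if the handler is still running on
 * CPU_1 after model_del_timer_sync() has returned.
 */
static int replay_race(void)
{
	struct model_timer t = { 0 };
	int still_running;
	int ret;

	running_timer[0] = &t;	/* old handler preempted on CPU_0 */
	running_timer[1] = NULL;
	t.pending = 1;		/* re-armed via __rpc_add_timer on CPU_1 */

	/* CPU_1 picks the timer just before del_timer() runs. */
	model_run_timers_start(&t, 1);

	ret = model_del_timer_sync(&t);
	assert(ret == 0);			/* nothing was pending */
	assert(running_timer[0] == NULL);	/* CPU_0 was waited for */

	still_running = (running_timer[1] == &t);
	running_timer[1] = NULL;		/* do not keep a stale pointer */
	return still_running;
}
```

In this model replay_race() returns 1: the loop breaks after waiting for
CPU_0 and never looks at CPU_1, which is the window described above.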

All of this is very unlikely, of course, but it would be nice to verify
that kernel/timer.c is not the source of the problem.

John, if it is easy to reproduce the problem, could you please retest
with this patch?

Oleg.

--- 2.6.12-rc5/net/sunrpc/sched.c~ Wed Jun 1 17:49:57 2005
+++ 2.6.12-rc5/net/sunrpc/sched.c Wed Jun 1 18:00:31 2005
@@ -137,8 +137,12 @@ rpc_delete_timer(struct rpc_task *task)
 {
 	if (RPC_IS_QUEUED(task))
 		return;
-	if (test_and_clear_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate)) {
-		del_singleshot_timer_sync(&task->tk_timer);
+	if (test_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate)) {
+		if (del_singleshot_timer_sync(&task->tk_timer)) {
+			BUG_ON(!test_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate));
+			clear_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate);
+		} else
+			BUG_ON(test_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate));
 		dprintk("RPC: %4d deleting timer\n", task->tk_pid);
 	}
 }
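
One possible reading of the checks in this patch, sketched as a
single-threaded userspace model. It assumes the timer handler clears
RPC_TASK_HAS_TIMER itself before it returns, as the scenario above
implies. All names here (MODEL_HAS_TIMER, struct model_task, the model_
functions) are hypothetical, and a return value of -1 marks the places
where the patched code would hit BUG_ON.

```c
#include <assert.h>

#define MODEL_HAS_TIMER 0x1u	/* stands in for RPC_TASK_HAS_TIMER */

/* Hypothetical stand-in for the relevant parts of struct rpc_task. */
struct model_task {
	unsigned int runstate;	/* stands in for task->tk_runstate */
	int timer_pending;	/* 1 while the timer is still queued */
	int handler_running;	/* 1 while a handler the sync can see is running */
};

/* Models the handler finishing: assumed to clear the flag itself. */
static void model_handler_finish(struct model_task *task)
{
	task->handler_running = 0;
	task->runstate &= ~MODEL_HAS_TIMER;
}

/*
 * Models del_singleshot_timer_sync(): returns 1 if it dequeued a pending
 * timer, otherwise waits for a visible handler to finish and returns 0.
 */
static int model_del_sync(struct model_task *task)
{
	if (task->timer_pending) {
		task->timer_pending = 0;
		return 1;
	}
	if (task->handler_running)
		model_handler_finish(task);
	return 0;
}

/* Mirrors the patched rpc_delete_timer(); -1 where BUG_ON would fire. */
static int model_rpc_delete_timer(struct model_task *task)
{
	if (!(task->runstate & MODEL_HAS_TIMER))
		return 0;
	if (model_del_sync(task)) {
		/* Timer never ran, so the flag must still be set. */
		if (!(task->runstate & MODEL_HAS_TIMER))
			return -1;
		task->runstate &= ~MODEL_HAS_TIMER;
	} else if (task->runstate & MODEL_HAS_TIMER) {
		/* Sync returned, yet no handler has cleared the flag. */
		return -1;
	}
	return 0;
}
```

A task with a pending timer, or with a handler that the sync waits for,
comes back with 0 and the flag clear. A task whose flag is set while the
timer is neither pending nor visibly running, which is the state the
suspected race would leave behind, comes back with -1.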