Re: [patch 3/21] x86, bts: wait until traced task has been scheduled out

From: Oleg Nesterov
Date: Wed Apr 01 2009 - 15:08:44 EST


On 04/01, Metzger, Markus T wrote:
>
> >-----Original Message-----
> >From: Oleg Nesterov [mailto:oleg@xxxxxxxxxx]
> >Sent: Wednesday, April 01, 2009 2:17 AM
> >To: Metzger, Markus T
>
> >> +static void wait_to_unschedule(struct task_struct *task)
> >> +{
> >> +	unsigned long nvcsw;
> >> +	unsigned long nivcsw;
> >> +
> >> +	if (!task)
> >> +		return;
> >> +
> >> +	if (task == current)
> >> +		return;
> >> +
> >> +	nvcsw = task->nvcsw;
> >> +	nivcsw = task->nivcsw;
> >> +	for (;;) {
> >> +		if (!task_is_running(task))
> >> +			break;
> >> +		/*
> >> +		 * The switch count is incremented before the actual
> >> +		 * context switch. We thus wait for two switches to be
> >> +		 * sure at least one completed.
> >> +		 */
> >> +		if ((task->nvcsw - nvcsw) > 1)
> >> +			break;
> >> +		if ((task->nivcsw - nivcsw) > 1)
> >> +			break;
> >> +
> >> +		schedule();
> >
> >schedule() is a nop here. We can wait unpredictably long...
>
> Hmmm, as far as I understand the code, rt-workqueues use a higher sched_class
> and thus cannot be preempted by normal threads. Non-rt workqueues
> use the fair_sched_class. And schedule_work() uses a non-rt workqueue.

I was unclear, sorry.

I meant that, in this case,

	while (!CONDITION)
		schedule();

is no better than

	while (!CONDITION)
		; /* do nothing */

(OK, schedule() is better without CONFIG_PREEMPT, but this doesn't matter).
wait_to_unschedule() just spins waiting for ->nvcsw/->nivcsw to change, which
is not optimal.
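
To make the polling-vs-sleeping point concrete, here is a userspace analogy
using pthreads. This is only a sketch and none of it is kernel API: struct
event and the event_*() helpers are made-up names, sched_yield() stands in
for schedule(), and the condition variable stands in for a proper sleeping
wait. The polling waiter stays runnable and keeps burning CPU until the flag
flips; the blocking waiter sleeps until it is woken.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical stand-in for "the condition we are waiting for". */
struct event {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

void event_init(struct event *ev)
{
	pthread_mutex_init(&ev->lock, NULL);
	pthread_cond_init(&ev->cond, NULL);
	ev->done = false;
}

/* Signalling side: set the flag and wake any sleeping waiters. */
void event_complete(struct event *ev)
{
	pthread_mutex_lock(&ev->lock);
	ev->done = true;
	pthread_cond_broadcast(&ev->cond);
	pthread_mutex_unlock(&ev->lock);
}

/*
 * Polling wait, analogous to "while (!CONDITION) schedule();".
 * The caller never sleeps; it only offers the CPU to others.
 * Returns how many times it had to poll before the flag was set.
 */
unsigned long event_wait_polling(struct event *ev)
{
	unsigned long polls = 0;

	for (;;) {
		bool done;

		pthread_mutex_lock(&ev->lock);
		done = ev->done;
		pthread_mutex_unlock(&ev->lock);
		if (done)
			break;

		polls++;
		sched_yield();
	}
	return polls;
}

/* Blocking wait: the caller sleeps until event_complete() wakes it. */
void event_wait_blocking(struct event *ev)
{
	pthread_mutex_lock(&ev->lock);
	while (!ev->done)
		pthread_cond_wait(&ev->cond, &ev->lock);
	assert(ev->done);
	pthread_mutex_unlock(&ev->lock);
}
```

If another thread calls event_complete() late, event_wait_polling() returns
a large poll count while event_wait_blocking() simply sleeps in the meantime.
The kernel primitives differ, but the shape of the problem is the same.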

The other problem is that we can wait unpredictably long. You say:

> In practice, task is ptraced. It is either stopped or exiting.
> I don't expect to loop very often.

No. The task _was_ ptraced when we called (say) ptrace_detach(). But by the
time work->func() runs, the tracee is no longer traced and is running (not
necessarily, of course; the tracer _can_ leave it in TASK_STOPPED).

Now, again, suppose that this task does "for (;;) ;" in user-space.
If the CPU is otherwise "free", the task can spin "forever" without being
rescheduled, so its switch counters never advance. Yes, sure, this case is
not likely in practice, but still.

Oleg.
