Re: [patch 3/21] x86, bts: wait until traced task has been scheduled out

From: Ingo Molnar
Date: Wed Apr 01 2009 - 07:42:20 EST



* Oleg Nesterov <oleg@xxxxxxxxxx> wrote:

> On 03/31, Markus Metzger wrote:
> >
> > +static void wait_to_unschedule(struct task_struct *task)
> > +{
> > +	unsigned long nvcsw;
> > +	unsigned long nivcsw;
> > +
> > +	if (!task)
> > +		return;
> > +
> > +	if (task == current)
> > +		return;
> > +
> > +	nvcsw = task->nvcsw;
> > +	nivcsw = task->nivcsw;
> > +	for (;;) {
> > +		if (!task_is_running(task))
> > +			break;
> > +		/*
> > +		 * The switch count is incremented before the actual
> > +		 * context switch. We thus wait for two switches to be
> > +		 * sure at least one completed.
> > +		 */
> > +		if ((task->nvcsw - nvcsw) > 1)
> > +			break;
> > +		if ((task->nivcsw - nivcsw) > 1)
> > +			break;
> > +
> > +		schedule();
>
> schedule() is a nop here. We can wait unpredictably long...
>
> Ingo, do you have any ideas to improve this helper?

Hm, there's a similar-looking existing facility:
wait_task_inactive(). Have I missed some subtle detail that makes it
inappropriate for use here?

> Not that I really like it, but how about
>
> int force_unschedule(struct task_struct *p)
> {
> 	struct rq *rq;
> 	unsigned long flags;
> 	int running;
>
> 	rq = task_rq_lock(p, &flags);
> 	running = task_running(rq, p);
> 	task_rq_unlock(rq, &flags);
>
> 	if (running)
> 		wake_up_process(rq->migration_thread);
>
> 	return running;
> }
>
> which should be used instead of task_is_running() ?

Yes - wait_task_inactive() should be switched to a scheme like that
- it would fix bugs like:

53da1d9: fix ptrace slowness

in a cleaner way.

> We can even do something like
>
> void wait_to_unschedule(struct task_struct *task)
> {
> 	struct migration_req req;
> 	struct rq *rq;
> 	unsigned long flags;
> 	int running;
>
> 	rq = task_rq_lock(task, &flags);
> 	running = task_running(rq, task);
> 	if (running) {
> 		// make sure __migrate_task() will do nothing
> 		req.dest_cpu = NR_CPUS + 1;
> 		init_completion(&req.done);
> 		list_add(&req.list, &rq->migration_queue);
> 	}
> 	task_rq_unlock(rq, &flags);
>
> 	if (running) {
> 		wake_up_process(rq->migration_thread);
> 		wait_for_completion(&req.done);
> 	}
> }
>
> This way we don't poll, and we need only one helper.

Looks even better. The migration thread would run complete(), right?
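
As a rough illustration of that handshake, here is a userspace toy model,
not kernel code. Every name in it (model_req, model_rq, model_queue_req,
model_migration_thread) is hypothetical, and it only assumes what the
sketch above implies: a request with an out-of-range dest_cpu causes no
migration, yet is still marked done by the thread that drains the queue.

```c
/* Userspace toy model, NOT kernel code; every name below is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4

struct model_req {
	int dest_cpu;             /* >= MODEL_NR_CPUS means "do not migrate" */
	bool done;                /* stand-in for struct completion */
	struct model_req *next;   /* stand-in for the list linkage */
};

struct model_rq {
	struct model_req *queue;  /* pending requests, newest first */
	int migrations;           /* real migrations performed so far */
};

/* Models init_completion() plus list_add() under the runqueue lock. */
static void model_queue_req(struct model_rq *rq, struct model_req *req)
{
	assert(rq != NULL && req != NULL);
	req->done = false;
	req->next = rq->queue;
	rq->queue = req;
}

/* One pass of the modelled migration thread; returns requests completed. */
static int model_migration_thread(struct model_rq *rq)
{
	int completed = 0;

	while (rq->queue) {
		struct model_req *req = rq->queue;

		rq->queue = req->next;
		if (req->dest_cpu >= 0 && req->dest_cpu < MODEL_NR_CPUS)
			rq->migrations++;   /* a valid request would move the task */
		req->done = true;           /* models complete(&req->done) */
		completed++;
	}
	return completed;
}
```

In this model the waiter would queue a request with dest_cpu set to
MODEL_NR_CPUS + 1, let the thread run once, and then find req.done set
while rq.migrations is still zero.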

A detail: I suspect this needs to be in a while() loop, for the case
where the victim task raced with us and went to another CPU before we
kicked it off via the migration thread.
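
The shape of such a retry loop, again as a userspace toy model rather
than kernel code: all names (model_task, model_kick_cpu,
model_race_window, model_wait_to_unschedule) are hypothetical, and the
"race" is simulated by moving the task to the next CPU a fixed number of
times between sampling its CPU and kicking that CPU. Whether the real
helper needs the loop depends on locking details this model ignores.

```c
/* Userspace toy model, NOT kernel code; every name below is hypothetical. */
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4

struct model_task {
	int cpu;        /* CPU the task is currently on */
	bool running;   /* true while the task occupies that CPU */
	int hops_left;  /* how many times it will still migrate under us */
};

/*
 * Stand-in for "kick this CPU's migration thread and wait for it":
 * the kick only deschedules the task if it is still on the kicked CPU.
 */
static bool model_kick_cpu(struct model_task *t, int cpu)
{
	if (t->cpu != cpu)
		return false;   /* raced: the task left before the kick landed */
	t->running = false;
	return true;
}

/* Simulates the window between sampling the CPU and kicking it. */
static void model_race_window(struct model_task *t)
{
	if (t->hops_left > 0) {
		t->cpu = (t->cpu + 1) % MODEL_NR_CPUS;
		t->hops_left--;
	}
}

/* Returns the number of kicks issued before the task was seen off-CPU. */
static int model_wait_to_unschedule(struct model_task *t)
{
	int kicks = 0;

	while (t->running) {
		int cpu = t->cpu;       /* sampled, as if under the rq lock */

		model_race_window(t);   /* lock dropped: the task may move */
		kicks++;
		if (model_kick_cpu(t, cpu))
			break;          /* kick hit the right CPU, we are done */
	}
	assert(!t->running);
	return kicks;
}
```

A task that hops twice before settling needs three kicks in this model;
without the loop, the first kick would land on a CPU the task had
already left.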

This looks very useful to me. It could also be tested easily: revert
53da1d9 and you should see:

time strace dd if=/dev/zero of=/dev/null bs=1024 count=1000000

performance plummet on an SMP box. Then, with your fix, it should go
back up to near full speed.

Ingo