Re: 2.6.14-rt22 (and mainline) excessive latency

From: Paul E. McKenney
Date: Wed Dec 21 2005 - 08:36:12 EST


On Tue, Dec 20, 2005 at 10:32:48PM -0500, Lee Revell wrote:
> On Tue, 2005-12-20 at 17:47 -0800, Paul E. McKenney wrote:
> > On Tue, Dec 20, 2005 at 05:24:42AM +0100, Ingo Molnar wrote:
> > >
> > > * Lee Revell <rlrevell@xxxxxxxxxxx> wrote:
> > >
> > > > I captured this 3+ ms latency trace when killing a process with a few
> > > > thousand threads. Can a cond_resched be added to this code path?
> > >
> > > > bash-17992 0.n.1 29us : eligible_child (do_wait)
> > > >
> > > > [ 3000+ of these deleted ]
> > > >
> > > > bash-17992 0.n.1 3296us : eligible_child (do_wait)
> > >
> > > Atomicity of signal delivery is pretty much a must, so i'm not sure this
> > > particular latency can be fixed, short of running PREEMPT_RT. Paul E.
> > > McKenney is doing some excellent stuff by RCU-ifying the task lookup and
> > > signal code, but i'm not sure whether it could cover do_wait().
> >
> > Took a quick break from repeatedly shooting myself in the foot with
> > RCU read-side priority boosting (still have a few toes left) to take
> > a quick look at this. The TASK_TRACED and TASK_STOPPED cases seem
> > non-trivial, and I am concerned about races with exit.
> >
> > Any thoughts on whether the latency is due to contention on the
> > tasklist lock vs. the "goto repeat" in do_wait()?
>
> It's a UP system so I'd be surprised if there were any contention.

Couldn't there be contention due to preemption of someone holding
the tasklist lock?
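
To make the question concrete, here is a minimal sketch in plain userspace
C, not kernel code. It assumes a lock whose holder can be preempted (as
with the sleeping locks used under PREEMPT_RT) and models one CPU tick by
tick. The names simulate(), struct sim_result, and the LOW/HIGH tasks are
hypothetical and exist only for illustration.

```c
#include <assert.h>

/* Hypothetical model: one CPU, two tasks, one preemptible lock. */
enum { NOBODY = -1, LOW = 0 };

struct sim_result {
	int high_wait_ticks;   /* ticks HIGH spent blocked on the lock */
	int high_acquired_at;  /* tick at which HIGH got the lock */
};

/*
 * LOW takes the lock at tick 0 and needs low_hold_ticks of CPU time
 * before releasing it.  HIGH wakes at high_wakeup_tick, preempts LOW,
 * and immediately tries to take the same lock.
 */
static struct sim_result simulate(int low_hold_ticks, int high_wakeup_tick)
{
	struct sim_result r = { 0, -1 };
	int lock_owner = LOW;
	int low_remaining = low_hold_ticks;
	int high_blocked = 0;

	assert(low_hold_ticks >= 0 && high_wakeup_tick >= 0);

	for (int tick = 0; ; tick++) {
		int high_runnable = tick >= high_wakeup_tick && !high_blocked;

		if (high_runnable) {
			if (lock_owner == LOW && low_remaining > 0) {
				/* Holder was preempted: contention on one CPU. */
				high_blocked = 1;
			} else {
				r.high_acquired_at = tick;
				return r;
			}
		}

		/* LOW only gets the CPU while HIGH is not runnable. */
		if (low_remaining > 0) {
			low_remaining--;
			if (high_blocked)
				r.high_wait_ticks++;
			if (low_remaining == 0) {
				lock_owner = NOBODY;  /* LOW releases the lock */
				high_blocked = 0;     /* HIGH becomes runnable */
			}
		}
	}
}
```

In this model, simulate(5, 2) has HIGH wake while LOW still holds the
lock, so HIGH waits for the rest of LOW's critical section even though
there is only one CPU. With simulate(5, 7) LOW has already released the
lock and HIGH does not wait at all.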

Thanx, Paul