Re: [ltt-dev] [PATCH] Poll : introduce poll_wait_exclusive() new function

From: Mathieu Desnoyers
Date: Wed Nov 26 2008 - 06:15:28 EST


* Davide Libenzi (davidel@xxxxxxxxxxxxxxx) wrote:
> On Tue, 25 Nov 2008, KOSAKI Motohiro wrote:
>
> >
> > patch against: tip/tracing/marker
> >
> > ==========
> > Currently, the behavior of wake_up() depends on which function was
> > used to add the task to the wait queue.
> >
> >
> >                               wake_up()          wake_up_all()
> > ---------------------------------------------------------------
> > add_wait_queue()              wake up all        wake up all
> > add_wait_queue_exclusive()    wake up one task   wake up all
> >
> >
> > Unfortunately, poll_wait() always uses add_wait_queue().
> > This means there is no way to wake up only one of the polling
> > processes: wake_up() wakes up all sleeping processes, not just one.
> >
> >
> > Mathieu Desnoyers explained that this causes the following problem
> > for LTTng:
> >
> > In LTTng, all lttd readers are polling all the available debugfs files
> > for data. This is principally because the number of reader threads is
> > user-defined and there are typical workloads where a single CPU is
> > producing most of the tracing data and all other CPUs are idle,
> > available to consume data. It therefore makes sense not to tie those
> > threads to specific buffers. However, when the number of threads grows,
> > we face a "thundering herd" problem where many threads can be woken up
> > and put back to sleep, leaving only a single thread doing useful work.
>
> Why do you need to have so many threads banging a single device/file?
> Have one (or some other very small number of) puller thread(s) that
> activate the other processing threads with chunks of pulled data. That
> way there's no need for a new wakeup abstraction.
>
>
>
> - Davide

One of the key design rules of LTTng is not to depend on such
system-wide data structures or entities (e.g. a single manager thread).
Everything is per-CPU, and it scales very well.

I wonder how badly the approach you propose can scale on large NUMA
systems, where having to synchronize everything through a single thread
might become an important point of contention, just due to the cacheline
bouncing and extra scheduler activity involved.
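
To make the wake-up semantics from the quoted table concrete, here is a
minimal userspace model of them. It is only a sketch, not kernel code:
the struct layouts, the fixed-size array, and every model_* name as well
as count_woken() are made up for illustration. It assumes, as the wait
queue code does, that non-exclusive waiters are queued at the head,
exclusive waiters at the tail, and that the wake-up walk stops once the
requested number of exclusive waiters has been woken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WAITERS 16          /* arbitrary limit for this model */

struct waiter {
    bool exclusive;             /* stands in for WQ_FLAG_EXCLUSIVE */
    bool awake;
};

struct wait_queue {
    struct waiter *entries[MAX_WAITERS];
    size_t count;
};

/* Model of add_wait_queue(): non-exclusive, inserted at the head. */
static void model_add_wait_queue(struct wait_queue *q, struct waiter *w)
{
    assert(q->count < MAX_WAITERS);
    w->exclusive = false;
    w->awake = false;
    for (size_t i = q->count; i > 0; i--)
        q->entries[i] = q->entries[i - 1];
    q->entries[0] = w;
    q->count++;
}

/* Model of add_wait_queue_exclusive(): exclusive, appended at the tail. */
static void model_add_wait_queue_exclusive(struct wait_queue *q,
                                           struct waiter *w)
{
    assert(q->count < MAX_WAITERS);
    w->exclusive = true;
    w->awake = false;
    q->entries[q->count++] = w;
}

/*
 * Walk the queue from the head and wake every waiter, stopping after
 * nr_exclusive exclusive waiters have been woken. Passing 0 means
 * "no limit": the counter goes negative and never hits zero.
 * Returns the number of waiters woken.
 */
static int model_wake_up_common(struct wait_queue *q, int nr_exclusive)
{
    int woken = 0;

    for (size_t i = 0; i < q->count; i++) {
        struct waiter *w = q->entries[i];

        w->awake = true;
        woken++;
        if (w->exclusive && --nr_exclusive == 0)
            break;
    }
    return woken;
}

static int model_wake_up(struct wait_queue *q)
{
    return model_wake_up_common(q, 1);
}

static int model_wake_up_all(struct wait_queue *q)
{
    return model_wake_up_common(q, 0);
}

/* Queue nr_waiters identical waiters, wake once, report how many woke. */
int count_woken(int nr_waiters, bool exclusive, bool wake_all)
{
    struct waiter waiters[MAX_WAITERS];
    struct wait_queue q = { .count = 0 };

    assert(nr_waiters >= 0 && nr_waiters <= MAX_WAITERS);
    for (int i = 0; i < nr_waiters; i++) {
        if (exclusive)
            model_add_wait_queue_exclusive(&q, &waiters[i]);
        else
            model_add_wait_queue(&q, &waiters[i]);
    }
    return wake_all ? model_wake_up_all(&q) : model_wake_up(&q);
}
```

With eight pollers, count_woken(8, false, false) returns 8, which is the
thundering herd described above, while count_woken(8, true, false)
returns 1. Either way, the wake_up_all() path wakes all eight.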

Mathieu

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68