Re: [patch 00/43] ktimer reworked

From: Ingo Molnar
Date: Thu Dec 01 2005 - 16:19:10 EST



* Andrew Morton <akpm@xxxxxxxx> wrote:

> The whole concept of separating "timers" from "timeouts" seems a step
> backward to me. A large one. Why was it done, and can it be undone?

i had a very similar opinion when i first talked to Thomas about HRES
timers a couple of months ago. I told him that the only sane way to add
HRES timers to the -rt tree is to integrate them into the existing timer
wheel, and to avoid a duality of APIs. I told him that adding a separate
HRES implementation is pretty much a 'non-starter'.

then many months of experimentation followed. We (well, Thomas mostly)
patiently tried a sub-jiffy method, a split-lists method, all sorts of
ways to merge high-res timers into the timer wheel. We got HRES timers
to work in every such design, but it looked way too ugly and had bad
performance and latencies - and we wanted upstream integration.

so after many months we realized that the core issue is that the
requirements of 'timers' are unmixable with the requirements of
'timeouts'. See a more detailed analysis at:

http://lwn.net/Articles/152436/

i'll try to sum it up again very briefly: the timer wheel is a very
well-optimized data structure geared towards 'timeout' type of use. But
it is at the very edge of its 'feature reach', and we found no workable
way to extend it into the directions we wanted to go. The moment we
tried to extend it in one direction (e.g. to increase HZ to 1000000 to
get 1 usec resolution), it started creaking in some other spots.
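
[ to put rough numbers on the HZ=1000000 case: which spots creaked is
not spelled out in this mail, so the sketch below only illustrates two
plausible pressure points, and that choice is an assumption. The helper
names are made up for illustration, they are not kernel functions. ]

```c
#include <assert.h>
#include <stdint.h>

/* Nanoseconds between two timer ticks for a given HZ (ticks per second). */
static uint64_t tick_period_ns(uint64_t hz)
{
	assert(hz != 0);
	return 1000000000ULL / hz;
}

/* Whole seconds until a 32-bit tick counter wraps at the given HZ. */
static uint64_t wrap_seconds_32bit(uint64_t hz)
{
	assert(hz != 0);
	return (1ULL << 32) / hz;
}
```

with HZ=1000000 the tick period is 1000 nsec, i.e. a million tick
interrupts per second, and a 32-bit tick counter wraps after about 4294
seconds (roughly 71 minutes). With HZ=1000 the same counter lasts
4294967 seconds, about 49 days.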

the only clean solution we found was to totally separate them, and to
use the natural data structures for both of them: to keep the highly
scalable timer wheel on the timeout side, and to use the slower but more
flexible [and deterministic] timer trees on the timer side. [ktimers are
still very fast - but they cannot possibly be as fast as the single
list_add() of add_timer()!]. The two usage scenarios (timeouts and
timers) do not care about each other.
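
[ a toy sketch of the cost difference, under stated assumptions: this is
not the kernel code. All names (toy_timer, toy_wheel, wheel_add,
sorted_add, next_expiry) and the slot count are hypothetical, the wheel
has a single level, and a sorted list stands in for the timer tree just
to keep the example short. ]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WHEEL_SLOTS 256		/* hypothetical size: one slot per tick */

struct toy_timer {
	uint64_t expires;	/* ticks on the wheel, nsec in the sorted queue */
	struct toy_timer *next;
};

struct toy_wheel {
	struct toy_timer *slot[WHEEL_SLOTS];
};

/*
 * Timeout side: constant-time insert. The expiry tick picks a bucket and
 * the timer is pushed onto it; nothing is kept in order inside a bucket.
 */
static void wheel_add(struct toy_wheel *w, struct toy_timer *t)
{
	struct toy_timer **head = &w->slot[t->expires % WHEEL_SLOTS];

	t->next = *head;
	*head = t;
}

/*
 * Timer side: ordered insert. The queue stays sorted by expiry, so the
 * earliest timer is always at the head, at the price of a search on
 * every insert (a balanced tree would make that search O(log n)).
 */
static void sorted_add(struct toy_timer **head, struct toy_timer *t)
{
	struct toy_timer **pos = head;

	while (*pos && (*pos)->expires <= t->expires)
		pos = &(*pos)->next;
	t->next = *pos;
	*pos = t;
}

/* Earliest expiry in the sorted queue; the queue must not be empty. */
static uint64_t next_expiry(const struct toy_timer *head)
{
	assert(head != NULL);
	return head->expires;
}
```

the wheel insert never looks at other timers, which suits timeouts that
are mostly removed before they fire. The ordered queue always knows its
next expiry at full resolution, which is what a precise timer needs.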

we could merge the two by driving 'timeouts' via ktimers too - but that
would add some unavoidable overhead to things like the TCP stack. But
ktimers cannot be merged into timeouts, that's for sure.

Ingo