Re: [patch V2 00/20] timer: Refactor the timer wheel

From: Thomas Gleixner
Date: Mon Jun 20 2016 - 09:59:01 EST


On Fri, 17 Jun 2016, Eric Dumazet wrote:
> To avoid increasing probability of such events we would need to have
> at least 4 ms difference between the RTO timer and delack timer.
>
> Meaning we have to increase both of them and increase P99 latencies of
> RPC workloads.
>
> Maybe a switch to hrtimer would be less risky.
> But I do not know yet if it is doable without big performance penalty.

Switching to hrtimers will be a big performance issue. So we have the
following choices:

1) Increase the wheel size for HZ=1000. Doable, but an utter waste of space
and obviously more pointless work when collecting expired timers.

2) Cut off at 37hrs for HZ=1000. We could make this configurable as a 1000HZ
option so datacenter folks can use this and people who don't care and want
better batching for power can use the 4ms thingy.

3) Split the wheel granularities. That would leave the first wheel with tick
granularity and the next 3 with 12.5% worst case and then for the further
out timers we'd switch to 25%.
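
For reference, a rough sketch of where the 37hrs and 12.5% numbers can come
from. It assumes a wheel of 8 levels with 64 buckets each, where every level
is 8 times coarser than the previous one. Those parameters and all the names
below are illustrative assumptions, not taken from this mail, and the real
code computes the level boundaries slightly differently.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed wheel geometry, for illustration only. */
#define WHEEL_LVL_BITS   6                        /* 64 buckets per level */
#define WHEEL_LVL_SIZE   (1ULL << WHEEL_LVL_BITS)
#define WHEEL_CLK_SHIFT  3                        /* each level is 8x coarser */
#define WHEEL_DEPTH      8                        /* number of levels */

/* Granularity of a level in ticks: 1, 8, 64, ... */
static uint64_t lvl_granularity(unsigned int lvl)
{
	assert(lvl < WHEEL_DEPTH);
	return 1ULL << (lvl * WHEEL_CLK_SHIFT);
}

/* Smallest timeout (in ticks) that gets queued into a level. */
static uint64_t lvl_start(unsigned int lvl)
{
	assert(lvl < WHEEL_DEPTH);
	if (lvl == 0)
		return 0;
	return WHEEL_LVL_SIZE << ((lvl - 1) * WHEEL_CLK_SHIFT);
}

/* First timeout (in ticks) which no longer fits into the wheel. */
static uint64_t wheel_capacity_ticks(void)
{
	return WHEEL_LVL_SIZE << ((WHEEL_DEPTH - 1) * WHEEL_CLK_SHIFT);
}

/* Same cutoff expressed in hours for a given tick frequency. */
static double wheel_capacity_hours(unsigned int hz)
{
	return (double)wheel_capacity_ticks() / hz / 3600.0;
}

/*
 * Worst case granularity error of a level (lvl >= 1) in percent, i.e.
 * the level granularity relative to the shortest timeout queued there.
 */
static double lvl_worst_case_pct(unsigned int lvl)
{
	assert(lvl >= 1 && lvl < WHEEL_DEPTH);
	return 100.0 * (double)lvl_granularity(lvl) / (double)lvl_start(lvl);
}
```

Under these assumptions wheel_capacity_hours(1000) comes out a bit above 37,
which is the cutoff in option 2, and lvl_worst_case_pct() yields 12.5 for
every level past the first, which is the 12.5% worst case in option 3. With
a 4ms base granularity (hz = 250 in the sketch) the same wheel reaches about
four times further.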

Thoughts?

Thanks,

tglx