Re: [clocksource] 8901ecc231: stress-ng.lockbus.ops_per_sec -9.5% regression

From: Paul E. McKenney
Date: Thu Aug 05 2021 - 11:33:12 EST


On Wed, Aug 04, 2021 at 09:34:13PM -0700, Andi Kleen wrote:
>
> > My current thought is that if more than (say) 100 consecutive attempts
> > to read the clocksource get hit with excessive delays, it is time to at
> > least do a WARN_ON(), and maybe also time to disable the clocksource
> > due to skew. The reason is that if reading the clocksource -always-
> > sees excessive delays, perhaps the clock driver or hardware is to blame.
> >
> > Thoughts?
>
> On TDX this would be fatal because we don't have a usable fallback source
> (just jiffies). Better try as hard as possible.

At some point, won't the system's suffering in silence become quite the
disservice to its users?

One alternative would be to give a warning splat, but avoid reporting
skew, unless there is the traditional 62.5ms of skew, of course.
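
As a rough illustration of that alternative, here is a userspace sketch
in C. It is one possible reading of the policy, not kernel code: every
name in it (wd_state, wd_classify, MAX_CONSEC_DELAYED, GROSS_SKEW_NS and
the verdict values) is invented for the example, the limit of 100 is the
"(say) 100" from above, and 62.5ms is written as one second shifted
right by four.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout; none of these exist in the kernel. */
#define MAX_CONSEC_DELAYED 100u               /* the "(say) 100" from above */
#define GROSS_SKEW_NS (1000000000LL >> 4)     /* 62,500,000 ns = 62.5 ms */

enum wd_verdict {
	WD_QUIET,          /* say nothing, try again on the next check */
	WD_WARN_ONLY,      /* warning splat, clocksource stays in use */
	WD_WARN_AND_SKEW,  /* warning splat and report skew */
};

struct wd_state {
	unsigned int consec_delayed;  /* delayed reads seen in a row */
};

/*
 * Classify one watchdog check. read_delayed says whether this read of the
 * clocksource was hit by an excessive delay; skew_ns is the signed
 * difference measured against the watchdog. The normal skew check for
 * clean reads is deliberately left out of this sketch.
 */
static enum wd_verdict wd_classify(struct wd_state *st, bool read_delayed,
				   int64_t skew_ns)
{
	int64_t abs_skew = skew_ns < 0 ? -skew_ns : skew_ns;

	if (!read_delayed) {
		st->consec_delayed = 0;   /* a clean read ends the streak */
		return WD_QUIET;
	}

	st->consec_delayed++;
	if (st->consec_delayed <= MAX_CONSEC_DELAYED)
		return WD_QUIET;          /* not "always" delayed yet */

	/*
	 * More than MAX_CONSEC_DELAYED delayed reads in a row: complain,
	 * but report skew only if it exceeds the gross 62.5 ms threshold.
	 */
	return abs_skew > GROSS_SKEW_NS ? WD_WARN_AND_SKEW : WD_WARN_ONLY;
}
```

With this shape, a system whose only fallback is jiffies would keep its
clocksource in the WD_WARN_ONLY case and lose it only on gross skew.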

Thanx, Paul