Re: [PATCH] [RFC] Potential fix for leapsecond caused futex related load spikes

From: Richard Cochran
Date: Tue Jul 03 2012 - 05:23:32 EST


On Mon, Jul 02, 2012 at 10:08:21PM +0200, Sytse Wielinga wrote:
> Hi Richard,
>
> On Mon, Jul 02, 2012 at 12:16:08PM +0200, Richard Cochran wrote:
> > I know you didn't like my (originally Michael Hack's) idea of keeping
> > time in TAI, but wouldn't changing to an internal, continuous time
> > scale (not necessarily TAI) solve these sorts of timer issues?
>
> Doesn't that actually make the problem of leap seconds worse, as you'd have to
> start tabulating past leap seconds in the kernel?

No, I am not suggesting that.

> Even worse, *future* leap seconds would need to be tracked and after they've
> happened stored on disk, and loaded back into the kernel after booting, which
> seems like a mess. The trouble here is that leap seconds are only announced a
> short while before they happen, so there's no way to bake leap seconds into
> the software; they need to be dynamically added by ntpd.
>
> Or is there somehow some way to avoid that?

I think the established practice of announcing the event by network is
the only sane way of handling this issue. The list of TAI-UTC offsets
belongs to what David Mills has called our "institutional memory", and
this is a user space issue. The kernel's job is to just live in the
moment and provide the right time for *now*.
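
To illustrate the split, here is a minimal user space sketch of such a
lookup. The names (tai_utc_entry, tai_utc_table, tai_utc_lookup) are
made up for this mail and are not an existing API. The table is only an
excerpt holding the three most recent offsets, and a real program would
load the full list from an announced source rather than compile it in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user space record: one row of the TAI-UTC history. */
struct tai_utc_entry {
	int64_t utc_start;     /* POSIX time at which the offset takes effect */
	int32_t tai_minus_utc; /* TAI - UTC in seconds from that instant on */
};

/* Excerpt only, sorted by utc_start. Older entries are left out. */
static const struct tai_utc_entry tai_utc_table[] = {
	{ 1136073600, 33 }, /* 2006-01-01 00:00:00 UTC */
	{ 1230768000, 34 }, /* 2009-01-01 00:00:00 UTC */
	{ 1341100800, 35 }, /* 2012-07-01 00:00:00 UTC */
};

/*
 * Find the offset in force at the given POSIX time.
 * Returns 0 and stores the offset on success, or -1 if the time
 * lies before the first entry of this (truncated) table.
 */
static int tai_utc_lookup(int64_t utc, int32_t *offset)
{
	size_t n = sizeof(tai_utc_table) / sizeof(tai_utc_table[0]);

	assert(offset != NULL);

	/* Walk backwards so the newest matching entry wins. */
	for (size_t i = n; i-- > 0; ) {
		if (utc >= tai_utc_table[i].utc_start) {
			*offset = tai_utc_table[i].tai_minus_utc;
			return 0;
		}
	}
	return -1;
}
```

None of this needs to live in the kernel. The kernel only has to know
the offset that applies right now and whether a leap is pending.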

> > There have been a number of clock/timer/leap bugs over the last few
> > years. Some of these might have been avoided by using a continuous
> > scale, since no special timer actions would be needed during a leap
> > second.
> >
> > The run time cost is low, just one additional test and addition when
> > reading the time. It might be worth it for the peace of mind when
> > the next leap second rolls around.
>
> I don't know if reworking the system that's been in place for ages is a good
> way to give us 'peace of mind'. Then again, I love to be enlightened :-)

There have been lockups and other kernel issues due to leap second
bugs. That is a fact. Does that give you peace of mind?

My own computers were off for the last leap second. But some people
cannot afford to do this. I suggest that changing the code so that no
special actions occur at a leap second would be more reliable than
having rarely tested code paths just for leap second handling.
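
As a rough sketch of what I mean (not a patch, and not existing kernel
code), the read path could look something like the following. The
struct leap_state and the function utc_from_internal() are hypothetical
names, and I am assuming whole seconds and a single pending leap event
to keep it short. Timers would run on the internal scale, which never
jumps, and the leap second shows up only in the conversion to UTC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state, set up ahead of time from the announced leap. */
struct leap_state {
	int64_t offset;     /* seconds added to internal time to get UTC */
	int64_t next_leap;  /* internal second at which the leap takes effect */
	int32_t next_delta; /* -1 inserted second, +1 deleted, 0 none pending */
};

/*
 * Convert a reading of the continuous internal scale to UTC seconds.
 * The only leap second handling is one test and one addition here.
 */
static int64_t utc_from_internal(const struct leap_state *ls, int64_t internal)
{
	int64_t offset;

	assert(ls != NULL);

	offset = ls->offset;
	if (ls->next_delta && internal >= ls->next_leap) /* the one test */
		offset += ls->next_delta;                /* the one addition */

	return internal + offset;
}
```

With offset = 0, next_leap = 1000 and next_delta = -1, internal readings
999, 1000 and 1001 map to UTC 999, 999 and 1000. The UTC value repeats
one second for the inserted leap, while the internal scale that timers
would use just keeps counting.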

Thanks,
Richard