Re: [RFC] possible erronous use of tick_usec in do_gettimeofday

From: George Anzinger
Date: Tue Nov 25 2003 - 18:27:13 EST


Joe Korty wrote:
On Tue, Nov 25, 2003 at 11:57:55AM -0800, George Anzinger wrote:

Joe Korty wrote:

test10's version of do_gettimeofday is using tick_usec which is
defined in terms of USER_HZ not HZ.

We still have the problem that we are doing this calculation in usecs while the wall clock uses nsecs. This would be fine if there were an even number of usecs in tick_nsec, but in fact it is somewhat less than (USEC_PER_SEC / HZ). This means that this correction (if we are behind by 7 or more ticks) will push the clock past current time. Here are the numbers:

tick_nsec =999849 or 1ms less 151 ns. So if we are behind 7 or more ticks we will report the time out 1 us too high. (7 * 151 = 1057 or 1.057 usec).

Question is, do we care? Will we ever be 7ms late in updating the wall clock? As I recall, the wall clock is updated in the interrupt handler for the tick so, to be this late, we would need to suffer a long interrupt hold off AND the tick recovery code would need to have done its thing. But this whole time is covered by a write_seqlock on xtime_lock, so how can this even happen? Seems like it is only possible when we are locked and we then throw the whole thing away.

A test I would like to see is to put this in the code AFTER the read unlock:

if (lost )
printk("Lost is %d\n", lost);

(need to pull " unsigned long lost;" out of the do{}while loop to do this)

In short, I think we are beating a dead issue.


There are other issues too: the 'lost' calculation is a prediction
over the next 'lost' number of ticks. That prediction will be wrong
if 1) adjtime goes to zero within that interval or, 2) adjtime was
zero but went nonzero in that interval due to a adjtimex(2) call.

Despite these flaws the patch replaces truly broken code with code
that is good but slightly inaccurate, which is good enough for now.

Can you prove that "lost" is EVER non-zero in a case we care about? I.e. a case where the read_seq will exit the loop?

I could be wrong here, but I don't think it can happen. That is why I suggested the if(lost) test.

--
George Anzinger george@xxxxxxxxxx
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/