Re: PATCH: POSIX 1003.1b timer minor fixes

Robert H. de Vries (R.de.Vries@fokkerspace.nl)
Thu, 29 Jul 1999 11:06:49 +0200 (CEST)


On Thu, 29 Jul 1999, Albert D. Cahalan wrote:

>
> Robert de Vries writes:
>
> > Let me illustrate with a small example of how I see the implementation in
> > the C library of clock_gettime().
> ...
> > int clock_gettime(clockid_t which_clock, struct timespec *current_time)
> > {
> > switch (which_clock) {
> > case CLOCK_TSC:
> > return tsc_gettime(current_time);
> > default:
> > return sys_clock_gettime(which_clock,
> > current_time);
> > }
> > }
>
>
> I suppose nobody wants to hear this right now, but...
>
> It appears that you expect the kernel to accept a pointer to a timespec
> struct. Why? You can't sanely work with data in that format.

Why not? There is a precedent in the form of struct timeval for
gettimeofday().

To illustrate a bit more what should be in the kernel or not I have taken
the liberty to do a bit of reverse engineering (or stealing code). I have
written a small program to see how IRIX 6.5 does it. Next I have run it
and traced the signal calls with par. (par -o par.out clock).

clock.c:

#include <stdio.h>
#include <unistd.h>
#include <time.h>

int main(void)
{
struct timespec ts, res;

clock_getres(CLOCK_REALTIME, &res);
printf("REALTIME: %d.%09d\n", res.tv_sec, res.tv_nsec);
clock_getres(CLOCK_SGI_CYCLE, &res);
printf("SGI_CYCLE: %d.%09d\n", res.tv_sec, res.tv_nsec);
clock_getres(CLOCK_SGI_FAST, &res);
printf("SGI_FAST: %d.%09d\n", res.tv_sec, res.tv_nsec);

clock_gettime(CLOCK_REALTIME, &ts);
printf("REALTIME: %d.%09d\n", ts.tv_sec, ts.tv_nsec);
clock_gettime(CLOCK_SGI_CYCLE, &ts);
printf("SGI_CYCLE: %d.%09d\n", ts.tv_sec, ts.tv_nsec);
clock_gettime(CLOCK_SGI_CYCLE, &ts);
printf("SGI_CYCLE: %d.%09d\n", ts.tv_sec, ts.tv_nsec);

clock_settime(CLOCK_REALTIME, &ts);
printf("REALTIME: %d.%09d\n", ts.tv_sec, ts.tv_nsec);
clock_settime(CLOCK_SGI_CYCLE, &ts);
printf("SGI_CYCLE: %d.%09d\n", ts.tv_sec, ts.tv_nsec);
return 0;
}

par.out interleaved with the functions calls from above:

clock_getres(CLOCK_REALTIME, &res);
printf("REALTIME: %d.%09d\n", res.tv_sec, res.tv_nsec);

13mS[ 0] : ioctl(1, __OLD_TCGETA, <bad indirect __old_termio param - len 30>) OK
13mS[ 0] : write(1, "REALTIME: 0.010000000\n", 22) = 22

clock_getres(CLOCK_SGI_CYCLE, &res);

14mS[ 0] : syssgi(SGI_QUERY_CYCLECNTR, 0x7fff2ea0)Px

printf("SGI_CYCLE: %d.%09d\n", res.tv_sec, res.tv_nsec);

14mS[ 0] : write(1, "SGI_CYCLE: 0.000000800\n", 23) = 23

clock_getres(CLOCK_SGI_FAST, &res);

14mS[ 0] : syssgi(SGI_QUERY_FASTTIMER, 0x7fff2ed0, 0x3, 0, 0xffffffff, 0xfb503c8) = 800

printf("SGI_FAST: %d.%09d\n", res.tv_sec, res.tv_nsec);

14mS[ 0] : write(1, "SGI_FAST: 0.000000800\n", 22) = 22

clock_gettime(CLOCK_REALTIME, &ts);

14mS[ 0] : gettimeofday({sec=933236104, usec=503357}) OK

printf("REALTIME: %d.%09d\n", ts.tv_sec, ts.tv_nsec);

14mS[ 0] : write(1, "REALTIME: 933236104.503357000\n", 30) = 30

clock_gettime(CLOCK_SGI_CYCLE, &ts);

14mS[ 0] : getpagesize() = 16384
14mS[ 0] : brk(0x1001c000) OK
14mS[ 0] : syssgi(SGI_CYCLECNTR_SIZE, 0x11, 0x100150c8, 0x100150b8, 0x100150b8, 0x18) = 64
14mS[ 0] : getpagesize() = 16384
14mS[ 0] : syssgi(SGI_QUERY_CYCLECNTR, 0x7fff2e20)Px
14mS[ 0] : open("/dev/mmem", O_RDONLY, 025566540000) = 4
14mS[ 0] : mmap(0, 16383, PROT_READ, MAP_PRIVATE, 4, 2916794368) = 0x4000000
14mS[ 0] : close(4) OK

printf("SGI_CYCLE: %d.%09d\n", ts.tv_sec, ts.tv_nsec);

14mS[ 0] : write(1, "SGI_CYCLE: 779061.448209600\n", 28) = 28

clock_gettime(CLOCK_SGI_CYCLE, &ts);
printf("SGI_CYCLE: %d.%09d\n", ts.tv_sec, ts.tv_nsec);

14mS[ 0] : write(1, "SGI_CYCLE: 779061.448300800\n", 28) = 28

clock_settime(CLOCK_REALTIME, &ts);

14mS[ 0] : settimeofday(0x7fff2ea0) errno = 1 (Operation not permitted)

printf("REALTIME: %d.%09d\n", ts.tv_sec, ts.tv_nsec);

14mS[ 0] : write(1, "REALTIME: 779061.448300800\n", 27) = 27

clock_settime(CLOCK_SGI_CYCLE, &ts);
printf("SGI_CYCLE: %d.%09d\n", ts.tv_sec, ts.tv_nsec);

14mS[ 0] : write(1, "SGI_CYCLE: 779061.448300800\n", 28) = 28

As can be seen is clock_getres implemented using system calls.
clock_gettime is implemented for CLOCK_REALTIME by calling gettimeofday().
For CLOCK_SGI_CYCLE the counter is on the first call mapped to memory and
for all subsequent calls no system calls are required.
clock_settime is implemented for CLOCK_REALTIME by calling settimeofday().
For CLOCK_SGI_CYCLE you cannot set the time, and clock_settime() returns
-1.

So the conclusion could be that of the clock_* functions only
clock_getres() needs to be a system call, and that the others can be
implemented in the C library by using [gs]ettimeofday().
However if someone is really interested in the nanosecond part of struct
timespec, then the clock_gettime() function should be a system call. The
overhead of a system call is currently in the order of one micro sec, so
measuring with a precision much better than that is misleading. But then
again, processors are getting faster all the time, and I'm sure that
system call overhead will be below 1 micro and then it might make sense.

I was experimenting with the TSC and it works really great! The only thing
I need now is a system call telling me what the resolution of the things
is in picoseconds and I'm happy to implement a timer with real nano second
resolution. As far as I know the TSC is incremented on every clock tick,
so that resolution should not be so hard.

This little program tells you how many cycles went by between two reads of
the TSC.

#include <stdio.h>
#include <asm/msr.h>

int main(void)
{
long long p1, p2, p3, p4;

rdtscll(p1);
rdtscll(p2);
rdtscll(p3);
rdtscll(p4);

printf("p1: %Ld\n", p1);
printf("p2: %Ld [%Ld]\n", p2, p2-p1);
printf("p3: %Ld [%Ld]\n", p3, p3-p2);
printf("p4: %Ld [%Ld]\n", p4, p4-p3);

return 0;
}

The output on my Pentium II 450 MHz is:

p1: 61897741731009
p2: 61897741731042 [33]
p3: 61897741731074 [32]
p4: 61897741731107 [33]

A small calculation learns that it takes approx. 70 nsec for one read.
Assembly code for the first rdtscll is:

rdtsc

movl %eax,-8(%ebp)
movl %edx,-4(%ebp)

The second one is compiled to:

rdtsc

movl %eax,%ebx
movl %edx,%esi

which maybe explains why the second delta is 1 less.

BTW. clock_getres for this clock would be 2 ns.
(more exactly 2.222222.... ns). As you can see the clock_getres function
must not be implemented with the POSIX API in the kernel or else you may
not be able to do accurate computations with it.

Robert

--
Robert H. de Vries
PO/SIM
Fokker Space B.V.
e-mail: R.de.Vries@fokkerspace.nl
   tel: (+31)71-5245464
   fax: (+31)71-5245498

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/