Re: [patch] __volatile__ needed in get_cycles()?

Tigran Aivazian (tigran@sco.COM)
Mon, 29 Mar 1999 13:19:49 +0100 (BST)


Hi Andrea,
On Mon, 29 Mar 1999, Andrea Arcangeli wrote:

> On Mon, 29 Mar 1999, Tigran Aivazian wrote:
>
> >which would enforce "Von Neumann execution stream", e.g. by doing CPUID
>
> What is a Von Neumann execution stream? ;)
P6 architecture (PPro, PII etc.) introduce speculative execution, i.e. if
you for example try to "profile" fdiv by putting a couple of rdtsc before
and after you will be told that fdiv took 0 cycles which is obviously not
true (I wish it was :). This happens because the processor decides that
the second rdtsc is independent from the fdiv and executes it first. So,
one needs to serialize it somehow and the easiest way I know of doing it
is cpuid (but one needs to remember that it clobbers registers).

> Why you need CPUID?
>
> If you need to know some features from the CPU you probably want to use
> the information we grabbed at boot time in the current_cpu_data struct.
It is just for serializing, not to get any cpu features.

>
> About the volatile thing the reason I thought it's not needed is that we
> care that rdtsc is run always at the same offset of code. We care only
> about the _delta_ between the two rdtsc. So basically I seen not using
> __volatile__ as a feature. Comments?
The first "care" you mean "don't care"? So I am not the only human being
who switches "yes/no", "do/don't".... Good.

Btw, __volatile__ in get_cycles() did go into 2.2.5. And the
justification, I hope, is that by doing so it can be used later on for
some other purpose. Putting __volatile__ does not make the current usage
of get_cycles() any worse so why not, if it gives you extra choice?

I personally use it to count the number of cycles it takes for a
particular code path (i.e. without having to enable profiling globally). I
know it is not very accurate (without the usual triple cache warmup
thingy and deducting overheads etc) but it *can* be used to compare one
instruction sequence with another.

There is a Intel's paper on this on developer.intel.com called something
like "Using RDTSC for Performance Monitoring" (not exactly, can't remember
the exact name).

Regards,
Tigran.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/