Re: [RFC PATCH 4/4] x86/TSC: Use RDTSCP

From: Andy Lutomirski
Date: Sat Dec 15 2018 - 13:53:50 EST


On Fri, Dec 14, 2018 at 5:39 AM David Laight <David.Laight@xxxxxxxxxx> wrote:
>
> From: Borislav Petkov
> > Sent: 12 December 2018 18:45
> ...
> > > The property I want for RDTSC ordering is much weaker: I want it to be
> > > ordered like a load. Imagine that, instead of an on-chip TSC, the TSC
> > > is literally a location in main memory that gets incremented by an
> > > extra dedicated CPU every nanosecond or so. I want users of RDTSC to
> > > work as if they were reading such a location in memory using an
> > > ordinary load. I believe this gives the real desired property that it
> > > should be impossible to observe the TSC going backwards. This is a
> > > much weaker form of serialization.
> >
> > Well, in that case you need something new.
> >
> > Because, the moment you have a RDTSC in flight and a second RDTSC comes
> > in and that second RDTSC must *not* bypass the first one and execute
> > earlier due to OoO, you need to impose some ordering. And that's pretty
> > much uarch-dependent, I'd say.
> >
> > And I guess on AMD the way to do that is to stop dispatch until the
> > first RDTSC retires.
> >
> > Can it be done faster? Sure. And I'm pretty sure there's a lot of pesky
> > little hw details we're not even hearing of, which get in the way.
>
> ISTR one of the problems with RDTSC serialising is that it is used
> for micro-benchmarks.

If you're benchmarking with that level of detail, you're probably
doing RDTSC directly instead of using the vDSO. Or, even better,
RDPMC.
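
For illustration, here is a minimal sketch of what "doing RDTSC
directly" tends to look like in a user-space micro-benchmark. It
assumes x86-64 and the GCC/Clang <x86intrin.h> intrinsics. The names
tsc_begin(), tsc_end() and measure_loop() are hypothetical, and the
LFENCE placement is just the commonly used pattern, not anything this
thread prescribes. How strongly LFENCE orders RDTSC depends on the
vendor and on system configuration.

```c
#include <stdint.h>
#include <x86intrin.h>  /* __rdtsc(), __rdtscp(), _mm_lfence() (GCC/Clang) */

/* Hypothetical helper: TSC read at the start of a measured region. */
static inline uint64_t tsc_begin(void)
{
	uint64_t t;

	_mm_lfence();   /* let earlier instructions finish before the read */
	t = __rdtsc();
	_mm_lfence();   /* keep the measured code from starting early */
	return t;
}

/* Hypothetical helper: TSC read at the end of a measured region. */
static inline uint64_t tsc_end(void)
{
	unsigned int aux;   /* receives IA32_TSC_AUX; unused in this sketch */
	uint64_t t;

	t = __rdtscp(&aux); /* RDTSCP waits for prior instructions to execute */
	_mm_lfence();       /* keep later code from moving above the read */
	return t;
}

/* Returns the TSC delta across a trivial loop of 'iters' iterations. */
static uint64_t measure_loop(unsigned int iters)
{
	volatile uint64_t sink = 0; /* volatile so the loop is not optimized away */
	uint64_t start, end;
	unsigned int i;

	start = tsc_begin();
	for (i = 0; i < iters; i++)
		sink += i;
	end = tsc_end();

	return end - start;
}
```

The result is in TSC ticks, not core cycles or nanoseconds, and it is
only meaningful if the thread stays on one CPU (or the TSCs are
synchronized) for the duration of the measurement.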