Re: [RFT/PATCH v2 3/6] x86-64: Don't generate cmov in vread_tsc

From: Ingo Molnar
Date: Thu Apr 07 2011 - 03:55:14 EST



* Andy Lutomirski <luto@xxxxxxx> wrote:

> vread_tsc checks whether rdtsc returns something less than
> cycle_last, which is an extremely predictable branch. GCC likes
> to generate a cmov anyway, which is several cycles slower than
> a predicted branch. This saves a couple of nanoseconds.
>
> Signed-off-by: Andy Lutomirski <luto@xxxxxxx>
> ---
> arch/x86/kernel/tsc.c | 19 +++++++++++++++----
> 1 files changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index 858c084..69ff619 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -794,14 +794,25 @@ static cycle_t __vsyscall_fn vread_tsc(void)
> */
>
> /*
> - * This doesn't multiply 'zero' by anything, which *should*
> - * generate nicer code, except that gcc cleverly embeds the
> - * dereference into the cmp and the cmovae. Oh, well.
> + * This doesn't multiply 'zero' by anything, which generates
> + * very slightly nicer code than multiplying it by 8.
> */
> last = *( (cycle_t *)
> ((char *)&VVAR(vsyscall_gtod_data).clock.cycle_last + zero) );
>
> - return ret >= last ? ret : last;
> + if (likely(ret >= last))
> + return ret;
> +
> + /*
> + * GCC likes to generate cmov here, but this branch is extremely
> * predictable (it's just a function of time and the likely is
> + * very likely) and there's a data dependence, so force GCC
> + * to generate a branch instead. I don't barrier() because
> + * we don't actually need a barrier, and if this function
> + * ever gets inlined it will generate worse code.
> + */
> + asm volatile ("");

Hm, you have not addressed the review feedback I gave in:

Message-ID: <20110329061546.GA27398@xxxxxxx>

Thanks,

Ingo
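
For readers following along, the branch-instead-of-cmov idea in the quoted
hunk can be sketched outside the kernel roughly as below. This is only an
illustration, not the patch itself: clamp_to_last() and its parameters are
hypothetical names, and likely() is redefined here as a stand-in for the
kernel macro using GCC's __builtin_expect. Whether a given GCC version
actually emits a branch rather than a cmov depends on the compiler and
flags, so the generated assembly should be checked rather than assumed.

```c
#include <stdint.h>

/* Stand-in for the kernel's likely() macro (assumption: GCC or Clang). */
#define likely(x) __builtin_expect(!!(x), 1)

/*
 * Hypothetical helper: return the larger of 'now' and 'last'.
 *
 * The common case (now >= last) returns early.  The empty asm statement
 * on the rare path emits no instructions; it only makes the two paths
 * differ, which discourages the compiler from folding them into a
 * single conditional move.  It has no "memory" clobber, so unlike the
 * kernel's barrier() it does not act as a compiler memory barrier.
 */
static uint64_t clamp_to_last(uint64_t now, uint64_t last)
{
    if (likely(now >= last))
        return now;

    __asm__ volatile ("");
    return last;
}
```

Functionally this is the same as "now >= last ? now : last"; only the code
generation is intended to differ. Comparing the output of "gcc -O2 -S" for
both forms would show whether the hint takes effect on a given toolchain.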