Re: [RFT/PATCH v2 3/6] x86-64: Don't generate cmov in vread_tsc

From: Andrew Lutomirski
Date: Thu Apr 07 2011 - 07:26:03 EST


On Thu, Apr 7, 2011 at 3:54 AM, Ingo Molnar <mingo@xxxxxxx> wrote:
>
> * Andy Lutomirski <luto@xxxxxxx> wrote:
>
>> vread_tsc checks whether rdtsc returns something less than
>> cycle_last, which is an extremely predictable branch.  GCC likes
>> to generate a cmov anyway, which is several cycles slower than
>> a predicted branch.  This saves a couple of nanoseconds.
>>
>> Signed-off-by: Andy Lutomirski <luto@xxxxxxx>
>> ---
>>  arch/x86/kernel/tsc.c |   19 +++++++++++++++----
>>  1 files changed, 15 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
>> index 858c084..69ff619 100644
>> --- a/arch/x86/kernel/tsc.c
>> +++ b/arch/x86/kernel/tsc.c
>> @@ -794,14 +794,25 @@ static cycle_t __vsyscall_fn vread_tsc(void)
>>        */
>>
>>       /*
>> -      * This doesn't multiply 'zero' by anything, which *should*
>> -      * generate nicer code, except that gcc cleverly embeds the
>> -      * dereference into the cmp and the cmovae.  Oh, well.
>> +      * This doesn't multiply 'zero' by anything, which generates
>> +      * very slightly nicer code than multiplying it by 8.
>>        */
>>       last = *( (cycle_t *)
>>                 ((char *)&VVAR(vsyscall_gtod_data).clock.cycle_last + zero) );
>>
>> -     return ret >= last ? ret : last;
>> +     if (likely(ret >= last))
>> +             return ret;
>> +
>> +     /*
>> +      * GCC likes to generate cmov here, but this branch is extremely
>> +      * predictable (it's just a function of time and the likely is
>> +      * very likely) and there's a data dependence, so force GCC
>> +      * to generate a branch instead.  I don't barrier() because
>> +      * we don't actually need a barrier, and if this function
>> +      * ever gets inlined it will generate worse code.
>> +      */
>> +     asm volatile ("");
>
> Hm, you have not addressed the review feedback i gave in:
>
>  Message-ID: <20110329061546.GA27398@xxxxxxx>

I can change that, but if anyone ever inlines this function (and Andi
suggested that as another future optimization), then they'd want to
undo it, because it will generate worse code. (barrier() has the
unnecessary memory clobber.)
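
To make the distinction concrete, here is a minimal user-space sketch of the two variants. The function names clamp_empty_asm() and clamp_barrier() are hypothetical stand-ins for the tail of vread_tsc(), not code from the patch; the second one uses the same asm statement that the kernel's barrier() macro expands to. Whether GCC actually emits a branch rather than a cmov depends on the compiler version and flags, so treat this as an illustration of the idea rather than a guarantee.

```c
#include <stdint.h>

/* Same shape as the kernel's likely(): hint that the condition is usually true. */
#define likely(x) __builtin_expect(!!(x), 1)

/*
 * Hypothetical stand-in for the end of vread_tsc(), following the patch:
 * an empty asm statement with no clobbers sits on the unlikely path.
 * It tells the compiler nothing about memory, so values already held in
 * registers stay valid across it.
 */
static uint64_t clamp_empty_asm(uint64_t ret, const uint64_t *last_p)
{
	uint64_t last = *last_p;

	if (likely(ret >= last))
		return ret;

	__asm__ __volatile__("");
	return last;
}

/*
 * Same logic, but with the "memory" clobber that barrier() carries.
 * The clobber makes the compiler assume memory may have changed, which
 * is harmless here but can force reloads in a caller if this is inlined.
 */
static uint64_t clamp_barrier(uint64_t ret, const uint64_t *last_p)
{
	uint64_t last = *last_p;

	if (likely(ret >= last))
		return ret;

	__asm__ __volatile__("" ::: "memory");
	return last;
}
```

Both functions return the same values; the only difference is what the compiler is allowed to assume about memory after the asm statement. Comparing the output of gcc -O2 -S for the two is one way to check what a given compiler does with each.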

--Andy

>
> Thanks,
>
>        Ingo
>