Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core() implementation

From: Andrew Cooper
Date: Fri Dec 02 2016 - 06:44:38 EST


On 02/12/16 00:35, Andy Lutomirski wrote:
> On Xen PV, CPUID is likely to trap, and Xen hypercalls aren't
> guaranteed to serialize. (Even CPUID isn't *really* guaranteed to
> serialize on Xen PV, but, in practice, any trap it generates will
> serialize.)

Well, Xen will enabled CPUID Faulting wherever it can, which is
realistically all IvyBridge hardware and newer.

All hypercalls are a privilege change to cpl0. I'd hope this condition
is serialising, but I can't actually find any documentation proving or
disproving this.

>
> On my laptop, CPUID(eax=1, ecx=0) is ~83ns and IRET-to-self is
> ~110ns. But Xen PV will trap CPUID if possible, so IRET-to-self
> should end up being a nice speedup.
>
> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>

CC'ing xen-devel and the Xen maintainers in Linux.

As this is the only email from this series in my inbox, I will say this
here, but it should really be against patch 6.

A write to %cr2 is apparently (http://sandpile.org/x86/coherent.htm) not
serialising on the 486, but I don't have a manual to hand to check.

~Andrew

> ---
> arch/x86/xen/enlighten.c | 35 +++++++++++++++++++++++++++++++++++
> 1 file changed, 35 insertions(+)
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index bdd855685403..1f765b41eee7 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -311,6 +311,39 @@ static __read_mostly unsigned int cpuid_leaf1_ecx_set_mask;
> static __read_mostly unsigned int cpuid_leaf5_ecx_val;
> static __read_mostly unsigned int cpuid_leaf5_edx_val;
>
> +static void xen_sync_core(void)
> +{
> + register void *__sp asm(_ASM_SP);
> +
> +#ifdef CONFIG_X86_32
> + asm volatile (
> + "pushl %%ss\n\t"
> + "pushl %%esp\n\t"
> + "addl $4, (%%esp)\n\t"
> + "pushfl\n\t"
> + "pushl %%cs\n\t"
> + "pushl $1f\n\t"
> + "iret\n\t"
> + "1:"
> + : "+r" (__sp) : : "cc");
> +#else
> + unsigned long tmp;
> +
> + asm volatile (
> + "movq %%ss, %0\n\t"
> + "pushq %0\n\t"
> + "pushq %%rsp\n\t"
> + "addq $8, (%%rsp)\n\t"
> + "pushfq\n\t"
> + "movq %%cs, %0\n\t"
> + "pushq %0\n\t"
> + "pushq $1f\n\t"
> + "iretq\n\t"
> + "1:"
> + : "=r" (tmp), "+r" (__sp) : : "cc");
> +#endif
> +}
> +
> static void xen_cpuid(unsigned int *ax, unsigned int *bx,
> unsigned int *cx, unsigned int *dx)
> {
> @@ -1289,6 +1322,8 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
>
> .start_context_switch = paravirt_start_context_switch,
> .end_context_switch = xen_end_context_switch,
> +
> + .sync_core = xen_sync_core,
> };
>
> static void xen_reboot(int reason)