Re: [PATCH] x86/hyper-v: use cheaper HVCALL_FLUSH_VIRTUAL_ADDRESS_{LIST, SPACE} hypercalls when possible

From: Thomas Gleixner
Date: Tue Jun 19 2018 - 09:04:44 EST


On Tue, 19 Jun 2018, Vitaly Kuznetsov wrote:
> Thomas Gleixner <tglx@xxxxxxxxxxxxx> writes:
>
> > On Fri, 15 Jun 2018, Vitaly Kuznetsov wrote:
> >> * Fills in gva_list starting from offset. Returns the number of items added.
> >> @@ -93,10 +95,19 @@ static void hyperv_flush_tlb_others(const struct cpumask *cpus,
> >> if (cpumask_equal(cpus, cpu_present_mask)) {
> >> flush->flags |= HV_FLUSH_ALL_PROCESSORS;
> >> } else {
> >> + /*
> >> + * It is highly likely that VP ids are in ascending order
> >> + * matching Linux CPU ids; Check VP index for the highest CPU
> >> + * in the supplied set to see if EX hypercall is required.
> >> + * This is just a best guess but should work most of the time.
> >
> > TLB flushing based on 'best guess' and 'should work most of the time' is
> > not a brilliant approach.
> >
>
> Oh no no no, that's not what I meant :-)
>
> We have the following problem: from the supplied CPU set we need to
> figure out if we can get away with HVCALL_FLUSH_VIRTUAL_ADDRESS_{LIST,
> SPACE} hypercalls which are cheaper or if we need to use more expensive
> HVCALL_FLUSH_VIRTUAL_ADDRESS_{LIST, SPACE}_EX ones. The dividing line is
> the highest VP_INDEX of the supplied CPU set: in case it is < 64 cheaper
> hypercalls are OK. Now how do we check that? In the patch I have the
> following approach:
> 1) Check VP number for the highest CPU in the supplied set. In case it
> is >= 64 we for sure need more expensive hypercalls. This is the "guess"
> which works most of the time because Linux CPU ids usually match
> VP_INDEXes.
>
> 2) In case the answer to the previous question was negative we start
> preparing input for the cheaper hypercall. However, if while walking the
> CPU set we meet a CPU with VP_INDEX of 64 or higher we'll discard the
> prepared input and switch to the more expensive hypercall.
>
> That said, the 'guess' here is just an optimization to avoid walking the
> whole CPU set when we find the required answer quickly by looking at the
> highest bit. This will help big systems with hundreds of CPUs.
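
The two-step check described above can be sketched in plain user-space C.
This is only an illustration, not the code from the patch: the function name
build_simple_vp_mask(), the bool-array CPU set and the vp_index[] table are
hypothetical stand-ins for the kernel's cpumask and VP index lookup. It also
assumes that the cheaper hypercalls take a single 64-bit VP mask, which is
what the "< 64" dividing line suggests.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the non-EX hypercalls address VPs through one 64-bit mask. */
#define SIMPLE_MASK_BITS 64

/*
 * Hypothetical helper: returns true and fills *mask when every CPU in the
 * set has a VP index below 64, so the cheaper hypercall can be used.
 * Returns false (leaving *mask untouched) when the _EX variant is needed.
 */
static bool build_simple_vp_mask(const bool *cpu_in_set,
                                 const uint32_t *vp_index,
                                 size_t ncpus, uint64_t *mask)
{
    size_t cpu, last = ncpus;
    uint64_t m = 0;

    /* Find the highest CPU id present in the set. */
    for (cpu = ncpus; cpu-- > 0; ) {
        if (cpu_in_set[cpu]) {
            last = cpu;
            break;
        }
    }
    if (last == ncpus) {        /* empty set: nothing to flush */
        *mask = 0;
        return true;
    }

    /*
     * Step 1: shortcut. If the highest CPU already has a VP index that
     * does not fit, there is no point in walking the whole set.
     */
    if (vp_index[last] >= SIMPLE_MASK_BITS)
        return false;

    /*
     * Step 2: build the mask, but still verify every CPU, because VP
     * indexes are not guaranteed to follow Linux CPU id order.
     */
    for (cpu = 0; cpu <= last; cpu++) {
        if (!cpu_in_set[cpu])
            continue;
        if (vp_index[cpu] >= SIMPLE_MASK_BITS)
            return false;       /* discard the partial mask */
        m |= UINT64_C(1) << vp_index[cpu];
    }

    *mask = m;
    return true;
}
```

Step 1 never decides correctness on its own: it only lets the function bail
out early. A set is accepted for the cheaper hypercall only after step 2 has
checked every member.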

Care to fix the comment to avoid the offending words?

Thanks,

tglx