Re: [PATCH 1/4] kvm: cpuid: adjust the returned nent field of kvm_cpuid2 for KVM_GET_SUPPORTED_CPUID and KVM_GET_EMULATED_CPUID

From: Vitaly Kuznetsov
Date: Wed Mar 31 2021 - 07:26:19 EST


Emanuele Giuseppe Esposito <eesposit@xxxxxxxxxx> writes:

> On 31/03/2021 09:56, Vitaly Kuznetsov wrote:
>> Emanuele Giuseppe Esposito <eesposit@xxxxxxxxxx> writes:
>>
>>> On 31/03/2021 05:01, Sean Christopherson wrote:
>>>> On Tue, Mar 30, 2021, Emanuele Giuseppe Esposito wrote:
>>>>> Calling the kvm KVM_GET_[SUPPORTED/EMULATED]_CPUID ioctl requires
>>>>> a nent field inside the kvm_cpuid2 struct to be big enough to contain
>>>>> all entries that will be set by kvm.
>>>>> Therefore if the nent field is too high, kvm will adjust it to the
>>>>> right value. If too low, -E2BIG is returned.
>>>>>
>>>>> However, when filling the entries do_cpuid_func() requires an
>>>>> additional entry, so if the right nent is known in advance,
>>>>> giving the exact number of entries won't work because it has to be increased
>>>>> by one.
>>>>>
>>>>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@xxxxxxxxxx>
>>>>> ---
>>>>> arch/x86/kvm/cpuid.c | 6 ++++++
>>>>> 1 file changed, 6 insertions(+)
>>>>>
>>>>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>>>>> index 6bd2f8b830e4..5412b48b9103 100644
>>>>> --- a/arch/x86/kvm/cpuid.c
>>>>> +++ b/arch/x86/kvm/cpuid.c
>>>>> @@ -975,6 +975,12 @@ int kvm_dev_ioctl_get_cpuid(struct kvm_cpuid2 *cpuid,
>>>>>
>>>>> if (cpuid->nent < 1)
>>>>> return -E2BIG;
>>>>> +
>>>>> + /* if there are X entries, we need to allocate at least X+1
>>>>> + * entries but return the actual number of entries
>>>>> + */
>>>>> + cpuid->nent++;
>>>>
>>>> I don't see how this can be correct.
>>>>
>>>> If this bonus entry really is needed, then won't that be reflected in array.nent?
>>>> I.e won't KVM overrun the userspace buffer?
>>>>
>>>> If it's not reflected in array.nent, that would imply there's an off-by-one check
>>>> somewhere, or KVM is creating an entry that it doesn't copy to userspace. The
>>>> former seems unlikely as there are literally only two checks against maxnent,
>>>> and they both look correct (famous last words...).
>>>>
>>>> KVM does decrement array->nent in one specific case (CPUID.0xD.2..64), i.e. a
>>>> false positive is theoretically possible, but that carries a WARN and requires a
>>>> kernel or CPU bug as well. And fudging nent for that case would still break
>>>> normal use cases due to the overrun problem.
>>>>
>>>> What am I missing?
>>>
>>> (Maybe I should have put this series as RFC)
>>>
>>> The problem I see and noticed while doing the KVM_GET_EMULATED_CPUID
>>> selftest is the following: assume there are 3 kvm emulated entries, and
>>> the user sets cpuid->nent = 3. This should work because kvm sets 3
>>> array->entries[], and copies them to user space.
>>>
>>> However, when the 3rd entry is populated inside kvm (array->entries[2]),
>>> array->nent is increased once more (do_host_cpuid and
>>> __do_cpuid_func_emulated). At that point, the loop in
>>> kvm_dev_ioctl_get_cpuid and get_cpuid_func can potentially iterate once
>>> more, going into the
>>>
>>> if (array->nent >= array->maxnent)
>>> return -E2BIG;
>>>
>>> in __do_cpuid_func_emulated and do_host_cpuid, returning the error. I
>>> agree that we need that check there because the following code tries to
>>> access the array entry at array->nent index, but from what I understand
>>> that access can be potentially useless because it might just jump to the
>>> default entry in the switch statement and not set the entry, leaving
>>> array->nent to 3.
>>
>> The problem seems to be exclusive to __do_cpuid_func_emulated();
>> do_host_cpuid() always does
>>
>> entry = &array->entries[array->nent++];
>>
>> Something like (completely untested and stupid):
>>
>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> index 6bd2f8b830e4..54dcabd3abec 100644
>> --- a/arch/x86/kvm/cpuid.c
>> +++ b/arch/x86/kvm/cpuid.c
>> @@ -565,14 +565,22 @@ static struct kvm_cpuid_entry2 *do_host_cpuid(struct kvm_cpuid_array *array,
>> return entry;
>> }
>>
>> +static bool cpuid_func_emulated(u32 func)
>> +{
>> + return (func == 0) || (func == 1) || (func == 7);
>> +}
>> +
>> static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
>> {
>> struct kvm_cpuid_entry2 *entry;
>>
>> + if (!cpuid_func_emulated(func))
>> + return 0;
>> +
>> if (array->nent >= array->maxnent)
>> return -E2BIG;
>>
>> - entry = &array->entries[array->nent];
>> + entry = &array->entries[array->nent++];
>> entry->function = func;
>> entry->index = 0;
>> entry->flags = 0;
>> @@ -580,18 +588,14 @@ static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
>> switch (func) {
>> case 0:
>> entry->eax = 7;
>> - ++array->nent;
>> break;
>> case 1:
>> entry->ecx = F(MOVBE);
>> - ++array->nent;
>> break;
>> case 7:
>> entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
>> entry->eax = 0;
>> entry->ecx = F(RDPID);
>> - ++array->nent;
>> - default:
>> break;
>> }
>>
>> should do the job, right?
>>
>>
>
> Yes, it would work better. Alternatively:
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index ba7437308d28..452b0acd6e9d 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -567,34 +567,37 @@ static struct kvm_cpuid_entry2 *do_host_cpuid(struct kvm_cpuid_array *array,
>
> static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
> {
> - struct kvm_cpuid_entry2 *entry;
> -
> - if (array->nent >= array->maxnent)
> - return -E2BIG;
> + struct kvm_cpuid_entry2 entry;
> + bool changed = true;
>
> - entry = &array->entries[array->nent];
> - entry->function = func;
> - entry->index = 0;
> - entry->flags = 0;
> + entry.function = func;
> + entry.index = 0;
> + entry.flags = 0;
>
> switch (func) {
> case 0:
> - entry->eax = 7;
> - ++array->nent;
> + entry.eax = 7;
> break;
> case 1:
> - entry->ecx = F(MOVBE);
> - ++array->nent;
> + entry.ecx = F(MOVBE);
> break;
> case 7:
> - entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
> - entry->eax = 0;
> - entry->ecx = F(RDPID);
> - ++array->nent;
> + entry.flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
> + entry.eax = 0;
> + entry.ecx = F(RDPID);
> + break;
> default:
> + changed = false;
> break;
> }
>
> + if (changed) {
> + if (array->nent >= array->maxnent)
> + return -E2BIG;
> +
> + memcpy(&array->entries[array->nent++], &entry, sizeof(entry));
> + }
> +
> return 0;
> }
>
> pros: avoids hard-coding another function that would check what the
> switch already does. it will be more flexible if another func has to be
> added. cons: there is a memcpy for each entry.

Looks good to me,

I'd just drop 'bool changed' and replace it with 'goto out' in the
'default' case.
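
To illustrate the 'goto out' idea, here is a rough userspace sketch, not
the real KVM code. The struct definitions, the flag and feature-bit
macros, and the fill_emulated() driver are simplified stand-ins that I
made up for the example. It also zeroes the local entry first, since a
stack variable is not pre-zeroed:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for the kernel structures (assumptions). */
struct cpuid_entry {
	uint32_t function, index, flags;
	uint32_t eax, ebx, ecx, edx;
};

struct cpuid_array {
	struct cpuid_entry *entries;
	int maxnent;
	int nent;
};

/* Stand-in values for illustration, not taken from kernel headers. */
#define FLAG_SIGNIFICANT_INDEX	(1u << 0)
#define F_MOVBE			(1u << 22)
#define F_RDPID			(1u << 22)

static int do_cpuid_func_emulated(struct cpuid_array *array, uint32_t func)
{
	struct cpuid_entry entry;

	/* Zero everything so no stale stack data ends up in the array. */
	memset(&entry, 0, sizeof(entry));
	entry.function = func;

	switch (func) {
	case 0:
		entry.eax = 7;
		break;
	case 1:
		entry.ecx = F_MOVBE;
		break;
	case 7:
		entry.flags |= FLAG_SIGNIFICANT_INDEX;
		entry.ecx = F_RDPID;
		break;
	default:
		/* Not emulated: no slot is consumed, so skip the size check. */
		goto out;
	}

	if (array->nent >= array->maxnent)
		return -E2BIG;

	memcpy(&array->entries[array->nent++], &entry, sizeof(entry));
out:
	return 0;
}

/* Hypothetical driver: walk leaves 0..7, then one non-emulated leaf. */
static int fill_emulated(struct cpuid_array *array)
{
	uint32_t func;
	int r;

	assert(array->entries != NULL);

	for (func = 0; func <= 7; func++) {
		r = do_cpuid_func_emulated(array, func);
		if (r)
			return r;
	}

	/* With the array already full, this must not return -E2BIG. */
	return do_cpuid_func_emulated(array, 0x40000000);
}
```

With three emulated entries, a buffer of exactly three slots is then
enough, and only a genuinely too-small buffer gets -E2BIG.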

memcpy() here is not a problem, I believe; this path is not that
performance-critical.

--
Vitaly