Re: [PATCH RFC v7 62/64] x86/sev: Add KVM commands for instance certs

From: Dov Murik
Date: Wed Jan 11 2023 - 01:01:46 EST




On 10/01/2023 17:10, Tom Lendacky wrote:
> On 1/10/23 01:10, Dov Murik wrote:
>> Hi Tom,
>>
>> On 10/01/2023 0:27, Tom Lendacky wrote:
>>> On 1/9/23 10:55, Dionna Amalie Glaze wrote:
>>>>>> +
>>>>>> +static int snp_set_instance_certs(struct kvm *kvm, struct
>>>>>> kvm_sev_cmd *argp)
>>>>>> +{
>>>>> [...]
>>>>>
>>>>> Here we set the length to the page-aligned value, but we copy only
>>>>> params.cert_len bytes.  If there are two subsequent
>>>>> snp_set_instance_certs() calls where the second one has a shorter
>>>>> length, we might "keep" some leftover bytes from the first call.
>>>>>
>>>>> Consider:
>>>>> 1. snp_set_instance_certs(certs_addr point to "AAA...",
>>>>> certs_len=8192)
>>>>> 2. snp_set_instance_certs(certs_addr point to "BBB...",
>>>>> certs_len=4097)
>>>>>
>>>>> If I understand correctly, on the second call we'll copy 4097 "BBB..."
>>>>> bytes into the to_certs buffer, but length will be (4096 + PAGE_SIZE -
>>>>> 1) & PAGE_MASK which will be 8192.
>>>>>
>>>>> Later when fetching the certs (for the extended report or in
>>>>> snp_get_instance_certs()) the user will get a buffer of 8192 bytes
>>>>> filled with 4097 BBBs and 4095 leftover AAAs.
>>>>>
>>>>> Maybe zero sev->snp_certs_data entirely before writing to it?
>>>>>
>>>>
>>>> Yes, I agree it should be zeroed, at least if the previous length is
>>>> greater than the new length. Good catch.
>>>>
>>>>
>>>>> Related question (not only for this patch) regarding snp_certs_data
>>>>> (host or per-instance): why is its size page-aligned at all? why is it
>>>>> limited by 16KB or 20KB? If I understand correctly, for SNP, this
>>>>> buffer
>>>>> is never sent to the PSP.
>>>>>
>>>>
>>>> The buffer is meant to be copied into the guest driver following the
>>>> GHCB extended guest request protocol. The data to copy back are
>>>> expected to be in 4K page granularity.
>>>
>>> I don't think the data has to be in 4K page granularity. Why do you
>>> think it does?
>>>
>>
>> I looked at AMD publication 56421 SEV-ES Guest-Hypervisor Communication
>> Block Standardization (July 2022), page 37.  The table says:
>>
>> --------------
>>
>> NAE Event: SNP Extended Guest Request
>>
>> Notes:
>>
>> RAX will have the guest physical address of the page(s) to hold returned
>> data
>>
>> RBX
>> State to Hypervisor: will contain the number of guest contiguous
>> pages supplied to hold returned data
>> State from Hypervisor: on error will contain the number of guest
>> contiguous pages required to hold the data to be returned
>>
>> ...
>>
>> The request page, response page and data page(s) must be assigned to the
>> hypervisor (shared).
>>
>> --------------
>>
>>
>> According to this spec, it looks like the sizes are communicated as
>> number of pages in RBX.  So the data should start at a 4KB alignment
>> (this is verified in snp_handle_ext_guest_request()) and its length
>> should be 4KB-aligned, as Dionna noted.
>
> That only indicates how many pages are required to hold the data, but
> the hypervisor only has to copy however much data is present. If the
> data is 20 bytes, then you only have to copy 20 bytes. If the user
> supplied 0 for the number of pages, then the code returns 1 in RBX to
> indicate that one page is required to hold the 20 bytes.
>


Maybe it should only copy 20 bytes, but current implementation copies
whole 4KB pages:


if (sev->snp_certs_len)
data_npages = sev->snp_certs_len >> PAGE_SHIFT;
...
...
/* Copy the certificate blob in the guest memory */
if (data_npages &&
kvm_write_guest(kvm, data_gpa, sev->snp_certs_data, data_npages << PAGE_SHIFT))
rc = SEV_RET_INVALID_ADDRESS;


(elsewhere we ensure that sev->snp_certs_len is page-aligned, so the assignment
to data_npages is in fact correct even though looks off-by-one; aside, maybe it's
better to use some DIV_ROUND_UP macro anywhere we calculate the number of
needed pages.)

Also -- how does the guest know they got only 20 bytes and not 4096? Do they have
to read all the 'struct cert_table' entries at the beginning of the received data?

-Dov


>>
>> I see no reason (in the spec and in the kernel code) for the data length
>> to be limited to 16KB (SEV_FW_BLOB_MAX_SIZE) but I might be missing some
>> flow because Dionna ran into this limit.
>
> Correct, there is no limit. I believe that SEV_FW_BLOB_MAX_SIZE is a way
> to keep the memory usage controlled because data is coming from
> userspace and it isn't expected that the data would be larger than that.
>
> I'm not sure if that was in from the start or as a result of a review
> comment. Not sure what is the best approach is.
>
> Thanks,
> Tom
>
>>
>>
>> -Dov
>>
>>
>>
>>> Thanks,
>>> Tom
>>>
>>>>
>>>>> [...]
>>>>>>
>>>>>> -#define SEV_FW_BLOB_MAX_SIZE 0x4000  /* 16KB */
>>>>>> +#define SEV_FW_BLOB_MAX_SIZE 0x5000  /* 20KB */
>>>>>>
>>>>>
>>>>> This has effects in drivers/crypto/ccp/sev-dev.c
>>>>>                                                                  (for
>>>>> example in alloc_snp_host_map).  Is that OK?
>>>>>
>>>>
>>>> No, this was a mistake of mine because I was using a bloated data
>>>> encoding that needed 5 pages for the GUID table plus 4 small
>>>> certificates. I've since fixed that in our user space code.
>>>> We shouldn't change this size and instead wait for a better size
>>>> negotiation protocol between the guest and host to avoid this awkward
>>>> hard-coding.
>>>>
>>>>