Re: [PATCH] crypto/ccp: Fix locking on alloc failure handling

From: Tom Lendacky
Date: Tue Jun 24 2025 - 11:35:26 EST


On 6/23/25 20:41, Alexey Kardashevskiy wrote:
> On 21/6/25 05:20, Tom Lendacky wrote:
>> On 6/17/25 04:43, Alexey Kardashevskiy wrote:
>>> The __snp_alloc_firmware_pages() helper allocates pages in the firmware
>>> state (alloc + rmpupdate). In case of failed rmpupdate, it tries
>>> reclaiming pages with already changed state. This requires calling
>>> the PSP firmware and since there is sev_cmd_mutex to guard such calls,
>>> the helper takes a "locked" parameter so specify if the lock needs to
>>> be held.
>>>
>>> Most calls happen from snp_alloc_firmware_page() which executes without
>>> the lock. However
>>>
>>> commit 24512afa4336 ("crypto: ccp: Handle the legacy TMR allocation
>>> when SNP is enabled")
>>>
>>> switched sev_fw_alloc() from alloc_pages() (which does not call the
>>> PSP) to
>>> __snp_alloc_firmware_pages() (which does) but did not account for the fact
>>> that sev_fw_alloc() is called from __sev_platform_init_locked()
>>> (via __sev_platform_init_handle_tmr()) and executes with the lock held.
>>>
>>> Add a "locked" parameter to __snp_alloc_firmware_pages().
>>> Make sev_fw_alloc() use the new parameter to prevent potential deadlock in
>>> rmp_mark_pages_firmware() if rmpupdate() failed.
>>
>> Would it make sense to add the locked parameter to sev_fw_alloc(), too?
>
> That would be another patch then, this one is a fix ;)
>
> and I'd probably just ditch both snp_alloc_firmware_page() and
> sev_fw_alloc(), rename __snp_alloc_firmware_pages() to
> snp_alloc_firmware_page() and just use this one everywhere. Nobody needs
> page struct anyway, and the locking will be clear everywhere. Also do the
> same for snp_free_firmware_page().
>
> It is just that snp_alloc_firmware_page() and snp_free_firmware_page() are
> EXPORT_SYMBOL_GPL,
>
>> Right now there is only one caller of sev_fw_alloc(), but in the future,
>> if some other path should call sev_fw_alloc() and that path doesn't have
>> the lock, then we'll miss taking it.
>
> I'd rather just ditch sev_fw_alloc(), does not look very useful. Thanks,

Ok, I'm good with this patch for the fix. Can you also provide a follow up
patch(es) to address what you've discussed?

For the fix:
Reviewed-by: Tom Lendacky <thomas.lendacky@xxxxxxx>

Thanks,
Tom

>
>
>
>> Thanks,
>> Tom
>>
>>>
>>> Fixes: 24512afa4336 ("crypto: ccp: Handle the legacy TMR allocation
>>> when SNP is enabled")
>>> Signed-off-by: Alexey Kardashevskiy <aik@xxxxxxx>
>>> ---
>>>   drivers/crypto/ccp/sev-dev.c | 8 ++++----
>>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
>>> index 3451bada884e..16a11d5efe46 100644
>>> --- a/drivers/crypto/ccp/sev-dev.c
>>> +++ b/drivers/crypto/ccp/sev-dev.c
>>> @@ -434,7 +434,7 @@ static int rmp_mark_pages_firmware(unsigned long
>>> paddr, unsigned int npages, boo
>>>       return rc;
>>>   }
>>>   -static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int
>>> order)
>>> +static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int
>>> order, bool locked)
>>>   {
>>>       unsigned long npages = 1ul << order, paddr;
>>>       struct sev_device *sev;
>>> @@ -453,7 +453,7 @@ static struct page
>>> *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order)
>>>           return page;
>>>         paddr = __pa((unsigned long)page_address(page));
>>> -    if (rmp_mark_pages_firmware(paddr, npages, false))
>>> +    if (rmp_mark_pages_firmware(paddr, npages, locked))
>>>           return NULL;
>>>         return page;
>>> @@ -463,7 +463,7 @@ void *snp_alloc_firmware_page(gfp_t gfp_mask)
>>>   {
>>>       struct page *page;
>>>   -    page = __snp_alloc_firmware_pages(gfp_mask, 0);
>>> +    page = __snp_alloc_firmware_pages(gfp_mask, 0, false);
>>>         return page ? page_address(page) : NULL;
>>>   }
>>> @@ -498,7 +498,7 @@ static void *sev_fw_alloc(unsigned long len)
>>>   {
>>>       struct page *page;
>>>   -    page = __snp_alloc_firmware_pages(GFP_KERNEL, get_order(len));
>>> +    page = __snp_alloc_firmware_pages(GFP_KERNEL, get_order(len), true);
>>>       if (!page)
>>>           return NULL;
>>>  
>