Re: [PATCH 1/1] tpm: disable hwrng for fTPM on some AMD designs

From: Thorsten Leemhuis
Date: Mon Feb 27 2023 - 05:57:24 EST


Hi, this is your Linux kernel regression tracker. Top-posting for once,
to make this easily accessible to everyone.

Jarkko (or James), what is needed to get this regression resolved? More
people showed up that are apparently affected by this. Sure, 6.2 is out,
but it's a regression in 6.1, so it would be good to fix it sooner rather
than later. Ideally this week, if you ask me.

FWIW, latest version of this patch is here, but it didn't get any
replies since it was posted last Tuesday (and the mail quoted below is
just one day younger):

https://lore.kernel.org/all/20230220180729.23862-1-mario.limonciello@xxxxxxx/

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

On 22.02.23 00:10, Limonciello, Mario wrote:
> [Public]
>
>> -----Original Message-----
>> From: Jarkko Sakkinen <jarkko@xxxxxxxxxx>
>> Sent: Tuesday, February 21, 2023 16:53
>> To: Limonciello, Mario <Mario.Limonciello@xxxxxxx>
>> Cc: Thorsten Leemhuis <regressions@xxxxxxxxxxxxx>; James Bottomley
>> <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>; Jason@xxxxxxxxx;
>> linux-integrity@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
>> stable@xxxxxxxxxxxxxxx; Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>;
>> Linux kernel regressions list <regressions@xxxxxxxxxxxxxxx>
>> Subject: Re: [PATCH 1/1] tpm: disable hwrng for fTPM on some AMD designs
>>
>> On Fri, Feb 17, 2023 at 08:25:56PM -0600, Limonciello, Mario wrote:
>>> On 2/17/2023 16:05, Jarkko Sakkinen wrote:
>>>
>>>> Perhaps tpm_amd_* ?
>>>
>>> When Jason first proposed this patch I feel the intent was it could cover
>>> multiple deficiencies.
>>> But as this is the only one for now, sure re-naming it is fine.
>>>
>>>>
>>>> Also, just a question: is there any legit use for fTPM's, which are not
>>>> updated? I.e. why would we want tpm_crb to initialize with a dysfunctional
>>>> firmware?
>>>>
>>>> I.e. the existential question is: is it better to workaround the issue and
>>>> let pass through, or make the user aware that the firmware would really
>>>> need an update.
>>>>
>>>
>>> On 2/17/2023 16:35, Jarkko Sakkinen wrote:
>>>>>
>>>>> Hmm, no reply since Mario posted this.
>>>>>
>>>>> Jarkko, James, what's your stance on this? Does the patch look fine from
>>>>> your point of view? And does the situation justify merging this on the
>>>>> last minute for 6.2? Or should we merge it early for 6.3 and then
>>>>> backport to stable?
>>>>>
>>>>> Ciao, Thorsten
>>>>
>>>> As I stated in earlier response: do we want to forbid tpm_crb in this case
>>>> or do we want to pass-through with a faulty firmware?
>>>>
>>>> Not weighing either choice here; I just don't see any motivating points
>>>> in the commit message to pick either, that's all.
>>>>
>>>> BR, Jarkko
>>>
>>> Even if you're not using RNG functionality you can still do plenty of other
>>> things with the TPM. The RNG functionality is what tripped up this issue
>>> though. All of these issues were only raised because the kernel started
>>> using it by default for RNG and userspace wants random numbers all the time.
>>>
>>> If the firmware was easily updatable from all the OEMs I would lean on
>>> trying to encourage people to update. But alas this has been available for
>>> over a year and a sizable number of OEMs haven't distributed a fix.
>>>
>>> The major issue I see with forbidding tpm_crb is that users may have been
>>> using the fTPM for something and taking it away in an update could lead to a
>>> no-boot scenario if they're (for example) tying a policy to PCR values and
>>> can no longer access those.
>>>
>>> If the consensus were to go that direction instead I would want to see a
>>> module parameter that lets users turn on the fTPM even knowing this problem
>>> exists so they could recover. That all seems pretty expensive to me for
>>> this problem.
>>
>> I agree with the last argument.
>
> FYI, I did send out a v2 and folded in this argument to the commit message
> and adjusted for your feedback. You might not have found it in your inbox
> yet.
>
>>
>> I re-read the commit message and
>> https://www.amd.com/en/support/kb/faq/pa-410.
>>
>> Why is this scoped down to only rng? Should TPM2_CC_GET_RANDOM also be
>> blocked from /dev/tpm0?
>>
>
> The only reason that this commit was created is because the kernel utilized
> the fTPM for hwrng which triggered the problem. If that never happened
> this probably wouldn't have been exposed either.
>
> Yes; I would agree that if someone was to do other fTPM operations over
> an extended period of time it's plausible they can cause the problem too.
>
> But picking and choosing functionality to block seems quite arbitrary to me.
>