Re: [PATCH v2 33/39] x86/cpufeatures: Limit shadow stack to Intel CPUs

From: John Allen
Date: Thu Nov 03 2022 - 13:39:48 EST


On 10/4/22 6:24 PM, Edgecombe, Rick P wrote:
> On Tue, 2022-10-04 at 14:17 -0700, H. Peter Anvin wrote:
>> On October 4, 2022 1:50:20 PM PDT, Nathan Chancellor <
>> nathan@xxxxxxxxxx> wrote:
>>> On Tue, Oct 04, 2022 at 08:34:54PM +0000, Edgecombe, Rick P wrote:
>>>> On Tue, 2022-10-04 at 14:43 -0500, John Allen wrote:
>>>>> On 10/4/22 10:47 AM, Nathan Chancellor wrote:
>>>>>> Hi Kees,
>>>>>>
>>>>>> On Mon, Oct 03, 2022 at 09:54:26PM -0700, Kees Cook wrote:
>>>>>>> On Mon, Oct 03, 2022 at 05:09:04PM -0700, Dave Hansen
>>>>>>> wrote:
>>>>>>>> On 10/3/22 16:57, Kees Cook wrote:
>>>>>>>>> On Thu, Sep 29, 2022 at 03:29:30PM -0700, Rick
>>>>>>>>> Edgecombe
>>>>>>>>> wrote:
>>>>>>>>>> Shadow stack is supported on newer AMD processors,
>>>>>>>>>> but the
>>>>>>>>>> kernel
>>>>>>>>>> implementation has not been tested on them. Prevent
>>>>>>>>>> basic
>>>>>>>>>> issues from
>>>>>>>>>> showing up for normal users by disabling shadow stack
>>>>>>>>>> on
>>>>>>>>>> all CPUs except
>>>>>>>>>> Intel until it has been tested. At which point the
>>>>>>>>>> limitation should be
>>>>>>>>>> removed.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Rick Edgecombe <
>>>>>>>>>> rick.p.edgecombe@xxxxxxxxx>
>>>>>>>>>
>>>>>>>>> So running the selftests on an AMD system is sufficient
>>>>>>>>> to
>>>>>>>>> drop this
>>>>>>>>> patch?
>>>>>>>>
>>>>>>>> Yes, that's enough.
>>>>>>>>
>>>>>>>> I _thought_ the AMD folks provided some tested-by's at
>>>>>>>> some
>>>>>>>> point in the
>>>>>>>> past. But, maybe I'm confusing this for one of the other
>>>>>>>> shared
>>>>>>>> features. Either way, I'm sure no tested-by's were
>>>>>>>> dropped on
>>>>>>>> purpose.
>>>>>>>>
>>>>>>>> I'm sure Rick is eager to trim down his series and this
>>>>>>>> would
>>>>>>>> be a great
>>>>>>>> patch to drop. Does anyone want to make that easy for
>>>>>>>> Rick?
>>>>>>>>
>>>>>>>> <hint> <hint>
>>>>>>>
>>>>>>> Hey Gustavo, Nathan, or Nick! I know y'all have some fancy
>>>>>>> AMD
>>>>>>> testing
>>>>>>> rigs. Got a moment to spin up this series and run the
>>>>>>> selftests?
>>>>>>> :)
>>>>>>
>>>>>> I do have access to a system with an EPYC 7513, which does
>>>>>> have
>>>>>> Shadow
>>>>>> Stack support (I can see 'shstk' in the "Flags" section of
>>>>>> lscpu
>>>>>> with
>>>>>> this series). As far as I understand it, AMD only added
>>>>>> Shadow
>>>>>> Stack
>>>>>> with Zen 3; my regular AMD test system is Zen 2 (probably
>>>>>> should
>>>>>> look at
>>>>>> procurring a Zen 3 or Zen 4 one at some point).
>>>>>>
>>>>>> I applied this series on top of 6.0 and reverted this change
>>>>>> then
>>>>>> booted
>>>>>> it on that system. After building the selftest (which did
>>>>>> require
>>>>>> 'make headers_install' and a small addition to make it build
>>>>>> beyond
>>>>>> that, see below), I ran it and this was the result. I am not
>>>>>> sure
>>>>>> if
>>>>>> that is expected or not but the other results seem promising
>>>>>> for
>>>>>> dropping this patch.
>>>>>>
>>>>>> $ ./test_shadow_stack_64
>>>>>> [INFO] new_ssp = 7f8a36c9fff8, *new_ssp = 7f8a36ca0001
>>>>>> [INFO] changing ssp from 7f8a374a0ff0 to 7f8a36c9fff8
>>>>>> [INFO] ssp is now 7f8a36ca0000
>>>>>> [OK] Shadow stack pivot
>>>>>> [OK] Shadow stack faults
>>>>>> [INFO] Corrupting shadow stack
>>>>>> [INFO] Generated shadow stack violation successfully
>>>>>> [OK] Shadow stack violation test
>>>>>> [INFO] Gup read -> shstk access success
>>>>>> [INFO] Gup write -> shstk access success
>>>>>> [INFO] Violation from normal write
>>>>>> [INFO] Gup read -> write access success
>>>>>> [INFO] Violation from normal write
>>>>>> [INFO] Gup write -> write access success
>>>>>> [INFO] Cow gup write -> write access success
>>>>>> [OK] Shadow gup test
>>>>>> [INFO] Violation from shstk access
>>>>>> [OK] mprotect() test
>>>>>> [OK] Userfaultfd test
>>>>>> [FAIL] Alt shadow stack test
>>>>>
>>>>> The selftest is looking OK on my system (Dell PowerEdge R6515
>>>>> w/ EPYC
>>>>> 7713). I also just pulled a fresh 6.0 kernel and applied the
>>>>> series
>>>>> including the fix Nathan mentions below.
>>>>>
>>>>> $ tools/testing/selftests/x86/test_shadow_stack_64
>>>>> [INFO] new_ssp = 7f30cccc5ff8, *new_ssp = 7f30cccc6001
>>>>> [INFO] changing ssp from 7f30cd4c6ff0 to 7f30cccc5ff8
>>>>> [INFO] ssp is now 7f30cccc6000
>>>>> [OK] Shadow stack pivot
>>>>> [OK] Shadow stack faults
>>>>> [INFO] Corrupting shadow stack
>>>>> [INFO] Generated shadow stack violation successfully
>>>>> [OK] Shadow stack violation test
>>>>> [INFO] Gup read -> shstk access success
>>>>> [INFO] Gup write -> shstk access success
>>>>> [INFO] Violation from normal write
>>>>> [INFO] Gup read -> write access success
>>>>> [INFO] Violation from normal write
>>>>> [INFO] Gup write -> write access success
>>>>> [INFO] Cow gup write -> write access success
>>>>> [OK] Shadow gup test
>>>>> [INFO] Violation from shstk access
>>>>> [OK] mprotect() test
>>>>> [OK] Userfaultfd test
>>>>> [OK] Alt shadow stack test.
>>>>
>>>> Thanks for the testing. Based on the test, I wonder if this could
>>>> be a
>>>> SW bug. Nathan, could I send you a tweaked test with some more
>>>> debug
>>>> information?
>>>
>>> Yes, more than happy to help you look into this further!
>>>
>>>> John, are we sure AMD and Intel systems behave the same with
>>>> respect to
>>>> CPUs not creating Dirty=1,Write=0 PTEs in rare situations? Or any
>>>> other
>>>> CET related differences we should hash out? Otherwise I'll drop
>>>> the
>>>> patch for the next version. (and assuming the issue Nathan hit
>>>> doesn't
>>>> turn up anything HW related).
>>
>> I have to admit to being a bit confused here... in general, we trust
>> CPUID bits unless they are *known* to be wrong, in which case we
>> blacklist them.
>>
>> If AMD advertises the feature but it doesn't work or they didn't
>> validate it, that would be a (serious!) bug on their part that we can
>> address by blacklisting, but they should also fix with a
>> microcode/BIOS patch.
>>
>> What am I missing?
>
> I have not read anything about the AMD implementation except hearing
> that it is supported. But there are some microarchitectual-like aspects
> to this CET Linux implementation, around requiring CPUs to not create
> Dirty=1,Write=0 PTEs in some cases, where they did in the past. In
> another thread Jann asked how the IOMMU works with respect to this edge
> case and I'm currently trying to chase down that answer for even Intel
> HW. So I just wanted to double check that we expect that everything
> should be the same. In either case we still have time to iron things
> out before anything gets upstream.

Hi Rick,

Sorry for the delayed reply. After asking around, I think you can safely
assume that AMD will not create Dirty=1,Write=0 PTEs in rare
circumstances and shadow stack should behave the same as Intel in that
regard.

Thanks,
John