Re: [PATCH v1] riscv: support arch_has_hw_pte_young()

From: Jessica Clarke
Date: Mon Jan 30 2023 - 12:27:26 EST


On 30 Jan 2023, at 10:49, Andrew Jones <ajones@xxxxxxxxxxxxxxxx> wrote:
>
> On Mon, Jan 30, 2023 at 03:55:55PM +0530, Anup Patel wrote:
>> On Sun, Jan 29, 2023 at 12:21 PM Jinyu Tang <tjytimi@xxxxxxx> wrote:
>>>
>>> The arch_has_hw_pte_young() is false for riscv by default. If it's
>>> false, page table walk is almost skipped for MGLRU reclaim. And it
>>> will also cause useless step in __wp_page_copy_user().
>>>
>>> RISC-V Privileged Book says that riscv have two schemes to manage A
>>> and D bit.
>>>
>>> So add a config for selecting, the default is true. For simple
>>> implementation riscv CPU which just generate page fault, unselect it.
>>
>> I totally disagree with this approach.
>>
>> Almost all existing RISC-V platforms don't have HW support
>> PTE.A and PTE.D updates.
>>
>> We want the same kernel image to run HW with/without PTE.A
>> and PTE.D updates so kconfig based approach is not going to
>> fly.
>>
>>>
>>> Signed-off-by: Jinyu Tang <tjytimi@xxxxxxx>
>>> ---
>>> arch/riscv/Kconfig | 10 ++++++++++
>>> arch/riscv/include/asm/pgtable.h | 7 +++++++
>>> 2 files changed, 17 insertions(+)
>>>
>>> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
>>> index e2b656043abf..17c82885549c 100644
>>> --- a/arch/riscv/Kconfig
>>> +++ b/arch/riscv/Kconfig
>>> @@ -180,6 +180,16 @@ config PAGE_OFFSET
>>> default 0x80000000 if 64BIT && !MMU
>>> default 0xff60000000000000 if 64BIT
>>>
>>> +config ARCH_HAS_HARDWARE_PTE_YOUNG
>>> + bool "Hardware Set PTE Access Bit"
>>> + default y
>>> + help
>>> + Select if hardware set A bit when PTE is accessed. The default is
>>> + 'Y', because most RISC-V CPU hardware can manage A and D bit.
>>> + But RISC-V may have simple implementation that do not support
>>> + hardware set A bit but only generate page fault, for that case just
>>> + unselect it.
>>> +
>>> config KASAN_SHADOW_OFFSET
>>> hex
>>> depends on KASAN_GENERIC
>>> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
>>> index 4eba9a98d0e3..1db54ab4e1ba 100644
>>> --- a/arch/riscv/include/asm/pgtable.h
>>> +++ b/arch/riscv/include/asm/pgtable.h
>>> @@ -532,6 +532,13 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
>>> */
>>> return ptep_test_and_clear_young(vma, address, ptep);
>>> }
>>> +#ifdef CONFIG_ARCH_HAS_HARDWARE_PTE_YOUNG
>>
>>> +#define arch_has_hw_pte_young arch_has_hw_pte_young
>>> +static inline bool arch_has_hw_pte_young(void)
>>> +{
>>> + return true;
>>
>> Drop the kconfig option ARCH_HAS_HARDWARE_PTE_YOUNG
>> and instead use code patching to return true only when Svadu
>> ISA extension is available in DT ISA string.
>
> Indeed. I should have checked if there was an extension for this
> first. It crossed my mind that we should only be enabling features
> when the extensions are present, but looking at the privileged manual
> isn't sufficient to learn about the Svadu extension. I should have
> checked https://wiki.riscv.org/display/HOME/Specification+Status
>
> Anyway, I retract my r-b and agree with Anup.

Svadu is a bit of a mess. For years it’s been legal to implement
hardware A/D tracking, and such implementations exist (it’s what QEMU
has done for many years, and I know of an FPGA-based implementation
that does it too). Yet RVA20S64 outlaws that by requiring what it calls
Ssptead, and Svadu was then introduced to re-allow that behaviour,
gated behind a CSR bit.
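
For illustration, the runtime approach Anup suggests (answer
arch_has_hw_pte_young() from the ISA string rather than from Kconfig)
could be sketched roughly as below. This is a plain C sketch, not kernel
code: isa_string_has_ext(), detect_hw_pte_young() and the cached flag are
hypothetical names, the real kernel would use its own ISA extension
detection and code patching, and the parser assumes multi-letter
extensions are '_'-separated with no version suffix. It only checks
whether Svadu is advertised; whether the CSR bit has actually been
enabled is a separate question.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: case-insensitive lookup of a multi-letter
 * extension in an ISA string such as "rv64imafdc_svadu_sstc".
 * Assumes extensions are separated by '_' and carry no version suffix.
 */
static bool isa_string_has_ext(const char *isa, const char *ext)
{
	size_t ext_len = strlen(ext);
	const char *p = strchr(isa, '_'); /* skip the single-letter base part */

	while (p) {
		const char *tok = p + 1;
		const char *end = strchr(tok, '_');
		size_t len = end ? (size_t)(end - tok) : strlen(tok);

		if (len == ext_len) {
			size_t i;

			for (i = 0; i < len; i++) {
				if (tolower((unsigned char)tok[i]) !=
				    tolower((unsigned char)ext[i]))
					break;
			}
			if (i == len)
				return true; /* whole token matched */
		}
		p = end; /* NULL after the last token, ending the loop */
	}
	return false;
}

/* Hypothetical cached result, filled in once at "boot". */
static bool hw_pte_young_cached;

static void detect_hw_pte_young(const char *isa)
{
	hw_pte_young_cached = isa_string_has_ext(isa, "svadu");
}

/* One image, decided at runtime instead of by a Kconfig option. */
static bool arch_has_hw_pte_young(void)
{
	return hw_pte_young_cached;
}
```

With this, the same binary reports true on a platform whose ISA string
contains svadu and false everywhere else, which is the property the
Kconfig option cannot provide.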

Jess