Re: [PATCH v10 2/7] powerpc/mce: Fix MCE handling for huge pages

From: Nicholas Piggin
Date: Mon Aug 19 2019 - 23:04:29 EST


Santosh Sivaraj's on August 20, 2019 11:47 am:
> Hi Nick,
>
> Nicholas Piggin <npiggin@xxxxxxxxx> writes:
>
>> Santosh Sivaraj's on August 15, 2019 10:39 am:
>>> From: Balbir Singh <bsingharora@xxxxxxxxx>
>>>
>>> The current code would fail on huge page addresses, since the shift would
>>> be incorrect. Use the correct page shift value returned by
>>> __find_linux_pte() to get the correct physical address. The code is more
>>> generic and can handle both regular and compound pages.
>>>
>>> Fixes: ba41e1e1ccb9 ("powerpc/mce: Hookup derror (load/store) UE errors")
>>> Signed-off-by: Balbir Singh <bsingharora@xxxxxxxxx>
>>> [arbab@xxxxxxxxxxxxx: Fixup pseries_do_memory_failure()]
>>> Signed-off-by: Reza Arbab <arbab@xxxxxxxxxxxxx>
>>> Co-developed-by: Santosh Sivaraj <santosh@xxxxxxxxxx>
>>> Signed-off-by: Santosh Sivaraj <santosh@xxxxxxxxxx>
>>> Tested-by: Mahesh Salgaonkar <mahesh@xxxxxxxxxxxxxxxxxx>
>>> Cc: stable@xxxxxxxxxxxxxxx # v4.15+
>>> ---
>>> arch/powerpc/include/asm/mce.h | 2 +-
>>> arch/powerpc/kernel/mce_power.c | 55 ++++++++++++++--------------
>>> arch/powerpc/platforms/pseries/ras.c | 9 ++---
>>> 3 files changed, 32 insertions(+), 34 deletions(-)
>>>
>>> diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
>>> index a4c6a74ad2fb..f3a6036b6bc0 100644
>>> --- a/arch/powerpc/include/asm/mce.h
>>> +++ b/arch/powerpc/include/asm/mce.h
>>> @@ -209,7 +209,7 @@ extern void release_mce_event(void);
>>> extern void machine_check_queue_event(void);
>>> extern void machine_check_print_event_info(struct machine_check_event *evt,
>>> bool user_mode, bool in_guest);
>>> -unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr);
>>> +unsigned long addr_to_phys(struct pt_regs *regs, unsigned long addr);
>>> #ifdef CONFIG_PPC_BOOK3S_64
>>> void flush_and_reload_slb(void);
>>> #endif /* CONFIG_PPC_BOOK3S_64 */
>>> diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
>>> index a814d2dfb5b0..e74816f045f8 100644
>>> --- a/arch/powerpc/kernel/mce_power.c
>>> +++ b/arch/powerpc/kernel/mce_power.c
>>> @@ -20,13 +20,14 @@
>>> #include <asm/exception-64s.h>
>>>
>>> /*
>>> - * Convert an address related to an mm to a PFN. NOTE: we are in real
>>> - * mode, we could potentially race with page table updates.
>>> + * Convert an address related to an mm to a physical address.
>>> + * NOTE: we are in real mode, we could potentially race with page table updates.
>>> */
>>> -unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr)
>>> +unsigned long addr_to_phys(struct pt_regs *regs, unsigned long addr)
>>> {
>>> - pte_t *ptep;
>>> - unsigned long flags;
>>> + pte_t *ptep, pte;
>>> + unsigned int shift;
>>> + unsigned long flags, phys_addr;
>>> struct mm_struct *mm;
>>>
>>> if (user_mode(regs))
>>> @@ -35,14 +36,21 @@ unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr)
>>> mm = &init_mm;
>>>
>>> local_irq_save(flags);
>>> - if (mm == current->mm)
>>> - ptep = find_current_mm_pte(mm->pgd, addr, NULL, NULL);
>>> - else
>>> - ptep = find_init_mm_pte(addr, NULL);
>>> + ptep = __find_linux_pte(mm->pgd, addr, NULL, &shift);
>>> local_irq_restore(flags);
>>> +
>>> if (!ptep || pte_special(*ptep))
>>> return ULONG_MAX;
>>> - return pte_pfn(*ptep);
>>> +
>>> + pte = *ptep;
>>> + if (shift > PAGE_SHIFT) {
>>> + unsigned long rpnmask = (1ul << shift) - PAGE_SIZE;
>>> +
>>> + pte = __pte(pte_val(pte) | (addr & rpnmask));
>>> + }
>>> + phys_addr = pte_pfn(pte) << PAGE_SHIFT;
>>> +
>>> + return phys_addr;
>>> }
>>
>> This should remain addr_to_pfn I think. None of the callers care what
>> size page the EA was mapped with. 'pfn' is referring to the Linux pfn,
>> which is the small page number.
>>
>> 	if (shift > PAGE_SHIFT)
>> 		return pte_pfn(*ptep) |
>> 			((addr & ((1UL << shift) - 1)) >> PAGE_SHIFT);
>> 	else
>> 		return pte_pfn(*ptep);
>>
>> Something roughly like that, then you don't have to change any callers
>> or am I missing something?
>
> Here[1] you asked to return the real address rather than the pfn, which
> is what all the callers care about, so I made the changes accordingly.
>
> [1] https://www.spinics.net/lists/kernel/msg3187658.html

Ah, I did suggest it, but I meant the _exact_ physical address :) The
one matching the effective address you gave it.

As it is now, the physical address is truncated at the small page size,
so if you do that you might as well just keep it as a pfn and avoid
changing the callers.

I would also prefer getting the pfn as above rather than constructing a
new pte, which is a neat hack but is not a normal pattern.

Thanks,
Nick