Re: [mm-unstable PATCH v4 5/9] mm, hwpoison: make unpoison aware of raw error info in hwpoisoned hugepage

From: Miaohe Lin
Date: Wed Jul 06 2022 - 23:08:15 EST


On 2022/7/7 9:35, HORIGUCHI NAOYA(堀口 直也) wrote:
> On Wed, Jul 06, 2022 at 11:06:28PM +0000, HORIGUCHI NAOYA(堀口 直也) wrote:
>> On Wed, Jul 06, 2022 at 10:58:53AM +0800, Miaohe Lin wrote:
>>> On 2022/7/4 9:33, Naoya Horiguchi wrote:
>>>> From: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
>>>>
>>>> Raw error info list needs to be removed when hwpoisoned hugetlb is
>>>> unpoisoned. And unpoison handler needs to know how many errors there
>>>> are in the target hugepage. So add them.
>>>>
>>>> Signed-off-by: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
>>>> ---
>>>> @@ -2287,6 +2301,7 @@ int unpoison_memory(unsigned long pfn)
>>>
>>> Is it safe to unpoison a hugepage when HPageRawHwpUnreliable is set? I'm
>>> concerned because some raw error info is missing in that case.
>>
>> Ah, right. We need to prevent it. I'll fix it by inserting the check.
>>
>> static inline long free_raw_hwp_pages(struct page *hpage, bool move_flag)
>> {
>> 	struct llist_head *head;
>> 	struct llist_node *t, *tnode;
>> 	long count = 0;
>>
>> +	if (!HPageRawHwpUnreliable(hpage))
>> +		return 0;

IIUC, even if we return 0 here, the caller will still call TestClearPageHWPoison
(please see the code diff below) and succeed in unpoisoning the page. Or am I
missing something?

@@ -2334,6 +2349,8 @@ int unpoison_memory(unsigned long pfn)

 	ret = get_hwpoison_page(p, MF_UNPOISON);
 	if (!ret) {
+		if (PageHuge(p))
+			count = free_raw_hwp_pages(page, false);
 		ret = TestClearPageHWPoison(page) ? 0 : -EBUSY;
 	} else if (ret < 0) {
 		if (ret == -EHWPOISON) {
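
To make the concern concrete, here is a minimal userspace sketch (not kernel
code). struct mock_hpage and all mock_* helpers are hypothetical stand-ins
invented for illustration; they only model the control flow discussed above.
The guarded variant is just one possible way for the caller to act on a zero
return, and it assumes that a reliable hwpoisoned hugepage always carries at
least one raw error entry.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a hugepage; NOT the kernel's struct page. */
struct mock_hpage {
	bool hwpoison;           /* models PageHWPoison */
	bool raw_hwp_unreliable; /* models HPageRawHwpUnreliable */
	long raw_errors;         /* models the length of the raw error list */
};

/* Models free_raw_hwp_pages(): bails out with 0 when the list is unreliable. */
static long mock_free_raw_hwp_pages(struct mock_hpage *hp)
{
	long count;

	assert(hp != NULL);
	if (hp->raw_hwp_unreliable)
		return 0;

	count = hp->raw_errors;
	hp->raw_errors = 0;
	return count;
}

/* Models TestClearPageHWPoison(): clears the flag, returns its old value. */
static bool mock_test_clear_hwpoison(struct mock_hpage *hp)
{
	bool was_set = hp->hwpoison;

	hp->hwpoison = false;
	return was_set;
}

/* Caller as in the diff above: the return value does not gate the clear. */
static int mock_unpoison_unguarded(struct mock_hpage *hp)
{
	(void)mock_free_raw_hwp_pages(hp);
	return mock_test_clear_hwpoison(hp) ? 0 : -EBUSY;
}

/*
 * One possible guard (an assumption, not the posted patch): treat a zero
 * count as "cannot unpoison" and leave the hwpoison flag untouched.
 */
static int mock_unpoison_guarded(struct mock_hpage *hp)
{
	if (mock_free_raw_hwp_pages(hp) == 0)
		return -EBUSY;
	return mock_test_clear_hwpoison(hp) ? 0 : -EBUSY;
}
```

In this model the unguarded caller returns 0 and clears the flag even for an
unreliable hugepage, while the guarded caller returns -EBUSY and leaves the
page poisoned.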

>
> No, I meant "if (HPageRawHwpUnreliable(hpage))", sorry for the noise :(

No, thanks for your hard work!

>
> - Naoya Horiguchi

Thanks.

>