Re: [PATCH v3 1/2] mm,hwpoison: fix race with compound page allocation

From: Oscar Salvador
Date: Wed May 12 2021 - 04:33:40 EST


On Wed, May 12, 2021 at 12:10:15AM +0900, Naoya Horiguchi wrote:
> @@ -1095,30 +1095,43 @@ static int __get_hwpoison_page(struct page *page)
> {
> struct page *head = compound_head(page);
>
> - if (!PageHuge(head) && PageTransHuge(head)) {
> - /*
> - * Non anonymous thp exists only in allocation/free time. We
> - * can't handle such a case correctly, so let's give it up.
> - * This should be better than triggering BUG_ON when kernel
> - * tries to touch the "partially handled" page.
> - */
> - if (!PageAnon(head)) {
> - pr_err("Memory failure: %#lx: non anonymous thp\n",
> - page_to_pfn(page));
> - return 0;
> + if (PageCompound(page)) {
> + if (PageSlab(page)) {
> + return get_page_unless_zero(page);
> + } else if (PageHuge(head)) {
> + int ret = 0;
> +
> + spin_lock(&hugetlb_lock);
> + if (!PageHuge(head))
> + ret = -EBUSY;
> + else if (HPageFreed(head) || HPageMigratable(head))
> + ret = get_page_unless_zero(head);
> + spin_unlock(&hugetlb_lock);
> + return ret;

Uhm, I am having a hard time with that -EBUSY.
At this stage, we expect __get_hwpoison_page() to either return true or false,
depending on whether it could grab a page's refcount or not. Returning -EBUSY
here seems wrong (plus it is inconsistent with the comment above the function).
It might be useful for the latter patch, I do not know as I yet have to check
that one, but if anything, let us stay consistent here in this one.
So, if hugetlb vanished under us, let us return "we could not grab the
refcount". Does it make sense?

--
Oscar Salvador
SUSE L3