Re: [PATCH v2 1/2] mm: Uncharge poisoned pages

From: Andi Kleen
Date: Thu Apr 27 2017 - 16:51:34 EST


Michal Hocko <mhocko@xxxxxxxxxx> writes:

> On Tue 25-04-17 16:27:51, Laurent Dufour wrote:
>> When pages are poisoned, they should be uncharged from the root memory
>> cgroup.
>>
>> This is required to avoid a BUG raised when the page is onlined back:
>> BUG: Bad page state in process mem-on-off-test pfn:7ae3b
>> page:f000000001eb8ec0 count:0 mapcount:0 mapping: (null)
>> index:0x1
>> flags: 0x3ffff800200000(hwpoison)
>
> My knowledge of memory poisoning is very rudimentary but aren't those
> pages supposed to leak and never come back? In other words isn't the
> hotplug code broken because it should leave them alone?

Yes, that would be the right interpretation. If the page was really
offlined due to a hardware error, the memory is poisoned and any access
could cause a machine check.

hwpoison has its own "unpoison" option (only used for debugging), which
I think handles this.
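
For reference, a minimal sketch of driving that debugging interface from
user space. It assumes the hwpoison-inject module is loaded, debugfs is
mounted at /sys/kernel/debug, and the caller is root. The helper names
(format_pfn, unpoison_pfn) are made up for illustration. Unpoisoning only
applies to software-injected poison, not to real hardware errors.

```python
# Assumed debugfs path exposed by the hwpoison-inject module.
UNPOISON_PATH = "/sys/kernel/debug/hwpoison/unpoison-pfn"


def format_pfn(pfn):
    """Render a page frame number as the hex string written to debugfs."""
    if pfn < 0:
        raise ValueError("pfn must be non-negative")
    return f"{pfn:#x}\n"


def unpoison_pfn(pfn, path=UNPOISON_PATH):
    """Ask the kernel to software-unpoison the page at the given PFN.

    The path argument is overridable so the helper can be exercised
    against an ordinary file without touching debugfs.
    """
    with open(path, "w") as f:
        f.write(format_pfn(pfn))
```

Calling unpoison_pfn(0x7ae3b) as root would target the pfn from the BUG
report quoted above, on a test machine where the poison was injected by
software.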

-Andi