Re: [RFC][PATCH 0/4] memcg: add support for hwpoison testing

From: Wu Fengguang
Date: Tue Sep 01 2009 - 22:47:32 EST

On Wed, Sep 02, 2009 at 12:31:52AM +0800, Balbir Singh wrote:
> * Wu Fengguang <fengguang.wu@xxxxxxxxx> [2009-09-01 16:55:49]:
> > > My point is that memcg can show 'owner' of pages but the page may
> > > be shared with something important task _and_ if a task is migrated,
> > > its pages' memcg information is not updated now. Then, you can kill
> > > a task which is not in memcg.
> >
> > Ah thanks! I'm not aware of that tricky fact, and it does make a
> > very good reason not to use memcg, although I guess locked page won't
> > be migrated.
> >
> I think what Kamezawa-San is pointing to is that the task can migrate,
> leaving behind the page in the memcg and poisoning those pages can
> kill a task outside the memcg.

Yeah, Kame's words reminded me of the memcg goal: it may not have to
track task pages 100% accurately for all the tricky racy windows/cases.
So it could be risky to use memcg for hwpoison testing.

Otherwise I felt like using memcg for hwpoison testing because the
exported things are not that bad, and our hwpoison stress testing
efforts may also be a very good exercise for some aspects of memcg ;)

Back to the page sharing problem. For hwpoison testing, it is
acceptable for the test program and the init process to share _clean_
pages, because hwpoison of such pages can be recovered gracefully by
simply unmapping and dropping the hwpoisoned ones.

But if two tasks share some dirty pages (e.g. shmem), then poisoning
them could kill more tasks than expected. However:
- this is a general problem independent of the use of memcg
- could be avoided by checking page dirtiness and map count
- our test schemes simply won't try to create such insane conditions
(It will include both tasks as the target.)
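
To make the "check page dirtiness and map count" point concrete, here
is a rough userspace sketch of the rule I have in mind. It is only a
model, not kernel code: struct test_page and hwpoison_test_safe() are
names made up for illustration, and it assumes that a dirty page with
a single mapper belongs to the target task.

```c
#include <stdbool.h>

/* Hypothetical userspace model of a page; NOT the kernel's struct page. */
struct test_page {
	bool dirty;     /* page holds data that was never written back */
	int  mapcount;  /* number of tasks currently mapping the page */
};

/*
 * Return true if poisoning this page should not kill more tasks than
 * the test expects.
 *
 * - Clean pages can be unmapped and dropped, so sharing them is fine.
 * - Dirty pages are only safe when they are not shared (assumption:
 *   the single mapper, if any, is the target task).
 */
static bool hwpoison_test_safe(const struct test_page *p)
{
	if (!p->dirty)
		return true;          /* clean: recoverable by unmap + drop */
	return p->mapcount <= 1;      /* dirty: safe only when unshared */
}
```

A test injector following this rule would skip any page for which the
check returns false, e.g. a dirty shmem page mapped by two tasks.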

btw, hwpoison testing also allows "mis-killing" of no-owner pages (ie.
newly freed pages by the target task in some racy windows) which won't
affect the test correctness.
