Re: [PATCH mmotm] memcg: further prevent OOM with too many dirty pages

From: Hugh Dickins
Date: Tue Jul 17 2012 - 00:53:32 EST


On Mon, 16 Jul 2012, Michal Hocko wrote:
> On Mon 16-07-12 01:35:34, Hugh Dickins wrote:
> > But even so, the test still OOMs sometimes: when originally testing
> > on 3.5-rc6, it OOMed about one time in five or ten; when testing
> > just now on 3.5-rc6-mm1, it OOMed on the first iteration.
> >
> > This residual problem comes from an accumulation of pages under
> > ordinary writeback, not marked PageReclaim, so rightly not causing
> > the memcg check to wait on their writeback: these too can prevent
> > shrink_page_list() from freeing any pages, so many times that memcg
> > reclaim fails and OOMs.
>
> I guess you managed to trigger this with 20M limit, right?

That's right.

> I have tested
> with different group sizes but the writeback didn't trigger for most of
> them and all the dirty data were flushed from the reclaim.

I didn't examine writeback stats to confirm, but I guess that just
occasionally it managed to come in and do enough work to confound us.

> Have you used any special setting for the dirty ratio?

No, I wasn't imaginative enough to try that.

> Or was it with xfs (IIUC that one
> does ignore writeback from the direct reclaim completely).

No, just ext4 at that point.

I have since tested the final patch with ext4, ext3 (by ext3 driver
and by ext4 driver), ext2 (by ext2 driver and by ext4 driver), xfs,
btrfs, vfat, tmpfs (with swap on the USB stick) and block device:
about an hour on each, no surprises, all okay.

But I didn't experiment beyond the 20M memcg.
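
For anyone who wants the residual problem quoted at the top in miniature,
here is a toy userspace model of it. It is a sketch only, not kernel code:
the Page class, the max_passes retry count and the "OOM" return value are
all made up for illustration, and only the names shrink_page_list and
PageReclaim are borrowed from the discussion above. Pages under writeback
that are not tagged for reclaim are neither waited on nor freed, so a list
made up entirely of such pages frees nothing on every pass.

```python
from dataclasses import dataclass


@dataclass
class Page:
    # Hypothetical stand-ins for the kernel page flags, not the real thing.
    writeback: bool = False  # page is currently under writeback
    reclaim: bool = False    # page is tagged for reclaim (models PageReclaim)


def shrink_page_list(pages):
    """Toy model of one reclaim pass: return (freed_count, kept_pages)."""
    freed = 0
    kept = []
    for page in pages:
        if page.writeback:
            if page.reclaim:
                # Models the memcg check: wait for writeback, then free.
                page.writeback = False
                page.reclaim = False
                freed += 1
            else:
                # Ordinary writeback: not waited on, so the page stays put.
                kept.append(page)
            continue
        # Clean page, nothing in the way: free it.
        freed += 1
    return freed, kept


def memcg_reclaim(pages, max_passes=5):
    """Retry a few passes (max_passes is an assumption); give up if none free."""
    for _ in range(max_passes):
        freed, pages = shrink_page_list(pages)
        if freed:
            return "reclaimed"
    return "OOM"
```

With eight pages all under ordinary writeback, memcg_reclaim() gives up
after every pass frees nothing; tag the same pages for reclaim and the
first pass succeeds. In this toy writeback never completes on its own,
which is harsher than reality, where it occasionally does finish in time.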

Hugh