Re: [PATCH] trace: Set oom_score_adj to maximum for ring buffer allocating process

From: Steven Rostedt
Date: Thu May 26 2011 - 19:38:51 EST


[ I added to the Cc people that understand MM more than I do ]

On Thu, 2011-05-26 at 15:28 -0700, Vaibhav Nagarnaik wrote:
> On Thu, May 26, 2011 at 2:00 PM, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> > But the issue is, if the process increasing the size of the ring buffer
> > causes the oom, it will not handle the SIGKILL until after the ring
> > buffer has finished allocating. Now, if it failed to allocate, then we
> > are fine, but if it does not fail, but now we start killing processes,
> > then we may be in trouble.
> >
>
> If I understand correctly, if a fatal signal is pending on a process
> while allocation is called, the allocation fails. Then we handle the
> freeing up memory correctly, though the echo gets killed once we return
> from the allocation process.
>
> > I like the NORETRY better. But then, would this mean that if we have a
> > lot of cached filesystems, we won't be able to extend the ring buffer?
>
> It doesn't seem so. I talked with the mm team and I understand that
> even if NORETRY is set, cached pages will be flushed out and allocation
> will succeed. But it still does not address the situation when the ring
> buffer allocation is going on and another process invokes OOM. If the
> oom_score_adj is not set to maximum, then random processes will still be
> killed before ring buffer allocation fails.
>
> >
> > I'm thinking the oom killer used here got lucky. As it killed this task,
> > we were still out of memory, and the ring buffer failed to get the
> > memory it needed and freed up everything that it previously allocated,
> > and returned. Then the process calling this function would be killed by
> > the OOM. Ideally, the process shouldn't be killed and the ring buffer
> > just returned -ENOMEM to the user.
>
> What do you think of this?
>
> test_set_oom_score_adj(MAXIMUM);
> allocate_ring_buffer(GFP_KERNEL | __GFP_NORETRY);
> test_set_oom_score_adj(original);
>
> This makes sure that the allocation fails much sooner and more
> gracefully. If oom-killer is invoked in any circumstance, then the ring
> buffer allocation process gives up memory and is killed.

I don't know. But as I had never seen this function before, I went and took
a look. This test_set_oom_score_adj() is new, and coincidentally written
by another Google developer ;)

As there's not really a precedent for this, if those that I added to the
Cc give their acks, I'm happy to apply this for the next merge window.

-- Steve
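
For reference, the sequence proposed in the quoted text (raise the OOM
score, allocate with NORETRY semantics, restore the score) can be
sketched as a small userspace model. This is only an illustration, not
the kernel code: every model_* name, the page budget, and the page size
are hypothetical stand-ins, and it assumes test_set_oom_score_adj()
returns the previous value, as the save/restore usage in the proposal
implies.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define OOM_SCORE_ADJ_MAX 1000   /* highest oom_score_adj value */
#define MODEL_PAGE_SIZE   4096   /* assumed page size for the model */

/* Hypothetical stand-in for the current task's oom_score_adj. */
static int model_oom_score_adj = 0;

/* Hypothetical budget of free pages, standing in for available memory. */
static size_t model_free_pages = 8;

/* Models test_set_oom_score_adj(): install new_val, return the old value. */
static int model_set_oom_score_adj(int new_val)
{
	int old = model_oom_score_adj;

	model_oom_score_adj = new_val;
	return old;
}

/* Models a NORETRY page allocation: fail at once when the budget is gone. */
static void *model_alloc_page_noretry(void)
{
	void *page;

	if (model_free_pages == 0)
		return NULL;
	page = malloc(MODEL_PAGE_SIZE);
	if (page)
		model_free_pages--;
	return page;
}

static void model_free_page(void *page)
{
	free(page);
	model_free_pages++;
}

struct model_ring_buffer {
	size_t nr_pages;
	void **pages;
};

/* Allocate nr_pages pages; on failure free everything and return -ENOMEM. */
static int model_ring_buffer_alloc(struct model_ring_buffer *rb, size_t nr_pages)
{
	void **pages;
	size_t i;

	assert(rb != NULL);
	pages = calloc(nr_pages ? nr_pages : 1, sizeof(*pages));
	if (!pages)
		return -ENOMEM;

	for (i = 0; i < nr_pages; i++) {
		pages[i] = model_alloc_page_noretry();
		if (!pages[i]) {
			/* Give back the pages obtained so far. */
			while (i--)
				model_free_page(pages[i]);
			free(pages);
			return -ENOMEM;
		}
	}
	rb->pages = pages;
	rb->nr_pages = nr_pages;
	return 0;
}

/* The proposed sequence: raise the score, allocate, restore the score. */
static int model_resize(struct model_ring_buffer *rb, size_t nr_pages)
{
	int old = model_set_oom_score_adj(OOM_SCORE_ADJ_MAX);
	int ret = model_ring_buffer_alloc(rb, nr_pages);

	model_set_oom_score_adj(old);
	return ret;
}
```

In this model, a request that fits the budget succeeds, while an
oversized request returns -ENOMEM with all partial pages released and
the original score restored. That is the "fail and return -ENOMEM to
the user" outcome discussed above. The model does not capture the
other half of the discussion, namely what happens when a different
process triggers the OOM killer while the allocation is in progress.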

