Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

From: Petr Mladek
Date: Thu Jan 18 2018 - 06:51:44 EST


On Wed 2018-01-17 12:05:51, Tejun Heo wrote:
> Hello, Steven.
>
> On Wed, Jan 17, 2018 at 12:12:51PM -0500, Steven Rostedt wrote:
> > From what I gathered, you said an OOM would trigger, and then the
> > network console would not be able to allocate memory and it would
> > trigger a printk too, and cause an infinite amount of printks.
>
> Yeah, it falls into back-and-forth loop between the OOM code and
> netconsole path.
>
> > This could very well be a great place to force offloading. If a printk
> > is called from within a printk, at the same context (normal, softirq,
> > irq or NMI), then we should trigger the offloading.
>
> I was thinking more of a timeout based approach (ie. if stuck for
> longer than X or X messages, offload), but if local feedback loop is
> the only thing we're missing after your improvements, detecting that
> specific condition definitely works and is likely a better approach in
> terms of message delivery guarantee.

I think that we could combine both approaches. Recursion can be
detected easily and immediately, so there is no reason to wait.

Once we have the code for offloading on recursion, we could call
kick_offload_thread() for other reasons as well, e.g. when
console_unlock() takes too long.
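
To make the two triggers above more concrete, here is a rough userspace
sketch, not a patch. Everything except the name kick_offload_thread() is
an assumption made up for illustration: the context enum, the
printk_nesting[] counters, printk_enter()/printk_exit() and
console_flush_too_long(). Even kick_offload_thread() is only a stand-in
here that records the request. Real code would need per-CPU counters and
would derive the context from preempt_count().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical context ids; a real kernel would derive them from preempt_count(). */
enum printk_ctx { CTX_NORMAL, CTX_SOFTIRQ, CTX_IRQ, CTX_NMI, CTX_MAX };

/* Would be per-CPU in the kernel; a plain array is enough for this sketch. */
static int printk_nesting[CTX_MAX];
static bool offload_requested;

/* Stand-in for kick_offload_thread(); it only records that offload was requested. */
static void kick_offload_thread(void)
{
	offload_requested = true;
}

/*
 * Trigger 1: printk called from within printk in the same context.
 * Returns true when the call is nested; offload is requested right away.
 */
static bool printk_enter(enum printk_ctx ctx)
{
	bool recursive = printk_nesting[ctx]++ > 0;

	if (recursive)
		kick_offload_thread();
	return recursive;
}

static void printk_exit(enum printk_ctx ctx)
{
	assert(printk_nesting[ctx] > 0);
	printk_nesting[ctx]--;
}

/*
 * Trigger 2: console_unlock() has been flushing for longer than limit.
 * start and now are timestamps in the same (arbitrary) unit.
 */
static bool console_flush_too_long(unsigned long start, unsigned long now,
				   unsigned long limit)
{
	if (now - start <= limit)
		return false;
	kick_offload_thread();
	return true;
}
```

The recursion check fires on the first nested call, so nothing has to
wait for a timeout. The time limit only covers the cases where
console_unlock() is slow without any recursion. A printk in IRQ context
that interrupts a printk in normal context is not counted as recursion
here because the counters are kept per context.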

I think that Sergey is already playing with this. It seems
that we all could be happy in the end.


Best Regards,
Petr

PS: I am sorry for my answer yesterday. Tejun's mail did not mention
any details about the problem and I had evidently forgotten them. I
associate OOM and printk issues with Tetsuo, so I mixed things up.
Believe me, it is a big relief to realize that we are not in that
cycle again.