Re: [PATCH 0/4 v6] Avoid softlockups in console_unlock()

From: Jan Kara
Date: Thu Sep 05 2013 - 11:46:20 EST


Sorry for the delayed reply, I was on vacation.

On Fri 23-08-13 12:58:22, Andrew Morton wrote:
> On Fri, 23 Aug 2013 21:48:36 +0200 (CEST) Jiri Kosina <jkosina@xxxxxxx> wrote:
>
> > > > We have customers (quite a few of them actually) which have machines with
> > > > lots of SCSI disks attached (due to multipath etc.) and during boot when
> > > > these disks are discovered and partitions set up quite some printing
> > > > happens - multiplied by the number of devices (1000+) it is too much for a
> > > > serial console to handle quickly enough. So these machines aren't able to
> > > > boot with serial console enabled.
> > >
> > > It sounds like rather a corner case, not worth mucking up the critical
> > > core logging code.
> >
> > Andrew, I have to admit I don't understand this argument at all.
>
> Of course you do. print should be simple, robust and have minimum
> dependency on other kernel parts.
>
> I suppose that if you make the proposed
> /proc/sys/kernel/max_printk_chars settable from the boot command line
> and default to zero, any risks are minimized.
That's easy enough to do, so if it makes you happy I'll go for that.
During my vacation I was also thinking about how I could address some of
your concerns. The only idea I found plausible was a scheme where a CPU
that wants to stop printing would raise a flag but keep printing, releasing
and reacquiring console_sem from time to time. In
console_trylock_for_printk() we would then block waiting for console_sem
if we see the flag raised.

This way we would be guaranteed that someone has really taken over printing
before we leave console_unlock(). We would still need to use irq_work so
that there is someone to take over printing in case a printk storm has
filled the dmesg buffer and we are now slowly draining it to the console.
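
To make the handoff concrete, here is a minimal single-threaded model of it
in userspace C. It is only a sketch under assumptions, not kernel code:
struct console_model, the model_*() helpers, run_handoff(), the per-CPU
budget and the single waiter slot are hypothetical names made up for
illustration, and the model assumes that releasing the semaphore hands it
straight to a blocked waiter.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_OWNER (-1)

/* Hypothetical model state; none of these names exist in the kernel. */
struct console_model {
	int  sem_owner;      /* CPU holding the modelled console_sem */
	int  sem_waiter;     /* CPU blocked on console_sem, if any */
	bool handoff_wanted; /* raised by a printer that wants to stop */
	int  next_msg;       /* next message to push to the console */
	int  log_end;        /* number of messages in the buffer */
	int  printed_by[2];  /* messages emitted by each CPU */
};

/* Release: a blocked waiter (assumption) gets the semaphore directly.
 * The new owner clearing the flag is folded into this step. */
static void model_up(struct console_model *c)
{
	if (c->sem_waiter != NO_OWNER)
		c->handoff_wanted = false;
	c->sem_owner = c->sem_waiter;
	c->sem_waiter = NO_OWNER;
}

static bool model_trylock(struct console_model *c, int cpu)
{
	if (c->sem_owner != NO_OWNER)
		return false;
	c->sem_owner = cpu;
	return true;
}

/* printk side: with the flag raised, wait instead of giving up. */
static bool model_trylock_for_printk(struct console_model *c, int cpu)
{
	if (model_trylock(c, cpu))
		return true;
	if (c->handoff_wanted && c->sem_waiter == NO_OWNER)
		c->sem_waiter = cpu; /* "blocks" until the next release */
	return false;
}

/* One iteration of the printing loop; false once cpu stops printing. */
static bool model_print_step(struct console_model *c, int cpu, int budget)
{
	assert(c->sem_owner == cpu);
	if (c->next_msg == c->log_end) { /* buffer drained, plain unlock */
		c->handoff_wanted = false;
		model_up(c);
		return false;
	}
	c->next_msg++;
	c->printed_by[cpu]++;
	if (c->printed_by[cpu] >= budget) {
		c->handoff_wanted = true;     /* we would like to stop */
		model_up(c);                  /* release ... */
		if (!model_trylock(c, cpu))   /* ... and try to reacquire */
			return false;         /* a waiter really took over */
	}
	return true;
}

/* CPU 0 starts printing n_msgs; CPU 1 calls printk once `arrival`
 * messages have been printed. */
static struct console_model run_handoff(int n_msgs, int budget, int arrival)
{
	struct console_model c = {
		.sem_owner = 0, .sem_waiter = NO_OWNER, .log_end = n_msgs,
	};
	bool cpu1_arrived = false;

	while (c.sem_owner != NO_OWNER) {
		if (!cpu1_arrived && c.next_msg >= arrival) {
			cpu1_arrived = true;
			(void)model_trylock_for_printk(&c, 1);
		}
		model_print_step(&c, c.sem_owner, budget);
	}
	return c;
}
```

In this model run_handoff(10, 3, 5) ends with CPU 0 having printed 6
messages and CPU 1 the remaining 4: CPU 0 leaves only after its reacquire
attempt fails, i.e. after CPU 1 already holds the semaphore. If CPU 1
arrives before the flag is raised, as in run_handoff(10, 3, 0), its trylock
simply fails and CPU 0 ends up printing everything, which is why something
like irq_work would still be needed to provide a taker.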

So all in all this would be a bit more complex than my current solution
(an additional flag and some logic around it). The advantage is that we
would rely on irq_work only to achieve reasonable irq latency; it would no
longer be necessary for getting printk output to the console. If this
addresses your concerns better, I could try implementing it. Thoughts?

> Bailing out if oops_in_progress was a good thing also.
Yes, that's what already happens now.

Honza
--
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR