Re: [RFC][PATCH v4 2/2] printk: Skip messages on oops

From: Sergey Senozhatsky
Date: Thu Mar 17 2016 - 06:55:34 EST


Hello Jan,

On (03/14/16 23:13), Sergey Senozhatsky wrote:
>
> From: Jan Kara <jack@xxxxxxx>
>
> When there are too many messages in the kernel printk buffer it can take
> very long to print them to console (especially when using slow serial
> console). This is undesirable during oops so when we encounter oops and
> there are more than 100 messages to print, print just the newest 100
> messages and then the oops message.

I think this patch will introduce a regression, so I'd probably prefer
not to include it now in the series.

the pattern "print something important, then panic()" is quite common.
given that other CPUs can printk() a lot before the panic CPU sends out
the stop IPI, the newest 100 messages may all be theirs, and we can lose
the "print something important" part.

...
arch/metag/kernel/cachepart.c: pr_emerg("Potential cache aliasing detected in %s on Thread %d\n",
arch/metag/kernel/cachepart.c- cache_type ? "DCACHE" : "ICACHE", thread_id);
arch/metag/kernel/cachepart.c- pr_warn("Total %s size: %u bytes\n",
arch/metag/kernel/cachepart.c- cache_type ? "DCACHE" : "ICACHE",
arch/metag/kernel/cachepart.c- cache_type ? get_dcache_size()
arch/metag/kernel/cachepart.c- : get_icache_size());
arch/metag/kernel/cachepart.c- pr_warn("Thread %s size: %d bytes\n",
arch/metag/kernel/cachepart.c- cache_type ? "CACHE" : "ICACHE",
arch/metag/kernel/cachepart.c- thread_cache_size);
arch/metag/kernel/cachepart.c- pr_warn("Page Size: %lu bytes\n", PAGE_SIZE);
arch/metag/kernel/cachepart.c- panic("Potential cache aliasing detected");
...
arch/s390/kernel/jump_label.c: pr_emerg("Jump label code mismatch at %pS [%p]\n", ipc, ipc);
arch/s390/kernel/jump_label.c: pr_emerg("Found: %6ph\n", ipc);
arch/s390/kernel/jump_label.c: pr_emerg("Expected: %6ph\n", ipe);
arch/s390/kernel/jump_label.c: pr_emerg("New: %6ph\n", ipn);
arch/s390/kernel/jump_label.c- panic("Corrupted kernel text");
...
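
To make the concern concrete, here is a small userspace sketch of the
"keep only the newest 100" policy from the patch description. It is not
the patch code and not the printk implementation: struct log_sim, sim_log(),
sim_skip_on_oops(), sim_will_print() and LOG_SIZE are hypothetical names
made up for illustration, and the message texts are shortened stand-ins.

```c
#include <stdio.h>
#include <string.h>

#define LOG_SIZE   1024	/* hypothetical log capacity, in messages */
#define KEEP_LIMIT 100	/* "print just the newest 100 messages" */

/* Toy model of a log: stored messages plus a console cursor. */
struct log_sim {
	const char *msg[LOG_SIZE];
	unsigned int next_seq;		/* seq of the next message to store */
	unsigned int console_seq;	/* seq of the next message to print */
};

/* Append one message; silently drop it if the toy log is full. */
static void sim_log(struct log_sim *l, const char *text)
{
	if (l->next_seq < LOG_SIZE)
		l->msg[l->next_seq++] = text;
}

/*
 * Model of the skip policy: if more than KEEP_LIMIT messages are
 * pending, move the console cursor so only the newest KEEP_LIMIT
 * remain. Returns the number of messages skipped.
 */
static unsigned int sim_skip_on_oops(struct log_sim *l)
{
	unsigned int pending = l->next_seq - l->console_seq;
	unsigned int skipped = 0;

	if (pending > KEEP_LIMIT) {
		skipped = pending - KEEP_LIMIT;
		l->console_seq += skipped;
	}
	return skipped;
}

/* Return 1 if a message with this text is still pending for the console. */
static int sim_will_print(const struct log_sim *l, const char *text)
{
	unsigned int i;

	for (i = l->console_seq; i < l->next_seq; i++)
		if (strcmp(l->msg[i], text) == 0)
			return 1;
	return 0;
}

/*
 * Scenario from this mail: an important message, then a burst from
 * other CPUs, then the panic message. Returns 1 if the important
 * message survives the skip, 0 otherwise.
 */
static int sim_demo(void)
{
	static struct log_sim l;
	unsigned int skipped;
	int i;

	memset(&l, 0, sizeof(l));
	sim_log(&l, "Jump label code mismatch");
	for (i = 0; i < 150; i++)
		sim_log(&l, "noise from another CPU");
	sim_log(&l, "Kernel panic - not syncing");

	skipped = sim_skip_on_oops(&l);
	printf("skipped %u messages\n", skipped);
	return sim_will_print(&l, "Jump label code mismatch");
}
```

With 152 pending messages the model skips the oldest 52, so the
"Jump label code mismatch" line is dropped while the panic line itself
still reaches the console.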



another example is the hardlockup detector with
sysctl_hardlockup_all_cpu_backtrace set.

static void watchdog_overflow_callback(...)
{
	...
	if (is_hardlockup()) {
		...
		if (sysctl_hardlockup_all_cpu_backtrace &&
				!test_and_set_bit(0, &hardlockup_allcpu_dumped))
			trigger_allbutself_cpu_backtrace();

		nmi_panic(regs, msg);
		...
	}
	...
}

trigger_allbutself_cpu_backtrace() can produce much more than 100 lines
of output, and it may or may not be implemented via NMI. see, for
example, arch/sparc/kernel/process_64.c

thus, I think we'd better avoid skipping any messages when in panic().

-ss