Re: Linux serial console patch

From: Herbert Poetzl
Date: Sat Sep 11 2004 - 16:02:28 EST


On Mon, Sep 06, 2004 at 09:46:22AM -0700, Randy.Dunlap wrote:
> On Mon, 6 Sep 2004 16:52:30 +0100 Russell King wrote:
>
> | On Mon, Sep 06, 2004 at 04:45:22PM +0100, James Courtier-Dutton wrote:
> | > Russell King wrote:
> | > > On Mon, Sep 06, 2004 at 10:32:27AM +0000, Danny ter Haar wrote:
> | > >
> | > >>James Courtier-Dutton <James@xxxxxxxxxxxxxxxxxxxx> wrote:
> | > >>
> | > >>>>I have read your posts to lkml containing your serial console flow control
> | > >>>>patches firstly for 2.4.x and then for 2.6.x kernels.
> | > >>>
> | > >>>Does this fix junk being output from the serial console?
> | > >>>If one is using Pentium 4 HT, it seems that both CPU cores try to send
> | > >>>characters to the serial port at the same time, resulting in lost
> | > >>>characters as one CPU overwrites the output from the other.
> | > >>
> | > >>We have multiple P4-HT enabled servers with debian installed & serial
> | > >>console enabled (RPB++ ;-) and _i_ have never seen this behaviour.
> | > >
> | > >
> | > > I don't think this is a serial problem as such, but a problem with the
> | > > kernel console subsystem (printk) itself. Maybe James can provide an
> | > > example output to confirm exactly what he's seeing.
> | > >
> | >
> | > http://www.superbug.demon.co.uk/latency/
> | >
> | > There are 2 oops traces there. At about line 176, the corruption starts.
> |
> | They both look like two printk's overlapping each other, which isn't
> | unreasonable since one is an oops. We try real hard to get oopses
> | out, which means "busting" the printk spinlocks. The side effect of
> | busting those spinlocks is of course no console locking.
> |
> | It may be annoying, but unless some SMP person wants to fix the spinlock
> | busting to be a little more intelligent, you can expect this situation
> | to continue.
>
> I've seen it enough times that I would like to see it fixed,
> and David Howells (RH) has posted a patch for it several times.
>
> I'm wondering if this:
> http://linux.bkbits.net:8080/linux-2.5/diffs/kernel/printk.c@xxxx?nav=index.html|src/|src/kernel|hist/kernel/printk.c
> is supposed to be a patch for the problem in the 'latency' log above.
> If so, it's not as good a solution as David Howells's patch is.
> His latest (AFAIK) is:
> http://marc.theaimsgroup.com/?l=linux-kernel&m=105730993512692&w=2

Here is an updated version. It only busts the locks when the oopsing
CPU is the one that last took logbuf_lock:

--- ./kernel/printk.c.orig 2004-09-10 23:56:11.000000000 +0200
+++ ./kernel/printk.c 2004-09-11 22:56:33.000000000 +0200
@@ -521,6 +521,8 @@ asmlinkage int printk(const char *fmt, .
 	return r;
 }

+static volatile int printk_cpu = -1;
+
 asmlinkage int vprintk(const char *fmt, va_list args)
 {
 	unsigned long flags;
@@ -529,11 +531,12 @@ asmlinkage int vprintk(const char *fmt,
 	static char printk_buf[1024];
 	static int log_level_unknown = 1;

-	if (unlikely(oops_in_progress))
+	if (unlikely(oops_in_progress && printk_cpu == smp_processor_id()))
 		zap_locks();

 	/* This stops the holder of console_sem just where we want him */
 	spin_lock_irqsave(&logbuf_lock, flags);
+	printk_cpu = smp_processor_id();

 	/* Emit the output into the temporary buffer */
 	printed_len = vscnprintf(printk_buf, sizeof(printk_buf), fmt, args);
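
To make the intent of the changed condition easier to follow, here is a
rough user-space sketch of the same idea. It is only a model, not kernel
code: struct model_lock, should_zap(), model_printk_enter() and
model_printk_exit() are made-up names for illustration, the "lock" is a
plain flag rather than a real spinlock, and CPUs are just integers passed
in by the caller. The point is that the lock is busted only when the
oopsing CPU is the recorded owner; any other CPU has to wait its turn.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model_lock {
	bool held;	/* true while some CPU holds the lock */
	int owner;	/* CPU that last took the lock, -1 if never taken */
};

#define MODEL_LOCK_INIT { false, -1 }

/* Mirrors the patched test: oops in progress AND we are the recorded owner. */
static bool should_zap(const struct model_lock *l, bool oops, int cpu)
{
	return oops && l->owner == cpu;
}

/*
 * Returns true if the lock was taken, false where a real CPU would
 * have to spin because another CPU still holds it.
 */
static bool model_printk_enter(struct model_lock *l, bool oops, int cpu)
{
	assert(cpu >= 0);

	if (should_zap(l, oops, cpu))
		l->held = false;	/* stand-in for zap_locks() */

	if (l->held)
		return false;		/* held by someone else: no busting */

	l->held = true;
	l->owner = cpu;			/* like printk_cpu = smp_processor_id() */
	return true;
}

/* Release the lock; the owner field is left alone, as in the patch. */
static void model_printk_exit(struct model_lock *l)
{
	l->held = false;
}
```

In this model, if CPU 0 holds the lock and CPU 1 oopses, CPU 1 is told to
wait instead of busting the lock, so the two outputs do not interleave.
If CPU 0 itself oopses while holding the lock, it busts the lock and
carries on, which is the deadlock case the busting exists for.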


HTH,
Herbert

> --
> ~Randy
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/