Re: linux-next: Tree for June 13: IO APIC breakage on HP nx6325

From: Rafael J. Wysocki
Date: Mon Jun 16 2008 - 19:04:53 EST


On Tuesday, 17 of June 2008, Rafael J. Wysocki wrote:
> On Monday, 16 of June 2008, Maciej W. Rozycki wrote:
> > On Mon, 16 Jun 2008, Rafael J. Wysocki wrote:
> >
> > > > > commit 7e3530cd98a0c6ab38f5898e855a5beffab26561
> > > > > Author: Maciej W. Rozycki <macro@xxxxxxxxxxxxxx>
> > > > > Date: Tue May 27 21:19:51 2008 +0100
> > > > >
> > > > > x86: I/O APIC: timer through 8259A second-chance
> > > > >
> > > > > Signed-off-by: Maciej W. Rozycki <macro@xxxxxxxxxxxxxx>
> > > > > Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
> > > >
> > > > Can I have .config used and a full bootstrap log from that system with
> > > > the patch still applied?
> > >
> > > That may be difficult, because with the patch applied the box either doesn't
> > > boot at all, or works unreliably when booted (depending on the set of patches
> > > applied on top of it).
> >
> > Serial console?
>
> No, this box doesn't have any serial ports. It has a FireWire one, but I don't
> have a matching cable ...
>
> > I'm most interested in one from a configuration that
> > does not boot at all as that's easier to reproduce, determine the cause
> > and verify whether a change fixes the problem or not. Other
> > configurations may then be tested with the fix in place.
>
> With the -next from today (20080616) I get a different picture.
>
> Without any patches on top it boots, but the fan is turned 100% on as soon as
> the ACPI modules get loaded, regardless of the temperature (normally it does
> that above 75 °C, which is normally impossible to reach, because there are 3
> temperature trip points below that level; generally the hardware only does that
> when overheating). After that, things start to go _very_ slow, like 10x slower
> than usual in X and somewhat slower in the fb console, but I was able to get
> a dmesg output. This is reproducible 100% of the time.
>
> With commit 7e3530cd98a0c6ab38f5898e855a5beffab26561 reverted the box seems to
> work normally. However, while I was writing this message, ACPI decided it was
> overheating and emergency shut down the box, although that was completely
> wrong. Next time I'll try with the C1E patches reverted.
>
> The .config is at: http://www.sisk.pl/kernel/debug/20080616/next-config
>
> dmesg output without any patches is at
> http://www.sisk.pl/kernel/debug/20080616/dmesg-1.log
>
> dmesg output with commit 7e3530cd98a0c6ab38f5898e855a5beffab26561 reverted is
> at: http://www.sisk.pl/kernel/debug/20080616/dmesg-2.log
>
> (they look pretty similar to my untrained eye, but well).

BTW, with the C1E patches reverted I no longer get the following warning in
the log:
WARNING: at /home/rafael/src/linux-next/kernel/smp.c:215 smp_call_function_single+0x3d/0xa2
Thomas?

dmesg output with both commit 7e3530cd98a0c6ab38f5898e855a5beffab26561 and the
C1E commits (the ones between 8750bf598db6a0ea3475d1cf8da922b325941e12 and
aa83f3f2cfc74d66d01b1d2eb1485ea1103a0f4e inclusive) reverted is at:
http://www.sisk.pl/kernel/debug/20080616/dmesg-3.log

Thanks,
Rafael