SUMMARY: DL760 G2 issue with Hyper-Threading: Problem Solved !

From: Bruno Cornec (Bruno.Cornec@hp.com)
Date: Tue Jul 01 2003 - 13:23:11 EST


Hello,

Here is a summary of everything that was tried around this issue, in the
hope that it helps others solve the same problem.

PROBLEM: DL760 G2 issue with Hyper-Threading (HT)

When HT is activated in the BIOS, the system seems to boot correctly
(detecting the 16 CPUs it should detect), but then it hangs during the
remaining phases of the boot. The base distribution is SLES 8.

I ran the following tests on my machine. In every case, HT is enabled in the
BIOS and the kernel is built with CONFIG_HIGHMEM64G=y, CONFIG_HIGHMEM=y and
CONFIG_X86_PAE=y (the machine has 16GB of RAM).

The latest BIOS has been installed on the system (P44 - 2003-02-18).

Kernel              config                       Status
======              ======                       ======
k_smp-2.4.19-113    apm=off acpi=off noapic      Unstable => Hang (no message)
k_smp-2.4.19-257    apm=off acpi=off noapic      Unstable => Hang (no message)
2.4.20-18.9bigmem   apm=off acpi=off noapic      Unstable => Hang (no message)
2.4.21              apm=off acpi=off noapic      Unstable => Hang (no message)
k_smp-2.4.19-113    apm=off acpi=off             Kernel Panic during boot (*)
k_smp-2.4.19-257    apm=off acpi=off             Kernel Panic during boot (*)
2.4.20-18.9bigmem   apm=off acpi=off             Kernel Panic during boot (*)
2.4.21              apm=off acpi=off             Kernel Panic during boot (*)
2.4.21-ac2          CONFIG_ACPI=n                Stable with only 8 CPUs detected
                    CONFIG_X86_CLUSTERED_APIC=y
2.4.21-ac2          +CONFIG_ACPI=y alone         print kernel boots (1 line on screen)
2.4.21-ac2          CONFIG_PM=y                  Kernel Panic during boot (*)
2.4.21-ac3/4+bruno  CONFIG_PM=y                  Kernel Panic during boot (*)
                                                 Every ACPI MPS table in Bios gives the same
2.4.21-ac4+bruno    With attached .config        No problem !!

So it seems that to get a working system, you need:

2.4.21-ac4 + the patch sent by Venkatesh Pallipadi (attached again) + a suitable
.config (attached as well) containing at least:

CONFIG_SMP=y
CONFIG_X86_CLUSTERED_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_ACPI=y
CONFIG_ACPI_HT_ONLY=y
CONFIG_ACPI_BOOT=y

CONFIG_PM must not be set.
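
For anyone who wants to check a .config against the list above before
rebuilding, here is a minimal Python sketch. It is not part of the attached
patch or .config: the names REQUIRED, FORBIDDEN, parse_config and check_config
are made up for illustration, and it only assumes the usual .config layout
(one OPTION=value per line, unset options shown as "# OPTION is not set").

```python
# Options listed in this mail as the minimum for a working HT boot.
REQUIRED = (
    "CONFIG_SMP",
    "CONFIG_X86_CLUSTERED_APIC",
    "CONFIG_X86_IO_APIC",
    "CONFIG_X86_LOCAL_APIC",
    "CONFIG_ACPI",
    "CONFIG_ACPI_HT_ONLY",
    "CONFIG_ACPI_BOOT",
)
# Options that must stay unset according to the tests above.
FORBIDDEN = ("CONFIG_PM",)


def parse_config(text):
    """Return {option: value} for each OPTION=value line, skipping comments."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or "# CONFIG_FOO is not set"
        name, sep, value = line.partition("=")
        if sep:
            options[name] = value
    return options


def check_config(text):
    """Return a list of human-readable problems; an empty list means OK."""
    options = parse_config(text)
    problems = []
    for name in REQUIRED:
        if options.get(name) != "y":
            problems.append(f"{name} should be =y")
    for name in FORBIDDEN:
        if name in options and options[name] != "n":
            problems.append(f"{name} should not be set")
    return problems
```

Typical use would be something like check_config(open(".config").read()),
printing whatever problems come back.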

The proof:
linux:~ # cat /proc/cpuinfo | grep processor |wc -l
     16
:-))
The full boot trace is attached (trace.ok)
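
The same check can be scripted, e.g. to run it after each test kernel. The
following is only a sketch in Python; count_processors is a hypothetical helper
name, and it is slightly stricter than the grep above because it only counts
lines whose key is exactly "processor".

```python
def count_processors(cpuinfo_text):
    """Count the 'processor : N' entries in the text of /proc/cpuinfo."""
    count = 0
    for line in cpuinfo_text.splitlines():
        # Each entry looks like "processor<TAB>: 0"; split key from value.
        key, sep, _value = line.partition(":")
        if sep and key.strip() == "processor":
            count += 1
    return count
```

On the machine itself this would be called as
count_processors(open("/proc/cpuinfo").read()).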

BTW, without your latest patch, the system still hangs as before, and since I
was able to connect another machine to the serial port, I now have a trace in
case you are interested (attached as trace.nok).

Thanks a lot to Venkatesh Pallipadi for the time taken to solve the issue.

Bruno.

-- 
Linux Solution Consultant         Tel: +33 476 143 278 - Fax: +33 476 146 105
HP/Intel Solution Center http://hpintelco.net Hewlett-Packard Grenoble/France
Linux info?           http://www.HyPer-Linux.org      http://www.hp.com/linux
Early music?          http://www.musique-ancienne.org http://www.medieval.org







