Re: [2.6.17-rc5-mm2] crash when doing second suspend: BUG in arch/i386/kernel/nmi.c:174

From: Nigel Cunningham
Date: Tue Jun 06 2006 - 20:38:19 EST


Hi.

On Wednesday 07 June 2006 10:33, Andi Kleen wrote:
> On Wednesday 07 June 2006 02:24, Andrew Morton wrote:
> > On Wed, 7 Jun 2006 10:13:49 +1000
> >
> > Nigel Cunningham <ncunningham@xxxxxxxxxxxxx> wrote:
> > > > the new CPU to get the same state as the old one just because it ends
> > > > up with the same logical CPU number? Perhaps, but what if it doesn't
> > > > even have the same capabilities? (Do we support heterogeneous CPUs
> > > > anyway?)
> > >
> > > Indeed. I'm also not sure that there's necessarily a guarantee that
> > > cpus will be hotplugged in the same order. Perhaps those with more
> > > knowledge can clarify there.
> >
> > It all depends on what we mean by "per-cpu state". If we were to
> > remember that "CPU 7 needs 0x1234 in register 44" then that would be
> > wrong. But remembering some high-level functional thing like "CPU 7
> > needs to run the NMI watchdog" is fine. The CPU bringup code can work
> > out whether that is possible, and how to do it.
>
> Actually the nmi watchdog state should be global, not per CPU. We
> want it to either work for the whole system or be completely disabled.

OK. Now I understand, and fully agree with, what you said earlier ("Make it work
properly for CPU hotplug for individual CPU and then in suspend
you take care of "global" state and the last CPU.").
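
A rough user-space C sketch of that split, in case it helps. All the names
here (watchdog_enabled, struct cpu_wd, cpu_up/cpu_down, suspend_all/resume_all)
are made up for illustration and are not the actual nmi.c code; NR_CPUS is
fixed at 4 and the "counter allocation" is just an integer. The only point is
that the on/off decision lives in one global, while whatever is per-CPU is
thrown away on unplug and rebuilt on bringup:

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4                 /* assumption: small fixed size for the sketch */

/* Global policy: the watchdog runs on the whole system or not at all. */
static bool watchdog_enabled;

/* Per-CPU bookkeeping only; nothing here survives an unplug. */
struct cpu_wd {
	bool online;
	int  perfctr;             /* illustrative counter index, -1 if none */
};
static struct cpu_wd cpus[NR_CPUS];

/* Bringup consults the global policy; it restores nothing from whatever
 * previously held this logical CPU number. */
static void cpu_up(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpus[cpu].online = true;
	cpus[cpu].perfctr = watchdog_enabled ? 0 : -1;
}

/* Unplug forgets the per-CPU allocation. */
static void cpu_down(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpus[cpu].online = false;
	cpus[cpu].perfctr = -1;
}

/* Suspend takes down the secondaries first and the last CPU at the end.
 * The global flag is deliberately left untouched. */
static void suspend_all(void)
{
	for (int cpu = NR_CPUS - 1; cpu >= 0; cpu--)
		if (cpus[cpu].online)
			cpu_down(cpu);
}

/* Resume brings every CPU back; each one re-derives its state from the
 * global flag in cpu_up(). */
static void resume_all(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		cpu_up(cpu);
}
```

With this shape a second suspend has no stale per-CPU state to trip over,
since cpu_down() has already cleared it and only the global flag carries
across.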

> What is per CPU are the performance counter allocations, but these
> can be forgotten over CPU unplug/replug.
>
> (ok this means oprofile might need to be restarted after suspend/resume,
> but I guess that's reasonable)

I don't know enough about that area to comment :>

Regards,

Nigel
--
Nigel, Michelle and Alisdair Cunningham
5 Mitchell Street
Cobden 3266
Victoria, Australia
