Re: [git pull] kgdb-light -v10

From: Linus Torvalds
Date: Tue Feb 12 2008 - 11:48:29 EST




On Tue, 12 Feb 2008, Ingo Molnar wrote:
>
> * Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:
> >
> > Stopping all CPUs for indefinite time very much seems like "breaking a
> > correctly working system" to me. [...]
>
> well, this is a small detail, but still you are wrong, and on a
> correctly working system this will not occur. (if yes, tell me how)

Quite frankly, I don't see why the kernel kgdb layer should have *any*
code like this at all.

The one who is actually debugging is the one who should decide which CPUs
get stopped, and which don't.

I realize that the gdb remote protocol is probably a piece of crap and
cannot handle that, but hey, that's not my problem, and more importantly,
I don't think it's even a *remotely* valid reason for making bad decisions
in the kernel. gdb was still open source last time I saw, and I think it's
reasonable to just say:

 - the kgdb commands should always act on the *current* CPU only
 - add one command that says "switch over to CPU #n" which just releases
   the current CPU and sends an IPI to that CPU #n (no timeouts, no
   synchronous waiting, no nothing - it's like a "continue", but with a
   "try to get the other CPU to stop")
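
As a rough illustration of the second point, here is a minimal user-space sketch in C. It is not kgdb code: `struct dbg_state`, `dbg_switch_cpu()`, `dbg_send_stop_ipi()` and `dbg_cpu_enter()` are hypothetical names, and the IPI is modelled as a plain flag. The only thing it tries to show is that the switch command releases the current CPU, posts a stop request, and returns without waiting.

```c
#include <stdbool.h>

#define DBG_MAX_CPUS 8  /* hypothetical limit, only for this sketch */

/* Hypothetical debugger state; none of these names come from kgdb. */
struct dbg_state {
    int  current_cpu;                   /* CPU the debugger holds, -1 if none */
    bool stop_requested[DBG_MAX_CPUS];  /* stands in for a pending stop IPI */
};

/* Fire-and-forget stand-in for sending an IPI: no reply is awaited. */
static void dbg_send_stop_ipi(struct dbg_state *st, int cpu)
{
    st->stop_requested[cpu] = true;
}

/*
 * "switch over to CPU #n": release the current CPU (like a "continue"),
 * ask the target to stop, and return at once. No timeout, no waiting.
 * Returns 0 on success, -1 if the CPU number is out of range.
 */
static int dbg_switch_cpu(struct dbg_state *st, int target)
{
    if (target < 0 || target >= DBG_MAX_CPUS)
        return -1;
    st->current_cpu = -1;
    dbg_send_stop_ipi(st, target);
    return 0;
}

/*
 * Runs on a CPU when (and if) it gets around to handling the request.
 * Returns true if this CPU has now become the debugged CPU.
 */
static bool dbg_cpu_enter(struct dbg_state *st, int cpu)
{
    if (cpu < 0 || cpu >= DBG_MAX_CPUS || !st->stop_requested[cpu])
        return false;
    st->stop_requested[cpu] = false;
    st->current_cpu = cpu;
    return true;
}
```

If the target CPU never responds, nothing in this sketch notices. Noticing, and deciding what to do about it, would be left to the debugger side.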

Yes, other CPUs will obviously often end up stopping due to waiting for
some spinlock or other if we stop one, but that's a separate issue, and
quite often it might be sufficient - and what we want.

And yes, you'd likely have to add some support to gdb to make this
_usable_, but then all that usability crap, all those timeouts for "stop
all CPUs", lives in user space on the _debugger_ side. That can be as
fancy as it wants to be.

And maybe this isn't realistic. I'm not saying "we _must_ do it this way",
I just want to say that the kernel kgdb layer should be as thin as
humanly possible, and maybe the right thing to do is to simply totally
punt on the whole "stop other CPUs" issue and make it a debugger-side
question.

In other words, is it perhaps possible to just *get*rid*of* that
"kgdb_active" and "nmicallback" and the whole multi-CPU roundup? Just use
a kgdb spinlock around the stuff that actually sends and receives
individual packets, and expect the debugger side to sort them out (yeah, I
suspect this involves having to add the CPU ID to each packet).
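
A minimal sketch of what that could look like, in user-space C. The `$<data>#<checksum>` framing with a two-digit hex modulo-256 checksum is the standard gdb remote packet format. The `cpu<id>:` prefix, the names `dbg_io_lock` and `dbg_frame_packet()`, and the buffer sizes are assumptions made up for this example, not an existing protocol extension. A C11 `atomic_flag` stands in for a kernel spinlock, and writing into `out` stands in for putting bytes on the wire.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical lock held only while one packet goes out. */
static atomic_flag dbg_io_lock = ATOMIC_FLAG_INIT;

static void dbg_io_lock_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&dbg_io_lock,
                                             memory_order_acquire))
        ; /* spin until the other sender is done */
}

static void dbg_io_lock_release(void)
{
    atomic_flag_clear_explicit(&dbg_io_lock, memory_order_release);
}

/*
 * Frame a payload as "$cpu<id>:<payload>#<xx>", where <xx> is the
 * modulo-256 sum of the bytes between '$' and '#', in hex.
 * Returns the packet length, or -1 if a buffer is too small.
 */
static int dbg_frame_packet(char *out, size_t outlen,
                            unsigned cpu, const char *payload)
{
    char body[128];
    unsigned char sum = 0;
    int n, total;

    n = snprintf(body, sizeof(body), "cpu%u:%s", cpu, payload);
    if (n < 0 || (size_t)n >= sizeof(body))
        return -1;
    for (int i = 0; i < n; i++)
        sum += (unsigned char)body[i];

    /* Only the emit step is serialized, so packets never interleave. */
    dbg_io_lock_acquire();
    total = snprintf(out, outlen, "$%s#%02x", body, (unsigned)sum);
    dbg_io_lock_release();

    if (total < 0 || (size_t)total >= outlen)
        return -1;
    return total;
}
```

With the CPU tag inside every packet, nothing on the kernel side has to track which CPU is "active". Sorting out who said what becomes the debugger's job.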

Linus