Re: debugging an oops that kills the system

From: Roland Dreier
Date: Tue Oct 21 2008 - 14:39:39 EST


> I'm hoping for some pointers on debugging an oops that ultimately
> hangs the system. I'm doing some scheduler work and I can fairly
> reliably duplicate the error on my machine, but the output is too
> large for one screen and the system becomes unresponsive after the
> crash so I can't scroll the console. I tried purchasing a USB->DB9
> cable to log to a remote terminal, but so far I haven't had any luck
> getting that to work. Using kdump/kexec doesn't work either--I got
> the second kernel to boot successfully using the magic sysrq example
> in the documentation, but the second kernel doesn't boot with my
> actual crash. Any other suggestions for what I might do?

If you have two machines (it sounds like you do) and the serial console
is not working for you (could be a setup problem -- do you have a
"console=" parameter on your kernel command line?), then netconsole
might be a good way to debug; see Documentation/networking/netconsole.txt
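
For the receiving side, here is a minimal sketch of a listener for the
second machine. netconsole sends each console message as a UDP packet,
so anything that logs UDP datagrams to a file will capture the whole
oops even if it scrolls off the screen. The sketch assumes the crashing
kernel was booted with something like
"netconsole=6665@10.0.0.1/eth0,6666@10.0.0.2/00:11:22:33:44:55"; the
addresses, interface name, and MAC address are made-up placeholders, and
the helper names open_listener and log_datagrams are hypothetical, not
part of any kernel tooling.

```python
import socket


def open_listener(port=6666, bind_addr="0.0.0.0"):
    """Return a UDP socket bound to the port netconsole is told to send to.

    6666 is an assumption here; use whatever target port appears in your
    netconsole= parameter.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    return sock


def log_datagrams(sock, out, max_packets=None):
    """Copy each received datagram to `out` and flush after every write.

    Runs until `max_packets` datagrams have arrived, or forever if
    `max_packets` is None. Returns the number of datagrams logged.
    """
    count = 0
    while max_packets is None or count < max_packets:
        data, _sender = sock.recvfrom(65535)
        # Kernel messages are plain text; replace any undecodable bytes
        # rather than dropping the packet.
        out.write(data.decode("utf-8", errors="replace"))
        out.flush()
        count += 1
    return count
```

To use it, open a log file in append mode on the second machine, pass
it to log_datagrams(open_listener(), logfile), and then trigger the
crash. A plain netcat UDP listener would likely do the same job; the
point is only that the output lands in a file you can scroll through
afterwards.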

- R.