Re: debugging an oops that kills the system

From: Dan Upton
Date: Tue Oct 21 2008 - 15:45:01 EST

In reply to: Roland Dreier: "Re: debugging an oops that kills the system"

Thanks, I'd never come across that, but it worked like a charm. Now on
to the hard part: actually figuring out and fixing the bug ;)

-dan

On Tue, Oct 21, 2008 at 2:39 PM, Roland Dreier <rdreier@xxxxxxxxx> wrote:
> > I'm hoping for some pointers on debugging an oops that ultimately
> > hangs the system. I'm doing some scheduler work and I can fairly
> > reliably duplicate the error on my machine, but the output is too
> > large for one screen and the system becomes unresponsive after the
> > crash so I can't scroll the console. I tried purchasing a USB->DB9
> > cable to log to a remote terminal, but so far I haven't had any luck
> > getting that to work. Using kdump/kexec doesn't work either--I got
> > the second kernel to boot successfully using the magic sysrq example
> > in the documentation, but the second kernel doesn't boot with my
> > actual crash. Any other suggestions for what I might do?
>
> If you have two machines (it sounds like you do) and serial console is
> not working for you (could be a setup problem -- do you have a
> "console=" line on your kernel command line?), then netconsole might be
> a good way to debug: Documentation/networking/netconsole.txt
>
> - R.
>
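
For anyone landing on this thread later: netconsole sends kernel messages as plain UDP packets, so the logging machine only needs something listening on the target port. Below is a minimal sketch of that receiving side, plus a helper that formats the boot parameter in the src-port@src-ip/dev,tgt-port@tgt-ip/tgt-mac layout described in Documentation/networking/netconsole.txt. The function names, the eth0 interface, and all addresses are hypothetical placeholders. The 6665/6666 defaults are what I recall from that document, so check it against your kernel version.

```python
import socket


def netconsole_param(target_ip, target_port=6666, dev="eth0",
                     source_ip="", source_port=6665, target_mac=""):
    """Format a netconsole= boot/module parameter.

    Layout: src-port@src-ip/dev,tgt-port@tgt-ip/tgt-mac
    Empty fields are left blank so the kernel can apply its own defaults.
    """
    return (f"netconsole={source_port}@{source_ip}/{dev},"
            f"{target_port}@{target_ip}/{target_mac}")


def open_listener(bind_ip="0.0.0.0", port=6666):
    """Open a UDP socket on the logging machine (assumed port: 6666)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_ip, port))
    return sock


def capture(sock, log_path, max_packets=None):
    """Append every received datagram to log_path.

    The file is opened unbuffered so each packet reaches the OS as soon
    as it arrives. Runs until max_packets datagrams have been written,
    or forever when max_packets is None.
    """
    count = 0
    with open(log_path, "ab", buffering=0) as log:
        while max_packets is None or count < max_packets:
            data, _addr = sock.recvfrom(65535)
            log.write(data)
            count += 1
    return count
```

On the logging machine, capture(open_listener(), "oops.log") would then collect whatever the crashing box manages to send before it hangs. A ready-made tool such as netcat in UDP listen mode does the same job. The sketch is only meant to show how little is involved on the receiving end.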
