Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

From: Klaus S. Madsen
Date: Thu Feb 28 2008 - 04:28:59 EST


On Thu, Feb 28, 2008 at 10:16:39 +0100, Ingo Molnar wrote:
>
> * Klaus S. Madsen <ksm@xxxxxxxxxxxxxxxx> wrote:
>
> > Hi all,
> >
> > I have a Thinkpad T61p, which I'm able to suspend with s2ram
> > on Linux 2.6.24.3. However when I try to suspend it on 2.6.25-rc3,
> > s2ram dies after changing to vt1, with a segfault. I'm using s2ram
> > from cvs, and libx86 version 0.99 from
> > http://www.codon.org.uk/~mjg59/libx86/.
> >
> > Some details about the segfault:
> >
> > $ sudo gdb ./s2ram
> > (gdb) run
> > Starting program: /home/ksm/downloads/suspend/s2ram
> > Switching from vt7 to vt1
> > Calling get_mode
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > 0xb7facf4a in run_vm86 () at lrmi.c:526
> > 526 asm volatile (
> > (gdb) list
> > 521 static int
> > 522 lrmi_vm86(struct vm86_struct *vm)
> > 523 {
> > 524 int r;
> > 525 #ifdef __PIC__
> > 526 asm volatile (
> > 527 "pushl %%ebx\n\t"
> > 528 "movl %2, %%ebx\n\t"
> > 529 "int $0x80\n\t"
> > 530 "popl %%ebx"
> > (gdb) bt
> > #0 0xb7facf4a in run_vm86 () at lrmi.c:526
> > #1 0xb7fad61b in LRMI_int (i=16, r=0xbffca670) at lrmi.c:844
> > #2 0x0804acfc in do_vbe_service (AX=20227, BX=0, regs=0xbffca670)
> > at vbetool/vbetool.c:158
> > #3 0x0804af7e in __get_mode () at vbetool/vbetool.c:453
> > #4 0x0804a30f in s2ram_hacks () at s2ram-x86.c:268
> > #5 0x0804954f in main (argc=1, argv=0x0) at s2ram-main.c:92
> >
> > I have tried to bisect the problem, and it fingered the following
> > commit:
> >
> > commit 82bc03fc158e28c90d7ed9919410776039cb4e14
> > Author: Ingo Molnar <mingo@xxxxxxx>
> >
> > x86: add PWT to NOCACHE flags
> >
> > Reverting this commit in the bisected tree (by executing git show
> > 82bc03fc158e28c90d7ed9919410776039cb4e14 | patch -R -p1), makes the
> > segfault go away. I've run make clean between each kernel compile, to
> > be sure the tree was correctly compiled.
>
> thanks for tracking this down. It would be nice to figure out why this
> change made a difference. Perhaps VM86 mode has some restrictions in
> what type of pagetables it can operate in - and the CPU just refuses to
> properly emulate those 16-bit instructions? (this would be very weird).
> We are trying to execute 16-bit BIOS code here, right?
>
> which instruction is the segfault coming from - the int $0x80? So in
> vm86 mode we generated a #GPF which shows up as a SIGSEGV?

I must say that I don't quite understand why gdb points at the "asm
volatile" line rather than one of the assembly lines when reporting the
segfault. I'm not well versed in low-level gdb use, so if you could
give me a hint about how to get gdb to disassemble the code at the
instruction pointer, I'll return with the result.

Thanks for taking the time to look at this.

--
Kind regards
Klaus S. Madsen