Re: 2.1.103: Still "Ugh at c0111691"

Jean Wolter (jw5@os.inf.tu-dresden.de)
22 May 1998 12:15:18 +0200


deas@uni-hamburg.de (Andreas Steffan) writes:

>
> Hi
>
> I've been running 2.1 kernels since 2.1.98, and I get these "Ugh at xxxxxxxx"
> messages with all kernels >= 2.1.98. No problems with 2.0.33.
> I built the kernels with gcc-2.8.1, but the problem is not gcc
> related: I also built 2.1.103 with gcc-2.7.2.3 and still get these
> messages, e.g. (from 2.1.103):
>
> May 21 20:47:39 mortimer kernel: Ugh at c0111691
>
> Looking at /System.map:
>
> c011162c T do_page_fault
> c01119f0 t remap_area_pages
>
> I deduce that the problem is in do_page_fault.
> The box was idle in most cases when these messages appeared, but there
> seems to be a certain correlation (from 2.1.102):
> [...]
>
> David Woodhouse told me that something caused a page fault with
> interrupts disabled, so here is some more information:

Could you apply the patch appended below? It tries to generate
additional output for each Ugh by calling show_registers. If you catch
an Ugh again, you should then have enough information to track it down
to the responsible function with ksymoops.
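
As a rough illustration of the address-to-symbol step, here is a minimal
sketch in C of the lookup that was done by hand against /System.map above:
pick the last symbol whose start address is not greater than the reported
address and print the offset into it. This is not how ksymoops is
implemented; the names ksym, sample_map, ksym_lookup and ksym_format are
made up for this example, and the table holds only the two entries quoted
above.

```c
#include <stddef.h>
#include <stdio.h>

/* One System.map entry: start address and symbol name (hypothetical type). */
struct ksym {
    unsigned long addr;
    const char *name;
};

/* The two /System.map entries quoted above, sorted by ascending address. */
const struct ksym sample_map[] = {
    { 0xc011162cUL, "do_page_fault" },
    { 0xc01119f0UL, "remap_area_pages" },
};

/*
 * Return the entry with the highest start address <= addr, or NULL if
 * addr lies below the first entry. Assumes map is sorted ascending.
 */
const struct ksym *ksym_lookup(const struct ksym *map, size_t n,
                               unsigned long addr)
{
    const struct ksym *best = NULL;
    size_t i;

    for (i = 0; i < n && map[i].addr <= addr; i++)
        best = &map[i];
    return best;
}

/*
 * Write "symbol+0xoffset" for addr into buf.
 * Returns buf, or NULL if addr could not be resolved.
 */
char *ksym_format(char *buf, size_t len, const struct ksym *map, size_t n,
                  unsigned long addr)
{
    const struct ksym *s = ksym_lookup(map, n, addr);

    if (s == NULL)
        return NULL;
    snprintf(buf, len, "%s+0x%lx", s->name, addr - s->addr);
    return buf;
}
```

Calling ksym_format(buf, sizeof buf, sample_map, 2, 0xc0111691UL) resolves
the address from the log line above to do_page_fault+0x65. Since lock_kernel
is an inline function, that offset only names the function it was expanded
into, which is why the extra register dump is useful.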

Unfortunately I was not able to test it thoroughly since I wasn't
able to raise an Ugh. But I have inserted generate_ugh_trace at some
point in the boot sequence and it happily shows a nice trace.

Jean

PS: I'm not quite sure whether I have to grab the spin lock before
calling show_registers. Maybe someone else can comment on that?

diff -u -r linux/arch/i386/kernel/entry.S linux-new/arch/i386/kernel/entry.S
--- linux/arch/i386/kernel/entry.S Tue May 12 22:13:15 1998
+++ linux-new/arch/i386/kernel/entry.S Fri May 22 10:35:54 1998
@@ -111,6 +111,41 @@
movl %esp, reg; \
andl $-8192, reg;

+ENTRY(generate_ugh_trace)
+ movl (%esp), %eax # get eip
+ pushfl
+ pushl $(__KERNEL_CS)
+ pushl %eax # create exception frame
+ pushl %eax # orig eax
+
+ pushl $(__KERNEL_DS)
+ pushl $(__KERNEL_DS)
+
+ pushl %eax
+ pushl %ebp
+ pushl %edi
+ pushl %esi
+ pushl %edx
+ pushl %ecx
+ pushl %ebx
+ pushl %esp
+ call SYMBOL_NAME(print_ugh_trace)
+ addl $4, %esp
+ popl %ebx
+ popl %ecx
+ popl %edx
+ popl %esi
+ popl %edi
+ popl %ebp
+ popl %eax
+
+ addl $24,%esp # remove seg regs, orig eax, exception frame
+ ret
+
+
+
+
+
ENTRY(lcall7)
pushfl # We get a different stack layout with call gates,
pushl %eax # which has to be cleaned up later..
diff -u -r linux/arch/i386/kernel/traps.c linux-new/arch/i386/kernel/traps.c
--- linux/arch/i386/kernel/traps.c Tue May 19 19:00:07 1998
+++ linux-new/arch/i386/kernel/traps.c Fri May 22 11:28:10 1998
@@ -179,6 +179,13 @@
printk("\n");
}

+asmlinkage void print_ugh_trace(struct pt_regs *regs)
+{
+ printk("Ugh: process %s touched user space while interrupts disabled\n",
+ current->comm);
+ show_registers(regs);
+}
+
spinlock_t die_lock;

void die_if_kernel(const char * str, struct pt_regs * regs, long err)
diff -u -r linux/include/asm-i386/smp_lock.h linux-new/include/asm-i386/smp_lock.h
--- linux/include/asm-i386/smp_lock.h Sat May 9 03:21:34 1998
+++ linux-new/include/asm-i386/smp_lock.h Fri May 22 12:02:03 1998
@@ -40,6 +40,7 @@


extern const char lk_lockmsg[];
+int generate_ugh_trace(void);

/* Locking the kernel */
extern __inline__ void lock_kernel(void)
@@ -53,6 +54,7 @@
if (cpu == global_irq_holder) {
__label__ l2;
l2: printk("Ugh at %p\n", &&l2);
+ generate_ugh_trace();
sti();
}

-- 
I get up each morning, gather my wits.
Pick up the paper, read the obits.
If I'm not there I know I'm not dead.
So I eat a good breakfast and go back to bed. Pete Seeger
