Re: 2.6.19-rc6-rt4, changed yum repository

From: Karsten Wiese
Date: Sun Nov 19 2006 - 15:56:58 EST


On Sunday, 19 November 2006 14:43, Ingo Molnar wrote:
>
> * Karsten Wiese <fzu@xxxxxxxxxxxxxxxxxxxxx> wrote:
>
> > work_resched:
> >         DISABLE_INTERRUPTS
> >         call __schedule
> >                         # make sure we don't miss an interrupt
> >                         # setting need_resched or sigpending
> >                         # between sampling and the iret
> >         movl TI_flags(%ebp), %ecx
> >         andl $_TIF_WORK_MASK, %ecx      # is there any work to be done other
> >                                         # than syscall tracing?
> >         jz restore_all
> >         testl $(_TIF_NEED_RESCHED|_TIF_NEED_RESCHED_DELAYED), %ecx
> >         jnz work_resched
> >
> > The hwclock page_fault happens at the
> > movl TI_flags(%ebp), %ecx
> > line.
>
> hm, weird - maybe something corrupts %ebp here? Could you try adding
> this before the faulting instruction:
>
> GET_THREAD_INFO(%ebp)
>
> this will make sure %ebp has the right contents.

It doesn't make a difference:
the clock is still occasionally set an hour early during boot.
I also get the one-hour offset when I comment out the hwclock call in
rc.sysinit.

With GET_THREAD_INFO(%ebp) added, the SysRq+T output shows:
=======================
hwclock R [f7f76550] C1B07224 [on rq #0] 0 329 304 (NOTLB)
f7f6efb4 00003086 c1907434 c1b07224 c1907434 c02d320f 00000000 00000000
00000001 f7f7667c f7f76550 ad91991e 00000008 001349cd c02ef1fe 00000004
d1292e17 00000000 000cc113 00000000 00000000 f7f6e000 c0102f22 000cc113
Call Trace:
[<c02d320f>] do_page_fault+0x2b9/0x552
[<c0102f22>] work_resched+0x6/0x20
=======================

The [<c0102f22>] work_resched+0x6/0x20 corresponds to
mov $0xfffff000,%ebp
in:
(gdb) disassemble work_resched
Dump of assembler code for function work_resched:
0x000001c0 <work_resched+0>: cli
0x000001c1 <work_resched+1>: call 0x1c2 <work_resched+2>
0x000001c6 <work_resched+6>: mov $0xfffff000,%ebp
0x000001cb <work_resched+11>: and %esp,%ebp
0x000001cd <work_resched+13>: mov 0x8(%ebp),%ecx
0x000001d0 <work_resched+16>: and $0xfe3e,%ecx
0x000001d6 <work_resched+22>: je 0x16a <restore_all>
0x000001d8 <work_resched+24>: test $0x80008,%ecx
0x000001de <work_resched+30>: jne 0x1c0 <work_resched>
End of assembler dump.
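
For readers following the disassembly: the mov/and pair at work_resched+6
and +11 is the expanded GET_THREAD_INFO(%ebp). It masks the stack pointer
down to the base of the kernel stack, and the load at +13 then reads the
flags word at offset 8 from that base. The sketch below redoes that
arithmetic in Python. The sample stack address is simply the first word of
the SysRq+T dump, used here as an illustrative assumption and not as the
actual %esp at the time of the fault; the helper names are made up for this
note.

```python
# Mask and offset as they appear in the disassembly above.
THREAD_INFO_MASK = 0xfffff000   # mov $0xfffff000,%ebp ; and %esp,%ebp
TI_FLAGS_OFFSET = 0x8           # mov 0x8(%ebp),%ecx

def thread_info_base(esp):
    """Round a kernel stack address down to its 4 KiB-aligned base."""
    return esp & THREAD_INFO_MASK

def set_bits(mask):
    """Return the bit positions that are set in an immediate operand."""
    return [bit for bit in range(32) if mask & (1 << bit)]

# Assumption: an address inside the task's kernel stack, taken from the dump.
sample_esp = 0xf7f6efb4
base = thread_info_base(sample_esp)
print(hex(base))                      # 0xf7f6e000
print(hex(base + TI_FLAGS_OFFSET))    # 0xf7f6e008

# Immediates of the "test" and "and" instructions, decoded into bit numbers.
print(set_bits(0x80008))   # [3, 19]
print(set_bits(0xfe3e))    # [1, 2, 3, 4, 5, 9, 10, 11, 12, 13, 14, 15]
```

The base 0xf7f6e000 also shows up as a word in the dump. Bit 0 is absent
from 0xfe3e, which fits the "other than syscall tracing" comment in the
source if bit 0 is the syscall-trace flag (an assumption about this
kernel's flag layout).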

But "mov $0xfffff000,%ebp" cannot cause a page fault.
So either the SysRq+T output is wrong, or the actual page fault happens
inside the "call __schedule", with __schedule missing from the
Call Trace.
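
The second possibility is consistent with how the trace is built, assuming
the Call Trace lists return addresses found on the stack: the call at
work_resched+1 is a 5-byte near call (the next instruction sits at +6), so
the address it pushes is exactly work_resched+6. A small Python sketch of
that arithmetic follows. The constants come from the gdb dump and the
trace; the function names are only illustrative. The last check reflects a
guess that gdb was run on an unrelocated object file, where the call's
32-bit displacement still holds the placeholder -4.

```python
CALL_LEN = 5  # opcode 0xe8 plus a 32-bit relative displacement

def return_address(call_addr):
    """Address pushed on the stack by a near call located at call_addr."""
    return call_addr + CALL_LEN

def call_target(call_addr, rel32):
    """Destination of a near call: next instruction plus signed displacement."""
    return return_address(call_addr) + rel32

work_resched = 0x1c0                 # symbol offset in the gdb dump
call_insn = work_resched + 1         # "call" follows the 1-byte cli

# The pushed return address is work_resched+6, the offset in the Call Trace.
print(hex(return_address(call_insn) - work_resched))   # 0x6

# In the running kernel the same entry is 0xc0102f22, so the symbol starts at:
print(hex(0xc0102f22 - 0x6))                            # 0xc0102f1c

# Assumed unrelocated displacement of -4 reproduces gdb's odd target 0x1c2.
print(hex(call_target(call_insn, -4)))                  # 0x1c2
```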

Your yum-repo kernel seems to stay clear of the above problem.
The obvious differences from my .config are SMP <> UP
and M686+X86_GENERIC <> MK8.

Karsten

