Re: [patch] IDE problems on SMP, fixed? (fwd)

MOLNAR Ingo (mingo@chiara.csoma.elte.hu)
Wed, 29 Jul 1998 23:33:02 +0200 (CEST)


On Wed, 29 Jul 1998, Linus Torvalds wrote:

> > Between the two places we only have a couple of non-functional if()'s
> > and assignments to static variables on the stack.
>
> Note that it looks like Ingo by mistake removed the ide__sti completely:

no it was completely intentional :)

> and note how "DISK_RECOVERY_TIME" is usually defined to be zero unless
> explicitly asked to be something else. So the place Ingo moved it to will
> never even be compiled into the kernel, so the interrupt enable is never
> done.
>
> Ingo, if you can reproduce this consistently, I'd suggest moving the sti
> down to a place where it actually triggers, until you find the place where
> it really fixes the problem..
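
The compile-time point in the quote above can be sketched in a few lines of
C. This is only an illustration under stated assumptions, not the actual
ide.c code: `sti_sketch`, `start_request_sketch` and `irqs_enabled_sketch`
are hypothetical names, and the interrupt flag is simulated with a plain
variable. The only thing taken from the thread is that DISK_RECOVERY_TIME
is usually zero.

```c
#include <assert.h>

/* Assumption: mirrors the "usually zero" default described in the thread. */
#ifndef DISK_RECOVERY_TIME
#define DISK_RECOVERY_TIME 0
#endif

/* Hypothetical stand-in for the CPU interrupt flag: 0 = off, 1 = on. */
int irqs_enabled_sketch = 0;

/* Hypothetical stand-in for the interrupt-enable call. */
void sti_sketch(void)
{
    irqs_enabled_sketch = 1;
}

/* Hypothetical request path, entered with interrupts disabled. */
int start_request_sketch(void)
{
    irqs_enabled_sketch = 0;
    assert(irqs_enabled_sketch == 0); /* we start with interrupts off */

#if (DISK_RECOVERY_TIME > 0)
    /* Only compiled in when a recovery delay is configured. */
    sti_sketch();
#endif

    /* Report whether interrupts ended up enabled on this path. */
    return irqs_enabled_sketch;
}
```

Built with the default, `start_request_sketch()` returns 0 because the
enable call is never compiled in. Building with `-DDISK_RECOVERY_TIME=1`
would make it return 1.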

yes i've been doing this for the last 2 hours :) _so far_ it looks like if
we _ever_ enable interrupts 'within' that critical ide.c part, i get a
lockup sooner or later. i've moved the sti down and out of start_request,
all the way to the next place where we disable IRQs again, and it still
locks up every time. The only case where i got no lockup was when i
removed the sti completely.

it might be unrelated, but that part of ide.c was always very critical,
similar to ne.c, as it enables/disables itself from within an IRQ handler.
this used to expose the more subtle IOAPIC bugs, but this one does not
look like the typical IRQ deadlock.

i've also double-checked that other system parameters are invariant:
replacing the ide__sti() with a cli (same instruction length, almost the
same timings, very same cache layout) gives no lockup. various BIOS
settings make no difference.

-- mingo
