Re: [BUG] ide dma_timer_expiry, then hard lockup

From: Sergei Shtylyov
Date: Tue Jun 19 2007 - 12:08:51 EST


Hello.

Linas Vepstas wrote:

Stuart_Hayes@xxxxxxxx wrote:

I think reading the IDE status register clears the interrupt in the IDE
device, which might be causing the drive to think it's OK to generate
another interrupt.

This is not how IDE drives are supposed to act -- they won't proceed any further until "interrupt pending" condition is cleared, so these aren't supposed to be "stacked". This behavior however is not strictly specified by ATA standards IIRC, but I can't readily imagine such situaltion anyway unless tagged command queueing (which is not supported by IDE core) and/or ATAPI command overlapping is in action...

The problem only manifests during high io load; perhaps a missing mutex
somewhere is blasting one thing too many out to the hard drive?

Hm... not sure about this.

This could either cause it to get stuck trying to
service an interrupt that is never getting cleared as you suggested, or
possibly when the next IRQ comes in the IDE IRQ handler gets stuck
waiting for a spinlock that the code you're looking at already owns...?

I could also imagine the HPT366 chip going mad and stalling the reads if the taskfile regs forever because of the incomplete DMA or even the drive going mad and not replying to I/O cycles with proper -IORDY handshake (i.e. holding it low all the time)...

In my case, ctrl-alt-sysrq doesn't work, which makes it hard to debug.

I'm thinking that trying to debug libata is a better idea, rather than
investing time in ide, right? Although at the moment, libata works even less; see other email.

Which makes me think this really is some *hardware* issue.

--linas

MBR, Sergei
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/