Re: Regression: commit 045065d breaks kernel on machine with atapi floppy: high IOWAIT, hung processes (patch exists)

From: Sergio Callegari
Date: Sun Aug 16 2015 - 16:33:13 EST


Patch in https://lkml.org/lkml/2014/11/20/25 fixes the issue for me.

Furthermore, to the best of my understanding it fixes the issue not just for me but for many others too.

Can it please be applied both to the current kernel and to the stable kernels?

Best regards,

Sergio


On 16/08/2015 17:01, Sergio Callegari wrote:
It seems that the issue also affects other systems with different configurations:

https://bbs.archlinux.org/viewtopic.php?id=189324

This is possibly the same bug as the one reported in

https://bugzilla.kernel.org/show_bug.cgi?id=87581

A tentative patch was submitted on LKML

https://lkml.org/lkml/2014/11/20/581

I have not tested it yet.

Another reported workaround is increasing the delay time in blk_delay_queue(q, SCSI_QUEUE_DELAY).

Not tested yet either.

The thread at 189324 suggests that the bug is triggered by mixing a slower device with a faster one on the same IDE/SATA channel.

Can someone indicate:

- If one of the two patches has already been accepted in recent kernels or is pending acceptance?

- Which one among the two approaches (extending delay time or modifying spin locks in scsi_lib.c) is more appropriate for me to test?

Best,

Sergio



On 16/08/2015 16:19, Sergio Callegari wrote:

Hi,

Please keep me in CC on replies.

I'd like to report that after commit

[045065d8a300a37218c548e9aa7becd581c6a0e8] [SCSI] fix qemu boot hang problem

the kernel is not usable on a machine with an IOMEGA Zip 100 ATAPI drive
as in:

Model=IOMEGA ZIP 100 ATAPI Floppy, FwRev=12.A, SerialNo=
Config={ SpinMotCtl Removeable nonMagnetic }
RawCHS=0/0/0, TrkSize=0, SectSize=0, ECCbytes=0
BuffType=unknown, BuffSize=unknown, MaxMultSect=0
(maybe): CurCHS=0/0/0, CurSects=0, LBA=yes, LBAsects=0
IORDY=on/off, tPIO={min:500,w/IORDY:180}
PIO modes: pio0 pio1 pio2 pio3
AdvancedPM=no

Symptoms include:

- Extremely high IOWAIT in the absence of load
- Kernel reporting hung processes
- Commands like blkid hanging
- Inability of the machine to shut down

Symptoms do not appear immediately, but after some time (anywhere
between a few minutes and /many hours/ after boot). The first symptom is
IOWAIT suddenly jumping high.

Because the symptoms take so long to appear, bisecting has been quite
painful, but I am now fairly sure that the first bad commit is the one
above.

Other pieces of hardware configuration include:

- ASRock N68S motherboard with AMD Phenom(tm) II X4 920 Processor and
NVIDIA MCP61 SATA/IDE Chipset
- IDE drive connected as slave on the IDE interface whose master is an
HL-DT-ST DVD-RAM GH22NP20 CDROM/DVD writer

The issue is odd because the commit seems merely to fix a trivial error in
a logic condition:

-       if (atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev))
+       if (!atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev))
                blk_delay_queue(q, SCSI_QUEUE_DELAY);

Hence, the commit may just end up making visible some other issue.
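
To make the effect of that one-character change explicit, here is a minimal user-space sketch (not kernel code) that only models the boolean condition quoted above. The struct and the function names are hypothetical stand-ins introduced for illustration; nothing here calls into the block layer or reproduces the hang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space model of the two values tested in the hunk. */
struct dev_state {
    int device_busy; /* stands in for atomic_read(&sdev->device_busy) */
    bool blocked;    /* stands in for scsi_device_blocked(sdev) */
};

/* Condition as written before commit 045065d. */
bool delays_queue_before(struct dev_state s)
{
    return s.device_busy && !s.blocked;
}

/* Condition as written after commit 045065d. */
bool delays_queue_after(struct dev_state s)
{
    return !s.device_busy && !s.blocked;
}

/* Print, for every input combination, whether each variant would reach
 * the blk_delay_queue() call. */
void print_truth_table(void)
{
    for (int busy = 0; busy <= 1; busy++) {
        for (int blocked = 0; blocked <= 1; blocked++) {
            struct dev_state s = { busy, blocked != 0 };
            bool before = delays_queue_before(s);
            bool after = delays_queue_after(s);

            /* A blocked device never reaches the call in either variant. */
            assert(!s.blocked || (!before && !after));
            printf("busy=%d blocked=%d before=%d after=%d\n",
                   busy, blocked, before, after);
        }
    }
}
```

Calling print_truth_table() from a main() shows that the two variants disagree only when the device is not blocked: before the commit the delayed queue run was requested while commands were outstanding, after it only when none are.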

Best,

Sergio



