Re: [PATCH] libata: Forcing PIO0 mode on reset must not freeze system

From: Holger Macht
Date: Mon Feb 11 2008 - 05:04:21 EST


On Mon 11. Feb - 11:37:15, Tejun Heo wrote:
> Hello,
>
> Holger Macht wrote:
> > Calling ap->ops->set_piomode(ap, dev) on a device/controller which got
> > already removed, locks the system hard. Reproducibly on an X60 attached to
> > a dock station containing a cdrom device with doing
> >
> > $ echo 1 > /sys/devices/platform/dock.0/undock && echo 123 > /dev/sr0
> >
> > This calls ata_eh_reset(...) which in turn tries to force PIO mode 0. But
> > the device is already gone.
> >
> > Bisecting revealed the following commit as culprit:
> >
> > commit cdeab1140799f09c5f728a5ff85e0bdfa5679cd2
> > Author: Tejun Heo <htejun@xxxxxxxxx>
> > Date: Mon Oct 29 16:41:09 2007 +0900
> >
> > libata: relocate forcing PIO0 on reset
> >
> > Forcing PIO0 on reset was done inside ata_bus_softreset(), which is a
> > bit out of place as it should be applied to all resets - hard, soft
> > and implementation which don't use ata_bus_softreset(). Relocate it
> > such that...
> >
> > * For new EH, it's done in ata_eh_reset() before calling prereset.
> >
> > * For old EH, it's done before calling ap->ops->phy_reset() in
> > ata_bus_probe().
> >
> > This makes PIO0 forced after all resets. Another difference is that
> > reset itself is done after PIO0 is forced.
> >
> > Signed-off-by: Tejun Heo <htejun@xxxxxxxxx>
> > Acked-by: Alan Cox <alan@xxxxxxxxxx>
> > Signed-off-by: Jeff Garzik <jeff@xxxxxxxxxx>
> >
> >
> > ATTENTION! The following patch solves the problem on my system, but please
> > be aware that I don't really know what I'm doing because I don't have the
> > big picture. There's surely a better way to check if the device/controller
> > is still functional than calling ata_link_{online,offline}.
>
> In the above example, even the reset sequence itself can cause hang if
> the hardware is implemented slightly differently. The reason why
> set_piomode() locks up but reset sequence doesn't is simple dumb luck.
> I think the proper fix is to tell libata to detach the cdrom before
> undocking.

Wouldn't the proper fix be to call ata_acpi_handle_hotplug _somewhere_?
(which is currently called nowhere AFAICS)

Anyway, kernel hackers keep telling me that the kernel should just do the
right thing. Shouldn't it be impossible for userspace to freeze the system?

It's completely ok for me to handle this from userspace, if that's the
position of the libata developers.

In this case, we should change the dock driver to default to
immediate_undock=false, because otherwise the risk of freezing the
system is far too high.
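
For illustration only, a minimal sketch of what "detach the cdrom before
undocking" could look like when driven from userspace. The helper name
safe_undock is hypothetical, and the delete path in the comment assumes the
drive exposes the usual SCSI sysfs "delete" attribute; the undock path is the
one from the reproducer above. Whether this ordering is enough to avoid the
hang on every controller is exactly the open question in this thread.

```shell
#!/bin/sh
# Hypothetical helper: detach a device through sysfs, then request the undock.
# Both paths are parameters because the right ones depend on the machine.
safe_undock() {
    delete_attr=$1   # e.g. /sys/block/sr0/device/delete (assumed SCSI sysfs attribute)
    undock_attr=$2   # e.g. /sys/devices/platform/dock.0/undock (from the reproducer)

    # Ask the SCSI layer to remove the device first; give up if that fails,
    # so we never undock while the device is still attached.
    echo 1 > "$delete_attr" || return 1

    # Only then request the undock.
    echo 1 > "$undock_attr"
}
```

It would be called as

  safe_undock /sys/block/sr0/device/delete /sys/devices/platform/dock.0/undock

and returns non-zero without touching the dock if the detach step fails.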

Regards,
Holger