Re: [linux-pm] [PATCH] Fix the outstanding issue with hangs on insert/removal of mmc cards

From: Alan Stern
Date: Fri Jun 11 2010 - 17:00:59 EST


On Fri, 11 Jun 2010, Maxim Levitsky wrote:

> Hi,
>
> After thinking a lot about how to properly fix the hangs caused by
> insert/removal of an mmc card during suspend/resume, and the default
> behavior of not trusting card persistence over suspend, I have finally
> come to the conclusion that changing del_gendisk is wrong.
>
> First of all there are 2 types of removal possible. First one happens
> when system detects that some device is gone. At that point there is
> really no point in syncing it.
>
> The other type of removal is controlled removal, usually on user
> request. Surely we must sync the device on such a request.
> This type of removal _shouldn't_ happen during a suspend/resume
> transaction. The only case where it does today is to protect against
> a careless user switching cards during suspend.

There are other pathological cases which can cause it to happen, but
they are pretty unlikely.

> I think that it is just wrong to sync the device at suspend/resume time.
> At that time userspace is frozen, and it's also not known which drivers
> are still running. They might even suspend asynchronously...
> So, such cases should be moved to a pm-notifier, which is what my patch
> does for mmc.
> Other users should be fixed as well.
>
> We can, in addition to that, add a temporary hack to del_gendisk with
> a loud WARN_ON though.
>
> If the card really is removed during suspend, then we can just introduce
> del_gendisk_dead or something like that which will be safe to call
> during suspend.
>
> I didn't do that; instead I made the card detection thread freezable,
> which eliminates the whole problem.
> If you remove the card during suspend, the system will notice at the
> end of resume.

I don't know why the mmc subsystem works differently from USB. In USB,
the equivalent of UNSAFE_RESUME is a per-device flag that can be
controlled via sysfs (see Documentation/usb/persist.txt). And it
almost always defaults to ON, i.e., the kernel assumes that if a device
is present before suspend and after resume it is the same device --
although some checking is done to try to verify this (the descriptors
have to remain the same). We started out being more cautious (the
default was OFF), but Linus complained about it being _too_ cautious.
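
To make the persist decision concrete, here is a rough userspace sketch
in C. It is not the usbcore code: struct dev_snapshot, its fields, and
same_device_after_resume() are names made up for illustration, and the
real check covers more of the descriptors than the few fields shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical, trimmed-down record of what a device reported about
 * itself. The field names echo the USB device descriptor, but this is
 * an illustration only, not a kernel structure.
 */
struct dev_snapshot {
    uint16_t id_vendor;
    uint16_t id_product;
    uint16_t bcd_device;
    char serial[32];
};

/*
 * Decide whether the device seen after resume may be treated as the
 * one that was there before suspend.
 *
 * persist_enabled models the per-device flag: when it is off, the
 * cautious answer is always "no", i.e. treat it as unplug + replug.
 * When it is on, the identity data must still match exactly.
 */
bool same_device_after_resume(const struct dev_snapshot *before,
                              const struct dev_snapshot *after,
                              bool persist_enabled)
{
    if (!persist_enabled)
        return false;

    return before->id_vendor == after->id_vendor &&
           before->id_product == after->id_product &&
           before->bcd_device == after->bcd_device &&
           strncmp(before->serial, after->serial,
                   sizeof(before->serial)) == 0;
}
```

A caller would take one snapshot before suspend and one after resume,
and only keep the existing device (and anything mounted on it) if the
function returns true.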

And like you have done here, in USB the kernel thread that handles
registering and unregistering devices is freezable, so things never get
added or removed at an unsafe time.
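
As a rough model of why a freezable thread avoids the problem, here is
a single-threaded userspace sketch in C. In the kernel a kthread opts in
with set_freezable() and parks itself in try_to_freeze(); the struct and
the worker_*() helpers below are hypothetical stand-ins for that, not
real kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a freezable hotplug/detection worker. */
struct hotplug_worker {
    bool frozen;   /* true while the system is suspending or resuming */
    int pending;   /* change events recorded but not yet acted on */
    int handled;   /* events that led to a register/unregister */
};

/* Suspend path: stop the worker from acting on events. */
void worker_freeze(struct hotplug_worker *w)
{
    w->frozen = true;
}

/* End of resume: let the worker run again. */
void worker_thaw(struct hotplug_worker *w)
{
    w->frozen = false;
}

/* Hardware noticed a change: only record it, never act here. */
void worker_notify(struct hotplug_worker *w)
{
    w->pending++;
}

/*
 * One pass of the worker loop. While frozen it does nothing, so no
 * device is added or removed at an unsafe time. Returns the number
 * of events processed in this pass.
 */
int worker_run_once(struct hotplug_worker *w)
{
    int n;

    if (w->frozen)
        return 0;

    n = w->pending;
    w->handled += n;
    w->pending = 0;
    return n;
}
```

A card swap during suspend shows up as worker_notify() while frozen; the
event stays pending and is only processed by the first worker_run_once()
after worker_thaw(), which matches "the system will notice at the end of
resume".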

Alan Stern
