Re: Regression in suspend to ram in 2.6.31-rc kernels

From: Rafael J. Wysocki
Date: Fri Sep 11 2009 - 18:03:43 EST


On Friday 11 September 2009, OGAWA Hirofumi wrote:
> Pavel Machek <pavel@xxxxxx> writes:
>
> > On Wed 2009-09-09 22:21:56, OGAWA Hirofumi wrote:
> >> Pavel Machek <pavel@xxxxxx> writes:
> >>
> >> >> It seems
> >> >>
> >> >> 1) sync() (probably "sync" command)
> >> >> 2) sync as part of suspend sequence
> >> >> 3) sync_filesystem() by mmc remove event
> >> >>
> >> >> I guess the root-cause of the problem would be 3). However, it would not
> >> >> be easy to fix, at least, we would need to think about what we want to
> >> >> do for it. So, to workaround it for now, I've made this patch.
> >> >
> >> > MMC driver trying to synchronize filesystems looks like ugly layering
> >> > violation to me. Why are we doing that?
> >>
> >> There is no _layering violation_ here. IIRC, mmc just tells card removed
> >> event to another layer (on some points of view, to tell event can be
> >> wrong though). The partition (block) layer does it by event.
> >
> > So what is the problem? Emulating sync when card is already removed
> > seems little ... interesting?
>
> Um..., sorry, I'm not sure what you are talking about. Of course, the
> problem here is that the system freezes on suspend.
>
> Or are you asking my guess of the cause, or something? If so, although
> I'm not reading all emails on this thread, from Zdenek's backtrace, the
> sequence would be
>
> 1) suspend mmc
> 2) mmc generates card removed event

Which shouldn't happen.

> 3) prepare to invalidate blockdev
> 4) sync fs on invalidating blockdev
> 5) flush buffers on invalidating blockdev (partitions)
> 6) delete blockdev (partitions)
>
> or like the above. And I can guess some possible issues/root-cause we
> have to handle from it.
>
> a) is a card removed event from mmc on suspend the right design?

Not with the current suspend/resume design.

> b) the card can be changed/removed before the system is resumed; can
> mmc detect/handle that properly?
> c) is flushing buffers on a _deleted_ device the right design?
>
> and I suspect there are more issues in detail and resume process though.

Well, first, there's a limit to how far file systems can ignore the
suspend/resume process and we're hitting it right now.

Second, we need a general solution for handling file systems over
suspend/resume _and_ possibly removable devices that can be gone while
suspended. We don't have any solution like this right now, and I have little
experience with file systems, so I'm not going to take care of this in the
foreseeable future. If someone else can, that's going to be appreciated very
much.
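
For readers following the thread, the sequence guessed at above can be
illustrated with a small user-space toy model. This is only a sketch, not
kernel code: every name in it (ToyCard, ToyBlockLayer, suspend,
report_removal_on_suspend) is hypothetical, and it assumes, purely for
illustration, that the sync step cannot complete once the card has been
suspended. The thread itself does not establish the exact cause of the freeze.

```python
# Toy user-space model of the suspected sequence; none of these names
# are kernel APIs, they are invented for illustration only.

class ToyCard:
    def __init__(self):
        self.powered = True
        self.dirty_buffers = 3  # pretend some data is still unwritten


class ToyBlockLayer:
    def __init__(self, card):
        self.card = card
        self.log = []

    def sync_fs(self):
        # Assumption: writing back dirty data needs a powered device.
        if not self.card.powered:
            self.log.append("sync fs: device unavailable, I/O cannot complete")
            return False
        self.card.dirty_buffers = 0
        self.log.append("sync fs: ok")
        return True

    def on_card_removed(self):
        # Steps 3) to 6) of the sequence quoted above.
        self.log.append("prepare to invalidate blockdev")
        ok = self.sync_fs()
        self.log.append("flush buffers on partitions")
        self.log.append("delete blockdev (partitions)")
        return ok


def suspend(card, block, report_removal_on_suspend):
    # Step 1): suspend the (toy) mmc host.
    block.log.append("suspend mmc")
    card.powered = False
    if report_removal_on_suspend:
        # Step 2): the event that should not happen during suspend.
        block.log.append("mmc generates card removed event")
        return block.on_card_removed()
    return True


if __name__ == "__main__":
    card = ToyCard()
    block = ToyBlockLayer(card)
    suspend(card, block, report_removal_on_suspend=True)
    print("\n".join(block.log))
```

In this model, suppressing the removal event during suspend skips the sync on
the unavailable device entirely, which is the behaviour argued for in the
replies to 2) and a) above. It says nothing about point b), a card that really
is swapped while the system is suspended.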

Thanks,
Rafael