Re: Getting rid of freezer for suspend [was Re: [fuse-devel] [PATCH]fuse: make fuse daemon frozen along with kernel threads]

From: Miklos Szeredi
Date: Mon Feb 11 2013 - 09:00:05 EST


On Mon, Feb 11, 2013 at 1:08 PM, Rafael J. Wysocki <rjw@xxxxxxx> wrote:
> On Monday, February 11, 2013 11:11:40 AM Miklos Szeredi wrote:
>> On Mon, Feb 11, 2013 at 12:31 AM, Rafael J. Wysocki <rjw@xxxxxxx> wrote:
>> > On Sunday, February 10, 2013 07:55:05 PM Pavel Machek wrote:
>>
>> >> Well, from freezer you need:
>> >>
>> >> 1) user process frozen.
>> >>
>> >> 2) essential locks _not_ held so that block devices are still functional.
>> >>
>> >> > > > mmap... what is problem with mmap? For suspend, memory is powered, so
>> >> > > > you can permit people changing it.
>> >> > >
>> >> > > Suppose mmap is used to make the registers of some device available to user
>> >> > > space. Yes, that can happen.
>> >>
>> >> "Don't do it, then". Yes, can happen, but hopefully is not too common
>> >> these days. [And... freezer doing 1) but not 2) would be enough to
>> >> handle that. Freezer doing 1) but not 2) would also be simpler...]
>> >
>> > Again, I'm not sure what you mean.
>> >
>> > Are you trying to say that it would be OK to freeze user space tasks in
>> > the D state?
>>
>> I think that's what Pavel is saying. Processes in D state sleeping
>> on non-device mutexes _are_ actually OK to freeze. And that would
>> nicely solve the fuse freeze problem.
>
> That's potentially deadlock-prone, because a task waiting for mutex X may
> very well be holding mutex Y, so if there's another task waiting for mutex Y,
> it needs to be frozen at the same time.
>
>> The only little detail is how do we implement that...
>
> The only way I can see would be to hack the mutex code so that
> try_to_freeze() is called for user space tasks after the
> sched_preempt_enable_no_resched() and before schedule().
>
> That shouldn't be a big deal performance-wise, because we are in the slow
> path anyway then. I'm not sure if Peter Z will like it, though.
>
> Moreover, a task waiting for a mutex may be holding a semaphore or be
> participating in some other mutual-exclusion mechanism, so we'd need to
> address them all. Plus, as noted by Pavel, freezing those things would make
> it difficult to save hibernation images to us.

Well, as a first step I could cook up a patch that adds a flag to the
mutex indicating that it's freezable. Fuse would mark its mutexes
(and the mutexes that VFS uses on its behalf) as freezable. That way
we don't interfere with hibernation, unless hibernation itself uses
fuse, but that's a very special case.
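
To make the idea concrete, here is a rough userspace model of the
check such a flag would enable. This is only a sketch, not kernel code:
struct model_mutex, struct model_task, task_may_freeze() and
count_unfreezable() are all hypothetical names made up for
illustration, and the model assumes that a task which is not blocked
on any mutex can always reach a freeze point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a mutex carrying the proposed flag. */
struct model_mutex {
	const char *name;
	bool freezable;		/* set by fuse (and VFS on its behalf) */
};

/* Hypothetical model of a task; waiting_on is NULL if not blocked. */
struct model_task {
	const char *name;
	const struct model_mutex *waiting_on;
};

/*
 * A task may be frozen if it is not blocked on a mutex (assumption of
 * this model), or if the mutex it sleeps on is marked freezable.
 */
static bool task_may_freeze(const struct model_task *t)
{
	assert(t != NULL);
	if (t->waiting_on == NULL)
		return true;
	return t->waiting_on->freezable;
}

/* Count the tasks that would still make the freezer wait. */
static size_t count_unfreezable(const struct model_task *tasks, size_t n)
{
	size_t blocked = 0;

	for (size_t i = 0; i < n; i++)
		if (!task_may_freeze(&tasks[i]))
			blocked++;
	return blocked;
}
```

In this model a task sleeping on a mutex that was not marked keeps
today's behaviour, which is why only fuse-related sleeps would change.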

>
> What about having a "freeze me after all of my children" flag that will be
> inherited from parents? Would that help the fuse case?


With kernel filesystems there's a clear distinction between tasks that
may originate filesystem requests (userspace processes) and those that
serve these requests (kernel threads). So it's possible to freeze the
originators first and the servers afterwards.

With fuse there's no such difference. Even if it were known which
processes are servers and which are originators (it is not known in
the general case), there would still be the problem that some
processes are servers AND originators at the same time.
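
The ordering problem can be shown with a toy model. The sketch below
assumes, purely for illustration, that we somehow know which task
sends requests to which (struct freeze_graph and freeze_order() are
hypothetical names, not anything in the kernel). It tries to freeze
each task only after every task that originates requests to it has
been frozen; as soon as two tasks serve each other, no such order
exists and the loop gets stuck.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TASKS 8

/* Hypothetical model: sends[i][j] means task i originates requests
 * that task j serves. */
struct freeze_graph {
	size_t n;
	bool sends[MAX_TASKS][MAX_TASKS];
};

/*
 * Freeze originators before the servers they depend on. Stores the
 * freeze order in order[] and returns how many tasks were placed.
 * A return value smaller than g->n means some tasks are both server
 * and originator in a cycle, so no valid order exists for them.
 */
static size_t freeze_order(const struct freeze_graph *g,
			   size_t order[MAX_TASKS])
{
	bool frozen[MAX_TASKS] = { false };
	size_t placed = 0;
	bool progress = true;

	assert(g != NULL && g->n <= MAX_TASKS);

	while (progress) {
		progress = false;
		for (size_t t = 0; t < g->n; t++) {
			bool live_originator = false;

			if (frozen[t])
				continue;
			/* Is anything still running that may send to t? */
			for (size_t o = 0; o < g->n; o++)
				if (!frozen[o] && g->sends[o][t])
					live_originator = true;
			if (!live_originator) {
				frozen[t] = true;
				order[placed++] = t;
				progress = true;
			}
		}
	}
	return placed;
}
```

With a plain chain (application -> server A -> server B) all three
tasks get an order. Add a request path from B back to A and only the
application can be frozen safely; A and B are each waiting for the
other to go first.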

Thanks,
Miklos