Re: [PATCH v4 0/1] PM / Freezer: Skip zombie/dead processes to reduce

From: Zihuan Zhang
Date: Mon Jul 21 2025 - 06:42:04 EST



On 2025/7/17 17:50, Rafael J. Wysocki wrote:
On Thu, Jul 17, 2025 at 3:02 AM Zihuan Zhang <zhangzihuan@xxxxxxxxxx> wrote:
Hi Rafael,

On 2025/7/16 20:26, Rafael J. Wysocki wrote:
Hi,

On Wed, Jul 16, 2025 at 8:26 AM Zihuan Zhang <zhangzihuan@xxxxxxxxxx> wrote:
Hi all,

This patch series improves the performance of the process freezer by
skipping zombie tasks during freezing.

In the suspend and hibernation paths, the freezer traverses all tasks
and attempts to freeze them. However, zombie tasks (EXIT_ZOMBIE with
PF_EXITING) are already dead — they are not schedulable and cannot enter
the refrigerator. Attempting to freeze such tasks is redundant and
unnecessarily increases freezing time.

In particular, on systems under fork storm conditions (e.g., many
short-lived processes quickly becoming zombies), the number of zombie tasks
can spike into the thousands or more. We observed that this causes the
freezer loop to waste significant time processing tasks that are guaranteed
to not need freezing.
I think that the discussion with Peter regarding this has not been concluded.

I thought that there was an alternative patch proposed during that
discussion. If I'm not mistaken about this, what happened to that
patch?

Thanks!

Currently, the general consensus from the discussion is that skipping
zombie or dead tasks can help reduce locking overhead during freezing.
Peter doesn't seem to be convinced that this is the case.


Yeah.

The remaining question is how best to implement that.

Peter suggested skipping all tasks with PF_NOFREEZE, which would make
the logic more general and cover all cases. However, as Oleg pointed
out, the current implementation based on PF_NOFREEZE might be problematic.

My current thought is that exit_state already reliably covers all
exiting user processes, and it’s a good fit for skipping user-space
tasks. For the kernel side, we may safely skip a few kernel threads like
kthreadd that set PF_NOFREEZE and never change it — we can consider
refining this further in the future.
There is the counter argument of special-casing of p->exit_state and
the relatively weak justification for it.

You have created a synthetic workload where it matters, but how likely
is it to be the case in practice?


Our initial thought was that the freezer should primarily focus on tasks that can be frozen. If a task is not freezable and its state will not change (such as kernel threads that have PF_NOFREEZE set permanently), it should be safe to skip it during the iteration. This helps to reduce unnecessary overhead when handling a large number of such tasks.
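
To make the idea concrete, here is a rough userspace sketch of the skip logic. It is not kernel code and not the patch itself: struct task, can_skip_task() and freeze_pass() are hypothetical names for illustration, the is_kthread field stands in for a kernel-thread check, and the flag values are only meant to mirror the kernel's names. It also assumes the simplification described above, namely that a kernel thread with PF_NOFREEZE never clears it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified model; the names mirror the kernel's, the values are illustrative. */
#define EXIT_DEAD    0x10u
#define EXIT_ZOMBIE  0x20u
#define PF_NOFREEZE  0x8000u

/* Hypothetical stand-in for the few task fields the check looks at. */
struct task {
	unsigned int exit_state; /* non-zero once the task is a zombie or dead */
	unsigned int flags;      /* per-task flags such as PF_NOFREEZE */
	bool is_kthread;         /* assumption: marks a kernel thread */
	bool frozen;             /* set when the freezer handled this task */
};

/* Return true if the freezer loop does not need to look at this task. */
static bool can_skip_task(const struct task *p)
{
	/* Zombie/dead tasks are never scheduled again, so they cannot freeze. */
	if (p->exit_state)
		return true;

	/* Assumption: a kernel thread with PF_NOFREEZE keeps it for good. */
	if (p->is_kthread && (p->flags & PF_NOFREEZE))
		return true;

	return false;
}

/* Walk all tasks once and return how many freeze attempts were made. */
static size_t freeze_pass(struct task *tasks, size_t n)
{
	size_t attempts = 0;

	assert(tasks != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (can_skip_task(&tasks[i]))
			continue;          /* no freeze attempt for this task */
		tasks[i].frozen = true;    /* stands in for the real freeze work */
		attempts++;
	}
	return attempts;
}
```

In this model, a task list dominated by zombies costs one cheap check per zombie instead of a full freeze attempt, which is the overhead we are trying to avoid. Whether the same check is safe against concurrent flag changes in the real freezer is exactly the open question from the discussion.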

We do not insist that this is the only correct way to implement the optimization — if there’s a better approach that is equally safe and more general, we are happy to adopt it.

In practice, the improvement becomes noticeable only when a large number of such non-freezable tasks (for example, zombies) are present. So the benefit is scenario-dependent, and we agree that real-world relevance should be considered carefully.

Thanks again for the discussion.