Re: [PATCH 1/2] oom: do not live lock on frozen tasks

From: Michal Hocko
Date: Mon Sep 26 2011 - 07:06:05 EST


On Mon 26-09-11 19:58:50, Rusty Russell wrote:
> On Mon, 26 Sep 2011 10:28:37 +0200, Michal Hocko <mhocko@xxxxxxx> wrote:
> > On Fri 26-08-11 11:13:40, David Rientjes wrote:
> > > I'd love to be able to do a thaw on a PF_FROZEN task in the oom killer
> > > followed by a SIGKILL if that task is selected for oom kill without an
> > > heuristic change. Not sure if that's possible, so we'll wait for Rafael
> > > to chime in.
> >
> > We have discussed that with Rafael and it should be safe to do that. See
> > the patch below.
> > The only place I am not entirely sure about is run_guest
> > (drivers/lguest/core.c). It seems that the code is able to cope with
> > signals but it also calls lguest_arch_run_guest after try_to_freeze.
>
> Yes; if you want to kill things in the refrigerator(), then will a
>
> 		if (cpu->lg->dead || task_is_dead(current))
> 			break;
>
> Work?

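The task is not dead yet at that point: the oom killer only thaws it and
sends SIGKILL, so we should rather check for pending signals.
Can we just move try_to_freeze() up before the pending-signals check?
That way a task thawed by the oom killer returns -ERESTARTSYS right away
instead of going on to call lguest_arch_run_guest.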
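
To illustrate why the ordering matters, here is a small user-space model
of one pass through the loop. It is only a sketch: struct model_task and
all model_* names are made up for this example, the "thaw" is assumed to
come from the oom killer (which also queues SIGKILL), and everything else
the real loop does (hypercalls, interrupts, the dead check) is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* ERESTARTSYS is kernel-internal; define a stand-in for this model. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/* Hypothetical stand-in for the bits of task state that matter here. */
struct model_task {
	bool frozen;		/* task sits in the refrigerator */
	bool sigkill_pending;	/* queued by the modelled oom killer */
	int guest_runs;		/* how often the "guest" was entered */
};

/*
 * Model of try_to_freeze(): a frozen task sleeps until it is thawed.
 * Assumption: the only thawer is the oom killer, which sends SIGKILL too.
 */
static void model_try_to_freeze(struct model_task *t)
{
	if (t->frozen) {
		t->frozen = false;
		t->sigkill_pending = true;
	}
}

static bool model_signal_pending(const struct model_task *t)
{
	return t->sigkill_pending;
}

/* Old order: signal check first, freezer afterwards, then the guest. */
static int model_iteration_old(struct model_task *t)
{
	assert(t != NULL);
	if (model_signal_pending(t))
		return -ERESTARTSYS;
	model_try_to_freeze(t);
	t->guest_runs++;	/* stands in for lguest_arch_run_guest() */
	return 0;
}

/* New order: freezer first, so a signal sent on thaw is seen at once. */
static int model_iteration_new(struct model_task *t)
{
	assert(t != NULL);
	model_try_to_freeze(t);
	if (model_signal_pending(t))
		return -ERESTARTSYS;
	t->guest_runs++;	/* stands in for lguest_arch_run_guest() */
	return 0;
}
```

With a task that starts out frozen, model_iteration_old() returns 0 and
enters the guest once before the next pass notices the signal, while
model_iteration_new() returns -ERESTARTSYS without entering the guest.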

diff --git a/drivers/lguest/core.c b/drivers/lguest/core.c
index 2535933..a513509 100644
--- a/drivers/lguest/core.c
+++ b/drivers/lguest/core.c
@@ -232,6 +232,12 @@ int run_guest(struct lg_cpu *cpu, unsigned long __user *user)
 			}
 		}

+		/*
+		 * All long-lived kernel loops need to check with this horrible
+		 * thing called the freezer. If the Host is trying to suspend,
+		 * it stops us.
+		 */
+		try_to_freeze();
 		/* Check for signals */
 		if (signal_pending(current))
 			return -ERESTARTSYS;
@@ -246,13 +252,6 @@ int run_guest(struct lg_cpu *cpu, unsigned long __user *user)
 			try_deliver_interrupt(cpu, irq, more);

 		/*
-		 * All long-lived kernel loops need to check with this horrible
-		 * thing called the freezer. If the Host is trying to suspend,
-		 * it stops us.
-		 */
-		try_to_freeze();
-
-		/*
 		 * Just make absolutely sure the Guest is still alive. One of
 		 * those hypercalls could have been fatal, for example.
 		 */

> That break means we return to the read() syscall pretty much
> immediately.
>
> Thanks for the CC,
> Rusty.

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic