Re: Linux scheduler capabilities for batch jobs.

From: Mike Galbraith
Date: Tue Jun 02 2009 - 00:58:01 EST


On Mon, 2009-06-01 at 09:41 -0400, J Louis wrote:

> My problem is analogous to a parallel make. Say I have an 8 CPU
> machine, and I run "make -j8". If the total memory of the 8 jobs
> throws the machine into swap, it begins to thrash and runtime is
> awful.

Thrashing is more of a VM/IO scheduling concern.

> I believe this is aggravated by the scheduler trying to be
> fair, and keeping all 8 processes running.

Yup, fair CPU distribution is the process scheduler's mission, and that
allows tasks to compete for other resources.

> If it was possible to tell
> the scheduler that it was OK not to be fair when scheduling these
> processes, I think the total runtime could be reduced if it put some
> of the processes to sleep while others completed.

The scheduler doesn't know that any given task _ever_ completes.

> Is there a way to
> tell the scheduler it is allowed to do this? Should there be?

No, and I don't think it's feasible for existing classes. You could
invent a new scheduling class, but I think you'd need to invent quite a
bit of infrastructure in the VM to make it work well.

OTOH, the process scheduler doesn't, and shouldn't, make IO resource
decisions; we have IO schedulers to manage who gets what IO bandwidth
when. The same should apply to VM resources. Seems to me what you
really want is a VM scheduler.
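
Absent such a thing, the decision can be approximated from userspace by
capping the job count before launching anything. What follows is only a
rough sketch of that idea, not anything the kernel provides: the helper
names (parse_meminfo, job_slots, suggested_jobs) and the per-job memory
estimate are hypothetical, and it assumes /proc/meminfo reports a
MemAvailable field, falling back to a cruder MemFree + Cached guess on
kernels that don't have it.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def job_slots(available_kb, per_job_kb, cpus):
    """Number of jobs that fit in memory, capped at the CPU count, never below 1."""
    if per_job_kb <= 0:
        raise ValueError("per_job_kb must be positive")
    return max(1, min(cpus, available_kb // per_job_kb))


def suggested_jobs(per_job_kb, meminfo_path="/proc/meminfo"):
    """Hypothetical helper: pick a -j value from the memory currently available."""
    with open(meminfo_path) as f:
        info = parse_meminfo(f.read())
    # Assumption: MemAvailable exists; otherwise use a rough MemFree + Cached guess.
    fallback = info.get("MemFree", 0) + info.get("Cached", 0)
    available_kb = info.get("MemAvailable", fallback)
    return job_slots(available_kb, per_job_kb, os.cpu_count() or 1)
```

The result of suggested_jobs() could be handed to make as its -j value.
Note this is a one-shot estimate taken at launch time; it does not react
if the jobs grow later, which is exactly the part that would need real
VM infrastructure.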

-Mike
