Re: [PATCH 1/1] cpusets/sched_domain reconciliation

From: Andrew Morton
Date: Thu Mar 22 2007 - 18:32:59 EST


On Thu, 22 Mar 2007 16:39:45 -0600
Cliff Wickman <cpw@xxxxxxx> wrote:

> Hello Andrew,
>
> On Thu, Mar 22, 2007 at 02:21:52PM -0700, Andrew Morton wrote:
> > On Tue, 20 Mar 2007 13:14:35 -0600
> > cpw@xxxxxxx (Cliff Wickman) wrote:
> >
> > > This patch reconciles cpusets and sched_domains that get out of sync
> > > due to disabling and re-enabling of cpu's.
> >
> > I get three-out-of-three rejects in cpuset.c. I could fix them, but I
> > wouldn't be very confident that the result works at runtime. 2.6.20-rc6 was
> > a long time ago - please, always raise patches against the latest mainline
> > kernel (the daily git snapshot suffices).
>
> Will do.
>
> > Recursion is a big no-no in kernel. Is there any way in which it can be
> > avoided? Is Dinakar's implementation also recursive?
>
> I was a little reluctant to use recursion, but this use parallels another,
> existing such use in cpuset.c The depth of the recursion is only the depth of
> the cpuset hierarchy, which is set up by an administrator, and which is
> logically limited by the number of cpus in the system. e.g. it would be
> hard to even deliberately organize 16 cpus into a hierarchy greater
> than 16 layers deep, even if you wanted cpusets of single cpus.
> We've not run into such a problem on systems of hundreds of cpus.
> I would think it's safe. What do you think?

It isn't very nice. It probably won't crash, but it _can_ crash and when
it does, the results will be very obscure and people who will be affected
by the crash will be badly $impacted$ by it.

Perhaps as a middle-ground thing we could simply ban the creation of
cpusets hierarchies which are more than <mumble> layers deep.

Or, worse, we could take a peek at the depth of the tree before starting
the recursion, then just fail out if it exceeds <mumble>

Or, worse still, we could allow the recursion to proceed down <mumble>
levels, then refuse to apply the reconciliation any deeper.

Best would be to avoid the recursion ;) lib/radix-tree.c has a similar
problem, and has a possibly-conceptually-applicable solution. It has a
fixed maximum depth so it uses a local array, but it could use kmalloc()
for the radix_tree_path.

Is there any sane way in which we can perform the recursion in userspace?
Get the application to walk the hierarchy and do the fixups at each node?
Probably not...

> Dinakar's solution is not written yet, as far as I know. I'll copy him
> for his status.

Good idea.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/