Re: [ANNOUNCE] Minneapolis Cluster Summit, July 29-30

From: Daniel Phillips
Date: Sun Jul 11 2004 - 14:45:35 EST


On Saturday 10 July 2004 19:24, Steven Dake wrote:
> On Sat, 2004-07-10 at 13:57, Daniel Phillips wrote:
> > On Saturday 10 July 2004 13:59, Steven Dake wrote:
> > > overload conditions that have caused the kernel to run low on memory
> > > are a difficult problem, even for kernel components...
> > > ...I hope that helps at least answer that some r&d is underway to solve
> > > this particular overload problem in userspace.
> >
> > I'm certain there's a solution, but until it is demonstrated and proved,
> > any userspace cluster services must be regarded with narrow squinty
> > eyes.
>
> I agree that a solution must be demonstrated and proved.
>
> There is another option, which I regularly recommend to anyone that
> must deal with memory overload conditions. Don't size the applications
> in such a way as to ever cause memory overload.

That, and "just add more memory" are the two common mistakes people make when
thinking about this problem. The kernel _normally_ runs near the low-memory
barrier, on the theory that caching as much as possible is a good thing.
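
As a rough illustration of the caching point above, the sketch below compares
free memory with page cache on a Linux machine. It assumes /proc/meminfo is
available with its usual "Name: value kB" layout; the helper names
parse_meminfo and free_vs_cached are made up for this example, and the numbers
it prints will vary widely with workload and uptime.

```python
import os


def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to their numeric values."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only well-formed "Name: <number> [kB]" lines.
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def free_vs_cached(fields):
    """Return (free, cached) as fractions of MemTotal."""
    total = fields["MemTotal"]
    return fields["MemFree"] / total, fields["Cached"] / total


if __name__ == "__main__":
    path = "/proc/meminfo"  # Linux-specific; skipped on other systems
    if os.path.exists(path):
        with open(path) as f:
            free, cached = free_vs_cached(parse_meminfo(f.read()))
        print(f"free: {free:.1%}  page cache: {cached:.1%}")
```

On a machine that has been doing file I/O for a while, the free fraction is
often small compared with the cached fraction, which is why sizing
applications against "free" memory says little about how close the kernel is
to its reserves.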

Unless you can prove that your userspace approach never deadlocks, the other
questions don't even move the needle. I am sure that one day somebody, maybe
you, will demonstrate a userspace approach that is provably correct. Until
then, if you want your cluster to stay up and fail over properly, there's
only one game in town.

We need to worry about ensuring that no API _depends_ on the cluster manager
being in-kernel, and we also need to seek out and excise any parts that could
possibly be moved out to user space without enabling the deadlock or grossly
messing up the kernel code.

> > > I'd invite you, or others interested in these sorts of services, to
> > > contribute that code, if interested.
> >
> > Humble suggestion: try grabbing the Red Hat (Sistina) DLM code and see
> > if you can hack it to do what you want. Just write a kernel module
> > that exports the DLM interface to userspace in the desired form.
> >
> > http://sources.redhat.com/cluster/dlm/
>
> I would rather avoid non-mainline kernel dependencies at this time as it
> makes adoption difficult until kernel patches are merged into upstream
> code. Who wants to patch their kernel to try out some APIs?

Everybody working on clusters. It's a fact of life that you have to apply
patches to run cluster filesystems right now. Production will be a different
story, but (except for the stable GFS code on 2.4) nobody is close to that.

> I am doubtful these sorts of kernel patches will be merged without a strong
> argument of why it absolutely must be implemented in the kernel vs all
> of the counter arguments against a kernel implementation.

True. Do you agree that the PF_MEMALLOC argument is a strong one?

> There is one more advantage to group messaging and distributed locking
> implemented within the kernel, that I hadn't originally considered; it
> sure is sexy.

I don't think it's sexy, I think it's ugly, to tell the truth. I am actively
researching how to move the slow-path cluster infrastructure out of kernel,
and I would be pleased to work together with anyone else who is interested in
this nasty problem.

Regards,

Daniel