Re: Please try knfsd 980922

Tad Kollar (tkollar@anduin.lerc.nasa.gov)
Thu, 24 Sep 1998 15:09:39 -0400 (EDT)


> > First let me say that the exports improvements are extremely welcome....
> >
> > Unfortunately I'm seeing stability problems using a large number of client
> > systems with knfs-980922. This is a cluster with 40 diskless machines... after
> > about 17 of them boot, the nfs server stops responding. Each client needs
> > two mounts. Originally I was running with 5 nfsds which I increased to 16
> > but it didn't help.
> >
> > The server kernel is 2.1.122ac2 + knfsd-980922 (applied both patches). I
> > have nfsd compiled into the kernel rather than as a module. Clients are
> > running 2.0.35 and 2.1.117. I didn't see the problem with knfs-0.4.22 +
> > 2.1.11x or knfs-980915 + 2.1.122 (didn't try 980920).
>
> Please try 2.1.122 with knfsd-980922. There are no new kernel
> patches from knfs-980915 to knfsd-980922. If it is the kernel NFS
> server, not mountd, that stops responding, I don't think it is
> knfsd-980922 who causes it.

I tried it with plain 2.1.122 + 980922 patches but it still didn't work. So
I began to doubt my claim that 980915 had worked, and tried 2.1.122 +
980915... it breaks in the same way. When I tried it before, I just rebooted
the server without rebooting the clients, which is why I didn't see
the problem.

The point at which it freezes up is when the 20th or so client requests a
mount. However, that event never gets entered into the server's logs. To
fix it, I kill the daemons, turn off the newly booted clients, and restart
the daemons. If I don't turn off those machines first, the server won't
respond when it comes back up. Sometimes I don't turn off enough of them,
and the server comes up for a short time but stops responding a few minutes
later.

I'll try 2.1.122 + knfs 0.4.22 next to see if maybe it's actually a 2.1.122
networking thing...

-Tad

-- 
-------------------------------------------------------------------------
Thaddeus J. Kollar                             | 
Sterling Software, Scientific Systems Division | "Evildoers, eat my
NASA Lewis Research Center                     |  justice!!!" - The Tick
Tel: 216-433-5105  Fax: 216-433-8000           |
-------------------------------------------------------------------------
