Re: Serious locking bug in Linux NFS

Ben 'The Con Man' Kahn (xkahn@cybersites.com)
Tue, 20 Oct 1998 16:26:40 -0400 (EDT)


Oops. I sent this out, but no one got it. :^)

-Ben

On Fri, 16 Oct 1998, H.J. Lu wrote:

> > > Here is a kernel patch. Please make sure lockd is running on your client
> > > machine. You should start "portmap" before mounting NFS.
> >
> > Okay. So lockd isn't running on the clients. (which are all 2.0
> > kernels for the Linux boxen) Am I to understand that the clients need to
> > be running 2.1 kernels?? Is there a patch? This is fairly important.
>
> The lockd is a kernel process. You need to run a 2.1 kernel
> to get it.

Alright. I haven't applied the patches yet. (It's been one of
those weeks... I don't want to talk about it.) Okay. So I have a Linux
server running statd, knfsd, mountd, and lockd. I have an SGI running
IRIX 6.5 (64 bit) as the nfs client. Hey! Stop that! No groaning! :^)
This SGI now has lockd!

Okay. So when I run my test program (see earlier emails if you
need a refresher on my test program) I get this output:

Opening 'testlock' for writing.
Success!

Trying to get a read lock for 'testlock'
Unable to get lock for testlock
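
(For anyone without the earlier mails handy: the test boils down to
roughly the following. This is a minimal sketch, not the exact program;
the helper name try_read_lock and the O_RDWR open mode are assumptions
for illustration. O_RDWR is used because fcntl() only grants a read lock
on a descriptor that is open for reading.)

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: open the file, then ask for a non-blocking
 * read lock over the whole file. Returns 0 if the lock was granted,
 * -1 otherwise.
 */
static int try_read_lock(const char *path)
{
    struct flock fl;
    int fd, ret;

    printf("Opening '%s' for writing.\n", path);
    fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0) {
        perror("open");
        return -1;
    }
    printf("Success!\n\n");

    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_RDLCK;     /* shared (read) lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;            /* 0 means "to end of file" */

    printf("Trying to get a read lock for '%s'\n", path);
    ret = fcntl(fd, F_SETLK, &fl);
    if (ret < 0)
        printf("Unable to get lock for %s (%s)\n", path, strerror(errno));
    else
        printf("Got the lock\n");

    close(fd);               /* closing the descriptor drops the lock */
    return ret < 0 ? -1 : 0;
}
```

On a local filesystem this should get the lock; over NFS the F_SETLK
request is what ends up going to lockd on the server.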

Hmm... That's what it said about locking BEFORE I got lockd....
But /var/log/messages has more information:

Oct 19 18:41:22 nero kernel: lockd: bad cookie size 8 (should be 4)
Oct 19 18:41:22 nero kernel: svc: failed to decode args
Oct 19 18:41:22 nero kernel: lockd: bad cookie size 8 (should be 4)
Oct 19 18:41:22 nero kernel: svc: failed to decode args

It appears that 64-bit IRIX sends 8-byte (64-bit) cookies, while
lockd only accepts 4-byte ones. (IRIX does 64-bit I/O.) According to the
IRIX manual page, it might be safe to ignore the top 32 bits:

32bitclients
Causes the server to mask off the high order 32 bits of
directory cookies in NFS version 3 directory operations. This
option may be required when clients run 32-bit operating
systems that assume the entire cookie is contained in 32 bits
and reject responses containing version 3 cookies with high
bits on. IRIX 5.3 and Solaris 2.5 are examples of 32-bit
operating systems with this behavior, which produces error
messages like "Cannot decode response" on directory operations.
XFS filesystems on the server can generate cookies with high
bits on. Exporting filesystems with the 32bitclients option
causes these bits to be masked and prevents error messages.

However, this page is for the IRIX NFS SERVER, and I have not
found an equivalent option for the client, although I have requested that
the client connect with NFS v2. Here are the options I pass to mount the
Linux box on the SGI:

mount -o vers=2,rw,dev=0001 linuxserver:... /...

The code which should be modified (if I'm right) is:

fs/lockd/xdr.c:
---------------
/*
 * XDR functions for basic NLM types
 */
static inline u32 *
nlm_decode_cookie(u32 *p, u32 *c)
{
    unsigned int len;

    if ((len = ntohl(*p++)) == 4) {
        *c = ntohl(*p++);
    } else if (len == 0) { /* hockeypux brain damage */
        *c = 0;
    } else {
        printk(KERN_NOTICE
            "lockd: bad cookie size %d (should be 4)\n", len);
        return NULL;
    }
    return p;
}
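
To make the idea concrete, here is a rough user-space sketch of a
decoder that accepts any cookie length up to some limit instead of
insisting on 4 bytes. It is NOT a kernel patch: the names cookie_sketch,
decode_cookie_sketch and COOKIE_MAX, and the 32-byte limit, are all made
up for illustration. It assumes the cookie is an ordinary XDR
variable-length opaque (a 4-byte length followed by the data, padded to
a multiple of 4 bytes). It keeps all the bytes rather than masking off
the top 32 bits, since as far as I understand the cookie has to be
handed back to the client unchanged in the reply.

```c
#include <arpa/inet.h>   /* ntohl, htonl */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define COOKIE_MAX 32    /* assumed upper bound, chosen arbitrarily */

/* Hypothetical cookie type: raw bytes plus their length. */
struct cookie_sketch {
    unsigned int  len;
    unsigned char data[COOKIE_MAX];
};

/*
 * Decode an XDR variable-length opaque into c.
 * Returns a pointer to the word after the (padded) cookie,
 * or NULL if the cookie is longer than COOKIE_MAX.
 */
static const uint32_t *
decode_cookie_sketch(const uint32_t *p, struct cookie_sketch *c)
{
    unsigned int len = ntohl(*p++);

    if (len > COOKIE_MAX)
        return NULL;

    c->len = len;
    memset(c->data, 0, sizeof(c->data));
    memcpy(c->data, p, len);     /* a zero-length cookie copies nothing */

    return p + (len + 3) / 4;    /* skip the data and its XDR padding */
}
```

Fed the 8-byte cookie the IRIX client sends, this would return a pointer
just past the two data words instead of NULL. The callers in fs/lockd
would of course also have to carry the wider cookie around, which the
sketch does not attempt.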

Then again, it could be that NFSv3 is being used for some reason.
Ideas?
-Ben

------------------------------------ |\ _,,,--,,_ ,) ----------
Benjamin Kahn /,`.-'`' -, ;-;;'
(212) 924 - 2220 |,4- ) )-,_ ) /\
ben@cybersites.com --------------- '---''(_/--' (_/-' ---------------
Meet Linux: Forrest Gump as an operating system.
