Re: NFS: more problems under 2.1.117

Alan Cox (alan@lxorguk.ukuu.org.uk)
Fri, 21 Aug 1998 23:41:59 +0100 (BST)


> >writing a backup file, announces it cannot write the file, then
> >promptly rewrites it as the ubiquitous %backup%~. Could this be some
> >race involving file locking over NFS?
>
> No, it turns out that Alan turned on interruptible NFS mounts in
> 2.1.117, and that is (a) wrong and (b) won't work.

The code I turned on is (a) valid (b) required for things like mount to work
properly. Read it before mouthing off.

If that daemon gets a signal on a hard mount, it's a signal 9. If it's not
a signal 9, then the signal masking code is broken, either in the kernel
core or in the NFS code. The points in question in the code have
all signals but SIGKILL blocked.

Also, Linus, if you read the source, I was very careful how I did things. The
page wait code does _not_ yet enable the signal handling support. It's only
enabled in the synchronous rpc wait code. And for that to fail, quite
clearly its logic in signal masking is wrong.

If you don't allow interruptions on NFS as they should be done, then you
consign 2.2 to the trash bin of history for servers.

Until those changes are put back, Linux NFS 2.2 has no maintainer. You can
keep your crashes on lock daemons, your random corruptions when you truncate
and append to NFS files, and the completely broken write behind scheme which
ensures that if your disk fills, applications will fail to report the errors
and you'll lose data. The UDMA stuff you worried about is NOTHING on the
current NFS bugs.

There are two ways now to fix NFS. The first is to find whether and why signal
delivery is the issue, and how a signal was delivered that was blocked.
The second is to rm -rf the entire 2.1 NFS catastrophe and put the 2.0 code
back. The 2.0 code is faster, smaller, cleaner and less buggy.

Reverting to a known fucked up state to avoid fixing a real bug is not an
option.

Alan
