Re: patch-2.0.1[67] kills automount daemon (Berkeley amd)

Andrew C. Esh (andrewes@cnt.com)
Fri, 6 Sep 1996 11:38:07 -0500


>>>>> "Systemkennung" == Systemkennung Linux <linux@informatik.uni-koblenz.de> writes:

>> > I'm going to release a 2.0.18 reasonably soon, but in the meantime
>> > I'd like to know if this small fix fixes the NFS problems.. Now that
>> > the kernel mailing list seems to work well again, maybe we can use
>> > it for what it was meant for.
>>
>> Master,
>>
>> it seems to work. I had no more hangs with amd, but I could not
>> deterministically force a hang before, therefore "seems". But
>> I've tried hard (with run-parts /etc/cron.daily, this used to
>> hang the automounter).

Systemkennung> A very reliable way to hang amd is to use elm for
Systemkennung> reading mail on an automounted /var/spool/mail/.
Systemkennung> Start elm, then let it sit around idle for some
Systemkennung> time. Amd then tried to unmount the fs and NFS
Systemkennung> hangs.

Here's another piece of the puzzle, from /var/log/messages:

Sep 5 02:47:04 andrewes linux: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 02:47:04 andrewes kernel: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 02:47:04 andrewes kernel: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 08:00:02 andrewes linux: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 08:00:02 andrewes kernel: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 08:15:02 andrewes linux: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 08:15:02 andrewes kernel: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 08:30:02 andrewes linux: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 08:30:02 andrewes linux: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 08:45:02 andrewes linux: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 08:45:03 andrewes linux: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 09:00:02 andrewes linux: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 09:00:02 andrewes kernel: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 09:15:02 andrewes linux: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 09:15:02 andrewes kernel: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 09:30:02 andrewes linux: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 09:30:02 andrewes linux: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 09:45:02 andrewes linux: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 09:45:02 andrewes kernel: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 09:56:09 andrewes login: 2 LOGIN FAILURES ON tty1, andrewes
Sep 5 09:56:17 andrewes linux: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 09:56:18 andrewes linux: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 09:56:22 andrewes kernel: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 09:56:29 andrewes su: andrewes on /dev/tty1
Sep 5 09:56:30 andrewes linux: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 09:57:20 andrewes init: Switching to runlevel: 6
Sep 5 09:57:20 andrewes getty[232]: exiting on TERM signal
Sep 5 09:57:20 andrewes getty[233]: exiting on TERM signal
Sep 5 09:57:20 andrewes getty[234]: exiting on TERM signal
Sep 5 09:57:20 andrewes getty[235]: exiting on TERM signal
Sep 5 10:09:25 andrewes syslogd: restart

The 116 failures starting at 8:00:02 were generated by a cron job that
copies a file to an NFS volume. The ones at 2:47 am are from a cron job
that runs updatedb. Notice my login failures, and then a reboot. I was
unable to log in because my rc script checks my mail status.
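
For what it's worth, 116 is not one of the status values the NFS
version 2 protocol defines, which would explain why the client calls
it "bad". The sketch below is only an illustration: the table is the
"stat" enum from RFC 1094 as I understand it, and describe_status() is
a hypothetical helper, not the kernel's nfs_stat_to_errno(). It says
nothing about where the 116 comes from. Linux's own errno numbering on
i386 has ESTALE at 116, which may or may not be a coincidence.

```python
# NFS version 2 status values as listed in RFC 1094 (the "stat" enum).
NFSV2_STATUS = {
    0: "NFS_OK",
    1: "NFSERR_PERM",
    2: "NFSERR_NOENT",
    5: "NFSERR_IO",
    6: "NFSERR_NXIO",
    13: "NFSERR_ACCES",
    17: "NFSERR_EXIST",
    19: "NFSERR_NODEV",
    20: "NFSERR_NOTDIR",
    21: "NFSERR_ISDIR",
    27: "NFSERR_FBIG",
    28: "NFSERR_NOSPC",
    30: "NFSERR_ROFS",
    63: "NFSERR_NAMETOOLONG",
    66: "NFSERR_NOTEMPTY",
    69: "NFSERR_DQUOT",
    70: "NFSERR_STALE",
    99: "NFSERR_WFLUSH",
}


def describe_status(value):
    """Hypothetical helper: name an NFSv2 status, or flag it as unknown."""
    name = NFSV2_STATUS.get(value)
    if name is None:
        return "status %d is not an NFSv2 status value" % value
    return "status %d is %s" % (value, name)


if __name__ == "__main__":
    # 70 is the protocol's "stale file handle"; 116 is what my log shows.
    print(describe_status(70))
    print(describe_status(116))
```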

Here's the same log for the next day, after removing my
troublesome mail directory automount:

Sep 6 02:47:08 andrewes linux: NFS server anoka not responding, timed out.
Sep 6 02:47:08 andrewes linux: nfs_fhget: getattr error = 5
Sep 6 02:47:08 andrewes linux: nfs_read_super: get root inode failed
Sep 6 02:47:17 andrewes linux: NFS server anoka not responding, timed out.
Sep 6 02:47:17 andrewes linux: nfs_fhget: getattr error = 5
Sep 6 02:47:17 andrewes linux: nfs_read_super: get root inode failed
Sep 6 02:47:38 andrewes linux: NFS server anoka not responding, timed out.
Sep 6 02:47:38 andrewes linux: nfs_fhget: getattr error = 5
Sep 6 02:47:38 andrewes linux: nfs_read_super: get root inode failed
Sep 6 08:32:55 andrewes init: Switching to runlevel: 6

This caused no problems, and updatedb was able to complete. No errors
were reported by the file copy cron job starting at 8:00. The reason I
rebooted at 8:32 was to load kernel 2.0.18, after having been running
2.0.17. Otherwise, amd was still working OK.
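
To compare the two days more mechanically, here is a rough sketch that
counts kernel NFS messages per day and per message kind. The sample
lines are copied from the logs above. The regular expression and the
tally() helper are my own assumptions about the syslog layout on this
machine, so adjust them for other setups.

```python
import re
from collections import Counter

# A few lines copied from the two logs above; a real run would read
# /var/log/messages instead of this sample.
SAMPLE = """\
Sep 5 02:47:04 andrewes kernel: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 08:00:02 andrewes linux: nfs_stat_to_errno: bad nfs status return value: 116
Sep 5 09:56:09 andrewes login: 2 LOGIN FAILURES ON tty1, andrewes
Sep 6 02:47:08 andrewes linux: NFS server anoka not responding, timed out.
Sep 6 02:47:08 andrewes linux: nfs_fhget: getattr error = 5
Sep 6 02:47:08 andrewes linux: nfs_read_super: get root inode failed
"""

# Assumed layout: "Mon D HH:MM:SS host tag: message", where the tag for
# kernel messages is either "linux" or "kernel" as in the logs above.
LINE = re.compile(r"^(\w{3}\s+\d+) \d\d:\d\d:\d\d \S+ (?:linux|kernel): (.*)$")


def tally(text):
    """Count kernel messages by (day, kind)."""
    counts = Counter()
    for line in text.splitlines():
        m = LINE.match(line)
        if not m:
            continue  # skip login, su, init, getty and similar lines
        day, msg = m.groups()
        # Use the text before the first colon as a rough message kind.
        kind = msg.split(":", 1)[0]
        counts[(day, kind)] += 1
    return counts


if __name__ == "__main__":
    for (day, kind), n in sorted(tally(SAMPLE).items()):
        print("%s  %3d  %s" % (day, n, kind))
```

On the sample, Sep 5 shows only nfs_stat_to_errno messages and Sep 6
shows only the timeout, nfs_fhget and nfs_read_super messages, which
matches what I see by eye in the full logs.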

If it's any help, the NFS server I was connecting to is running SunOS
Release 4.1.2. Its nfsd is dated Oct 23, 1991, and doesn't report any
sort of version in response to any command line argument. Most other
OS binaries are dated about that same time.

--
Andrew C. Esh			mailto:andrew_esh@cnt.com
Computer Network Technology	andrewes@mtn.org (finger for PGP key)
6500 Wedgwood Road		612.550.8000 (main)
Maple Grove MN 55311		612.550.8229 (direct)
http://www.cnt.com - CNT Inc. Home Page
http://www.mtn.org/~andrewes - ACE Home Page