I've seen this too, on RedHat 6.0. About the following sequence of actions
occurred:
- shutdown
- tried to umount NFS. Failed because a process was running off it (cwd
and executable on NFS, possibly also some open files)
- network was shut down
- something made the system want to use NFS (*)
- a looong sequence of 101s
- me timing out, hitting reset
(*) possible causes: page-in, process writes good-bye message, 2nd
umount attempt, etc. Didn't look at what exactly caused it.
This is somewhat tricky to fix: the init scripts should have killed the
process sitting on NFS before trying to umount. However, killing all
processes before umount may kill a process that's keeping the network
alive (e.g. pppd, atmarpd).
Finding out what can be killed and what can't, may be tricky, because
there may be ordering constraints. Example: three NFS mounts: /ether,
/ppp, /atm, each using the corresponding network. The demons are run
as follows: /ether/pppd and /ppp/atmarpd. There be also an /ether/food
which doesn't do anything network-related.
Now we could kill them in this order: food, atmarpd, pppd. But we
may not be able to kill them in this order: food, pppd, atmarpd.
(Note: this still kills the "well-known network demons" after the
other things, so just maintaining a list of them doesn't help.)
A practical solution may be simply to forcibly unmount all stuck
file systems. In any case it's better than forcing the user to press
the reset button.
- Werner
-- _________________________________________________________________________ / Werner Almesberger, ICA, EPFL, CH werner.almesberger@ica.epfl.ch / /_IN_R_131__Tel_+41_21_693_6621__Fax_+41_21_693_6610_____________________/- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/