Unmount root patch, version 1 (16-JAN-2000)
-------------------------------------------

Patch for kernel 2.3.38.


What does it change ?
---------------------

- /proc/mounts shows mount points relative to the root of the current process
- umount <block_device>  unmounts the block device if it is "busy" but all
  references are mount points (in particular, root and pwd of all processes
  are on other file systems)


Changes since version 0
-----------------------

- fixed race condition in d_umount. Can somebody please check if moving
  sb->s_root = NULL;
  towards the end of d_umount was okay ?
- /proc/mounts now prefixes unreachable path names with the device on which
  the path name begins, e.g. 03:01/foo/bar
- made check for root device in do_umount independent from (possibly invalid)
  ROOT_DEV


Example
-------

My / is on /dev/hda1. /var is on a separate partition /dev/hda5.

1) apply the unmount-root patch and build the kernel
2) make sure there's a chroot in /sbin:
   [ -x /sbin/chroot ] || cp /usr/sbin/chroot /sbin
3) create a minimum system environment on a non-root partition, e.g.
   mkdir /var/test
   (cd / && tar cfl - bin lib sbin etc dev) | (cd /var/test && tar xf -)
4) reboot with init=/bin/sh
5) make the test environment your new root (assume /var has its own partition):
   mount /var
   exec /sbin/chroot /var/test /bin/sh \
     </var/test/dev/console >/var/test/dev/console 2>&1
6) get rid of the old root (assume it's on /dev/hda1):
   umount /dev/hda1
7) for kicks, return to the old root FS (assume /var is on /dev/hda5):
   mkdir /mnt
   mount /dev/hda1 /mnt
   exec /sbin/chroot /mnt /bin/sh </mnt/dev/console >/mnt/dev/console 2>&1
   umount /dev/hda5
8) bring up the system:
   mount -o remount,ro /
   exec /sbin/init
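
Since step 4 leaves you with nothing but a bare shell, it may be worth checking
the test environment before rebooting. The following is only a sketch:
check_test_root is a made-up helper name, and the list of directories simply
mirrors the tar command in step 3 plus the console used in step 5.

```shell
# check_test_root DIR: report what the chroot in step 5 would be missing.
# Hypothetical helper, not part of the patch.
check_test_root() {
    root=$1
    rc=0
    # Directories copied by the tar command in step 3.
    for d in bin lib sbin etc dev; do
        if [ ! -d "$root/$d" ]; then
            echo "missing directory: $root/$d"
            rc=1
        fi
    done
    # The shell that chroot will exec.
    if [ ! -x "$root/bin/sh" ]; then
        echo "missing shell: $root/bin/sh"
        rc=1
    fi
    # The console that stdin/stdout/stderr are redirected to.
    if [ ! -e "$root/dev/console" ]; then
        echo "missing console: $root/dev/console"
        rc=1
    fi
    return $rc
}
```

With the layout of the example, this would be run as  check_test_root /var/test
before step 4. It prints nothing and returns 0 if everything was found.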


Known problems and anomalies
----------------------------

- mount points in /proc/mounts may be outside of the directory tree underneath
  the current process' root. In this case, the path name is prefixed with the
  device number on which the path begins. This may confuse programs reading
  /proc/mounts
- if multiple file systems were mounted on old root, each of them becomes a
  potential root. This can be considered to be a feature ;-)
- if anonymous block devices are detached this way, it can be difficult to
  unmount them (work-around: unmount everything that's not needed while the old
  root is still accessible)
- umount <old_root_dev>  is overloaded. Therefore, if the umount fails for some
  reason, it will still appear to succeed, but actually just set the old root
  read-only (do we need a new mount option ?)
- the content of ROOT_DEV is meaningless after a root directory change
- disk quota support also uses mnt_dirname and still prints the original path
  names (see below)
- changing the root changes /dev/root to the real device, which confuses the
  RH 6.1 halt script (which explicitly checks for /dev/root), causing it to
  hang.


Discussion of problems
----------------------

The main problem is certainly mnt_dirname, which shows up in /proc/mounts and
in disk quota messages. The semantics of a mount point's path name are
inherently ambiguous from the perspective of a process that uses a different
root than the one under which the mount point was made. This ambiguity is
aggravated if the mount point is not below the process' root. Things get even
worse if the normal system environment is itself already chrooted (which this
extension encourages).

One problem of using d_path to construct path names with respect to the
current root is that mount points outside the current root yield paths that
are not only meaningless, but that may also change as a side-effect of mount
operations.

The current solution of constructing as much of the path as possible and
then adding the device number as a prefix is unambiguous but changes the
syntax of /proc/mounts. Therefore, this needs input from people with
programs using /proc/mounts.
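
To give an idea of what a program reading /proc/mounts would have to do, here
is a sketch that tells the two kinds of mount point fields apart. It assumes
that the prefix always has the form of the 03:01/foo/bar example, i.e. two hex
digits, a colon, two hex digits, then the path. classify_mount_point is a
made-up name.

```shell
# classify_mount_point FIELD: classify the mount point field of one
# /proc/mounts line.
# Assumption: unreachable mount points look like "MM:mm/path", with MM and
# mm being two hex digits each, as in the 03:01/foo/bar example.
classify_mount_point() {
    case "$1" in
        [0-9A-Fa-f][0-9A-Fa-f]:[0-9A-Fa-f][0-9A-Fa-f]/*)
            echo unreachable ;;
        /*)
            echo reachable ;;
        *)
            echo unknown ;;
    esac
}

# Possible use (the second field of each line is the mount point):
#   while read -r dev dir rest; do
#       echo "$dir: $(classify_mount_point "$dir")"
#   done < /proc/mounts
```

A program that only expects absolute path names would end up in the "unknown"
or "unreachable" branch for the new entries, which is exactly the
compatibility question raised above.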

The second major problem is "lost" unnamed block devices. In principle,
all such problems are caused by user errors. Considering that interactive
use of this facility will be rare, we may just decide to live with this
problem. Furthermore, recovery is possible by trying to manually construct
the right device numbers. A clean solution would involve exposing the
minor numbers assigned by the kernel. E.g. /proc/mounts could use
kdevname instead of mnt_devname in such cases. Again, the feasibility of
this depends on what users of /proc/mounts expect.
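
The manual recovery could look roughly like the sketch below. It only prints
the commands instead of running them. It assumes that unnamed devices use
major number 0 and that the minor has to be guessed; the node names and the
helper name anon_dev_candidates are made up.

```shell
# anon_dev_candidates MAX [DIR]: print, without executing them, the mknod
# commands for unnamed block devices with minors 1..MAX.
# Assumptions: unnamed devices have major 0; DIR/anonN is a made-up name.
anon_dev_candidates() {
    max=$1
    dir=${2:-/dev}
    i=1
    while [ "$i" -le "$max" ]; do
        echo "mknod $dir/anon$i b 0 $i"
        i=$((i + 1))
    done
}
```

Each node created this way could then be passed to umount to see if it
corresponds to one of the lost file systems.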

Concerning the overloading of umount <root_dev>: since umount(8) will try to
translate the device name to the mount point, the new behaviour is only
selected if the root FS is already inaccessible. Therefore, the only
ambiguity occurs when the new behaviour is desired, but due to some
mistake the umount fails and (if the file system happens to be the old
root) is silently turned into a  mount -o remount,ro. A umount option to
specifically indicate that we intend to unmount an inaccessible file system
may be cleaner, but I'm not sure if it's worth the effort.
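
Until there is such an option, a script can at least check what happened after
the umount. The sketch below looks up a device in /proc/mounts and reports
whether it is gone, read-only, or still read-write. It assumes that the first
field of the entry is the same device name that was given to umount, which
may not hold in all cases. mount_state is a made-up name.

```shell
# mount_state DEV [MOUNTS_FILE]: print "absent", "ro" or "rw" for DEV.
# Assumption: the first field of the entry equals DEV; the fourth field
# holds the comma-separated mount options.
mount_state() {
    awk -v dev="$1" '
        $1 == dev {
            state = "rw"
            n = split($4, opt, ",")
            for (i = 1; i <= n; i++)
                if (opt[i] == "ro")
                    state = "ro"
            print state
            found = 1
            exit
        }
        END { if (!found) print "absent" }
    ' "${2:-/proc/mounts}"
}
```

After  umount /dev/hda1,  a result of "absent" would mean that the old root is
really gone, while "ro" would suggest that the umount was silently turned into
a remount.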