Unmount root patch, version 1 (16-JAN-2000)
-----------------------------

Patch for kernel 2.3.38.


What does it change ?
---------------------

- /proc/mounts shows mount points relative to the root of the current
  process
- umount unmounts the block device if it is "busy" but all references are
  mount points (in particular, root and pwd of all processes are on other
  file systems)


Changes since version 0
-----------------------

- fixed race condition in d_umount. Can somebody please check if moving
  sb->s_root = NULL; towards the end of d_umount was okay ?
- /proc/mounts now prefixes unreachable path names with the device on
  which the path name begins, e.g. 03:01/foo/bar
- made check for root device in do_umount independent from (possibly
  invalid) ROOT_DEV


Example
-------

My / is on /dev/hda1. /var is on a separate partition /dev/hda5.

1) apply the unmount-root patch and build the kernel
2) make sure there's a chroot in /sbin:
     [ -x /sbin/chroot ] || cp /usr/sbin/chroot /sbin
3) create a minimum system environment on a non-root partition, e.g.
     mkdir /var/test
     (cd / && tar cfl - bin lib sbin etc dev) |
       (cd /var/test && tar xf -)
4) reboot with init=/bin/sh
5) make the test environment your new root (assume /var has its own
   partition):
     mount /var
     exec /sbin/chroot /var/test /bin/sh \
       </var/test/dev/console >/var/test/dev/console 2>&1
6) get rid of the old root (assume it's on /dev/hda1):
     umount /dev/hda1
7) for kicks, return to the old root FS (assume /var is on /dev/hda5):
     mkdir /mnt
     mount /dev/hda1 /mnt
     exec /sbin/chroot /mnt /bin/sh </mnt/dev/console >/mnt/dev/console 2>&1
     umount /dev/hda5
8) bring up the system:
     mount -o remount,ro /
     exec /sbin/init


Known problems and anomalies
----------------------------

- mount points in /proc/mounts may be outside of the directory tree
  underneath the current process' root. In this case, the path name is
  prefixed with the device number on which the path begins.
  This may confuse programs reading /proc/mounts
- if multiple file systems were mounted on the old root, each of them
  becomes a potential root. This can be considered to be a feature ;-)
- if anonymous block devices are detached this way, it can be difficult
  to unmount them (work-around: unmount everything that's not needed while
  the old root is still accessible)
- umount is overloaded. Therefore, if the umount fails for some reason, it
  will still appear to succeed, but actually just set the old root
  read-only (do we need a new mount option ?)
- the content of ROOT_DEV is meaningless after a root directory change
- disk quota support also uses mnt_dirname and still prints the original
  path names (see below)
- changing the root changes /dev/root to the real device, which confuses
  the RH 6.1 halt script (which explicitly checks for /dev/root), causing
  it to hang.


Discussion of problems
----------------------

The main problem is certainly mnt_dirname, which shows up in /proc/mounts
and in disk quota messages. The semantics of a mount point's path name
are inherently ambiguous from the perspective of a process that uses a
different root than the one under which the mount point was made. This
ambiguity is aggravated if the mount point is not below the process'
root. Things get even worse if the normal system environment is already
chrooted (which this extension encourages).

One problem of using d_path to construct path names with respect to the
current root is that mount points outside the current root yield paths
that are not only meaningless, but that may also change as a side-effect
of mount operations. The current solution of constructing as much of the
path as possible and then adding the device number as a prefix is
unambiguous but changes the syntax of /proc/mounts. Therefore, this needs
input from people with programs using /proc/mounts.

The second major problem is "lost" unnamed block devices. In principle,
all such problems are caused by user errors.
Considering that interactive use of this facility will be rare, we may
just decide to live with this problem. Furthermore, recovery is possible
by trying to manually construct the right device numbers. A clean
solution would involve exposing the minor numbers assigned by the kernel.
E.g. /proc/mounts could use kdevname instead of mnt_devname in such
cases. Again, the feasibility of this depends on what users of
/proc/mounts expect.

Concerning the overloading of umount: since umount(8) will try to
translate the device name to the mount point, the new behaviour is only
selected if the root FS is already inaccessible. Therefore, the only
ambiguity occurs when the new behaviour is desired, but due to some
mistake the umount fails and (if the file system happens to be the old
root) is silently turned into a mount -o remount,ro. A umount option to
specifically indicate that we intend to unmount an inaccessible file
system may be cleaner, but I'm not sure if it's worth the effort.
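
For programs that read /proc/mounts, here is a rough sketch of how the
mount point field could be told apart under the prefix scheme described
above. It is not part of the patch. The function name classify_mount_point
is made up for illustration, and the prefix is assumed to look like the
03:01/foo/bar example (two hexadecimal numbers separated by a colon,
followed by the path); the exact format should be checked against the
patched kernel.

```shell
# classify_mount_point FIELD   (hypothetical helper, not part of the patch)
# Prints "reachable PATH" for an ordinary absolute path, and
# "unreachable DEV PATH" for a device-prefixed one such as 03:01/foo/bar.
# Assumption: the prefix starts with a hex digit, contains a colon, and
# ends at the first "/". The glob below is deliberately loose.
classify_mount_point() {
    case "$1" in
        /*)
            # Normal case: path is valid relative to the current root.
            echo "reachable $1"
            ;;
        [0-9a-fA-F]*:[0-9a-fA-F]*/*)
            dev=${1%%/*}            # text before the first "/", e.g. 03:01
            echo "unreachable $dev /${1#*/}"
            ;;
        *)
            # Anything else is reported as-is rather than guessed at.
            echo "unknown $1"
            ;;
    esac
}
```

A possible use is to feed it the second field of each line:

    while read -r dev mnt rest; do classify_mount_point "$mnt"; done \
      </proc/mounts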