Re: Apparent serious progressive ext4 data corruption bug in 3.6 (when rebooting during umount)

From: Nix
Date: Wed Oct 24 2012 - 19:42:46 EST


On 25 Oct 2012, nix@xxxxxxxxxxxxx said:
> Even though my own system relies on the possibility of rebooting during
> umount to reboot reliably, I'd be inclined to say 'not a bug, don't do
> that then' -- except that this renders it unreliable to use umount -l to
> unmount all the filesystems you can, skipping those that are not
> reachable due to having unresponsive servers in the way.

It's worse than that. If you're using filesystem (mount) namespaces, how can
*any* shell script loop, or anything in userspace, reliably unmount all
filesystems before reboot? It seems to me this is impossible. There is
no process that necessarily has access to all namespaces, and when you
bring PID namespaces into the picture there is no process that can even
kill all userspace processes in order to zap their filesystems.
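
To make the limitation concrete, here is a minimal sketch of what a
userspace unmount loop has to work from. It assumes the usual
/proc/self/mounts layout (device, mount point, type, options, ...) and
simply reverses the listing so that mounts stacked on top of others come
first. The function name unmount_order is made up for illustration. The
point is that the input only ever describes the caller's own namespace, so
no amount of cleverness in the loop helps with mounts it cannot see.

```python
def unmount_order(mounts_text):
    """Return mount points from /proc/self/mounts-style text, last-mounted first.

    Hypothetical helper: it only orders the mounts visible to the caller.
    Mounts living in other namespaces never show up in the input at all.
    """
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        # Field 2 is the mount point; octal escapes such as \040 are left as-is.
        points.append(fields[1])
    # Later entries may be mounted on top of earlier ones, so unmount in reverse.
    return list(reversed(points))
```

On a real system the text would come from reading /proc/self/mounts, and
each returned path would be handed to umount in turn.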

I suspect we need a new blocking 'umountall' syscall and a command that
calls it, which umounts everything it can in every filesystem namespace
it can, skipping those that are (unreachable?) network mounts, and
returns only when everything is done. (Possibly it should first kill
every process it sees in every PID namespace other than that of the
caller, too.)
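
As a rough idea of how far userspace can get without such a syscall, the
sketch below groups visible processes by mount namespace. It assumes a
kernel that exposes /proc/<pid>/ns/mnt symlinks (added in 3.8, so not
available on the 3.6 kernel discussed here), and the function name
distinct_mount_namespaces is hypothetical. Reading other processes' links
generally needs root, and any process outside the PID namespace that this
/proc instance belongs to does not appear at all, which is exactly the gap
a kernel-side 'umountall' would have to close.

```python
import os


def distinct_mount_namespaces(proc_root="/proc", readlink=os.readlink,
                              listdir=os.listdir):
    """Map each visible mount-namespace identifier to the PIDs living in it.

    Hypothetical helper: readlink and listdir are injectable so the logic
    can be exercised without a real /proc.
    """
    namespaces = {}
    for entry in listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            ns_id = readlink(f"{proc_root}/{entry}/ns/mnt")
        except OSError:
            continue  # process exited, or we lack permission
        namespaces.setdefault(ns_id, []).append(int(entry))
    for pids in namespaces.values():
        pids.sort()
    return namespaces
```

Every key in the result is a namespace with at least one visible process;
a namespace kept alive only by processes that are not visible is simply
missing from the result.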

Then shutdown scripts can just call this, and get the right behaviour
immediately.

--
NULL && (void)