Re: [RFC PATCH RESEND 1/1] fs/namespace: defer free_mount from namespace_unlock

From: Al Viro
Date: Thu Jan 19 2023 - 17:25:09 EST


On Thu, Jan 19, 2023 at 04:14:55PM -0500, Eric Chanudet wrote:
> From: Alexander Larsson <alexl@xxxxxxxxxx>
>
> Use call_rcu to defer releasing the umount'ed or detached filesystem
> when calling namespace_unlock().
>
> Calling synchronize_rcu_expedited() has a significant cost on RT kernels
> that default to rcupdate.rcu_normal_after_boot=1.
>
> For example, on a 6.2-rt1 kernel:
> perf stat -r 10 --null --pre 'mount -t tmpfs tmpfs mnt' -- umount mnt
> 0.07464 +- 0.00396 seconds time elapsed ( +- 5.31% )
>
> With this change applied:
> perf stat -r 10 --null --pre 'mount -t tmpfs tmpfs mnt' -- umount mnt
> 0.00162604 +- 0.00000637 seconds time elapsed ( +- 0.39% )
>
> Waiting for the grace period before completing the syscall does not seem
> mandatory. The umount'ed struct mounts are queued up for release in a
> separate list and are no longer accessible to subsequent syscalls.

Again, NAK. If a filesystem is expected to be shut down by umount(2),
userland expects it to have been already shut down by the time the
syscall returns.

It's not just visibility in namespace; it's "can I pull the disk out?".
Or "can the shutdown get to taking the network down?", for that matter.