Re: [PATCH] fs: export kern_path_locked

From: Denis Kirjanov
Date: Tue Feb 16 2021 - 09:32:43 EST


On 2/14/21, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> On Fri, Jan 29, 2021 at 01:18:55PM +0000, Christoph Hellwig wrote:
>> On Fri, Jan 29, 2021 at 04:11:05PM +0300, Denis Kirjanov wrote:
>> > Do you mean just:
>>
>> We'll still need to lock the parent inode.
>
> Not just "lock", we wouldd need to have the lock _held_ across the
> entire sequence. Without that there's no warranty that it will refer
> to the same object we'd created.
>
> In any case, unlink in any potentially public area is pretty much
> never the right approach. Once mknod has happened, that's it - too
> late to bail out.
>
> IIRC, most of the PITA in that area is due to unix_autobind()
> iteractions. Basically, we try to bind() an unbound socket and
> another thread does sendmsg() on the same while we are in the
> middle of ->mknod(). Who should wait for whom?
>
> ->mknod() really should be a point of no return - any games with
> "so we unlink it" are unreliable in the best case, and that's
> only if we do _not_ unlock the parent through the entire sequence.
>
> Seeing that we have separate bindlock and iolock now... How about
> this (completely untested) delta?

We had a change like that:
Author: WANG Cong <xiyou.wangcong@xxxxxxxxx>
Date: Mon Jan 23 11:17:35 2017 -0800

af_unix: move unix_mknod() out of bindlock

Dmitry reported a deadlock scenario:

unix_bind() path:
u->bindlock ==> sb_writer

do_splice() path:
sb_writer ==> pipe->mutex ==> u->bindlock

In the unix_bind() code path, unix_mknod() does not have to
be done with u->bindlock held, since it is a pure fs operation,
so we can just move unix_mknod() out.


>
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index 41c3303c3357..c21038b15836 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -1034,6 +1034,14 @@ static int unix_bind(struct socket *sock, struct
> sockaddr *uaddr, int addr_len)
> goto out;
> addr_len = err;
>
> + err = mutex_lock_interruptible(&u->bindlock);
> + if (err)
> + goto out;
> +
> + err = -EINVAL;
> + if (u->addr)
> + goto out_up;
> +
> if (sun_path[0]) {
> umode_t mode = S_IFSOCK |
> (SOCK_INODE(sock)->i_mode & ~current_umask());
> @@ -1041,18 +1049,10 @@ static int unix_bind(struct socket *sock, struct
> sockaddr *uaddr, int addr_len)
> if (err) {
> if (err == -EEXIST)
> err = -EADDRINUSE;
> - goto out;
> + goto out_up;
> }
> }
>
> - err = mutex_lock_interruptible(&u->bindlock);
> - if (err)
> - goto out_put;
> -
> - err = -EINVAL;
> - if (u->addr)
> - goto out_up;
> -
> err = -ENOMEM;
> addr = kmalloc(sizeof(*addr)+addr_len, GFP_KERNEL);
> if (!addr)
> @@ -1090,7 +1090,6 @@ static int unix_bind(struct socket *sock, struct
> sockaddr *uaddr, int addr_len)
> spin_unlock(&unix_table_lock);
> out_up:
> mutex_unlock(&u->bindlock);
> -out_put:
> if (err)
> path_put(&path);
> out:
>