Re: [PATCH] fs: export kern_path_locked

From: Al Viro
Date: Tue Feb 16 2021 - 13:01:59 EST


On Tue, Feb 16, 2021 at 05:31:33PM +0300, Denis Kirjanov wrote:

> We had a change like that:
> Author: WANG Cong <xiyou.wangcong@xxxxxxxxx>
> Date: Mon Jan 23 11:17:35 2017 -0800
>
> af_unix: move unix_mknod() out of bindlock
>
> Dmitry reported a deadlock scenario:
>
> unix_bind() path:
> u->bindlock ==> sb_writer
>
> do_splice() path:
> sb_writer ==> pipe->mutex ==> u->bindlock
>
> In the unix_bind() code path, unix_mknod() does not have to
> be done with u->bindlock held, since it is a pure fs operation,
> so we can just move unix_mknod() out.

*cringe*

I remember now... Process set:

P1: bind() of AF_UNIX socket to /mnt/sock
P2: splice() from pipe to /mnt/foo
P3: freeze /mnt
P4: splice() from pipe to AF_UNIX socket

P1 grabs ->bindlock
P2 sb_start_write() for what's on /mnt
P2 grabs rwsem shared
P3 blocks in sb_wait_write() trying to grab the same rwsem exclusive
P1 sb_start_write() blocks trying to grab the same rwsem shared
P4 calls ->splice_write(), aka generic_splice_sendpage()
P4 grabs pipe->mutex
P4 calls ->sendpage(), aka sock_no_sendpage()
P4 calls ->sendmsg(), aka unix_dgram_sendmsg()
P4 calls unix_autobind()
P4 blocks trying to grab ->bindlock
P2 ->splice_write(), aka iter_file_splice_write()
P2 blocks trying to grab pipe->mutex
DEADLOCK
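
For illustration only, the trace above can be written down as a wait-for graph and checked for a cycle. This is a minimal userspace sketch, not kernel code: the enum names, the `waits_for` table and `in_cycle()` are hypothetical, and it assumes P2 and P4 splice from the same pipe, as the trace implies.

```c
/* Hypothetical model of the trace: one outgoing wait edge per process. */
enum { P1, P2, P3, P4, NPROC };

/* waits_for[p] is the process holding whatever p is blocked on. */
static const int waits_for[NPROC] = {
	[P1] = P3,	/* sb_start_write() queues behind the pending freeze */
	[P2] = P4,	/* wants pipe->mutex, which P4 holds */
	[P3] = P2,	/* wants the rwsem exclusive, P2 holds it shared */
	[P4] = P1,	/* wants ->bindlock, which P1 holds */
};

/* Follow wait edges from start; return 1 if they lead back to start. */
static int in_cycle(int start)
{
	int p = start;

	for (int i = 0; i < NPROC; i++) {
		p = waits_for[p];
		if (p == start)
			return 1;
	}
	return 0;
}
```

Every process sits on the cycle P1 -> P3 -> P2 -> P4 -> P1, so `in_cycle()` returns 1 for each of them and nobody can make progress.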

Sigh... OK, so we want something like
	user_path_create()
	vfs_mknod()
	created = true
	grab bindlock
	....
	drop bindlock
	if failed && created
		vfs_unlink()
	done_path_create()
in unix_bind()... That would push ->bindlock all the way down in the hierarchy,
so that should be deadlock-free, but it looks like that'll be fucking ugly ;-/
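
A rough userspace sketch of that ordering, just to show the shape of the rollback. Nothing here is a kernel API: `fake_sock`, `dir_lock`, `fake_mknod()`, `fake_unlink()` and `fake_bind()` are hypothetical stand-ins, pthread mutexes play the role of the real locks, and the "already bound" check is only an assumed example of what might fail under ->bindlock.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the unix socket state guarded by u->bindlock. */
struct fake_sock {
	pthread_mutex_t bindlock;
	bool bound;
};

/* Stands in for what is held between user_path_create() and done_path_create(). */
static pthread_mutex_t dir_lock = PTHREAD_MUTEX_INITIALIZER;
static int nodes;	/* number of socket nodes currently "on disk" */

static int fake_mknod(void)   { nodes++; return 0; }	/* vfs_mknod() stand-in */
static void fake_unlink(void) { nodes--; }		/* vfs_unlink() stand-in */

static int fake_bind(struct fake_sock *sk)
{
	bool created = false;
	int err;

	pthread_mutex_lock(&dir_lock);		/* user_path_create() */
	err = fake_mknod();
	if (!err) {
		created = true;
		/* bindlock is taken last, so it nests inside the fs locks */
		pthread_mutex_lock(&sk->bindlock);
		if (sk->bound)
			err = -EINVAL;		/* assumed failure case */
		else
			sk->bound = true;
		pthread_mutex_unlock(&sk->bindlock);
	}
	if (err && created)
		fake_unlink();			/* undo the node we just made */
	pthread_mutex_unlock(&dir_lock);	/* done_path_create() */
	return err;
}
```

Binding the same `fake_sock` twice succeeds the first time and fails the second, with the second node removed again before the fs lock is dropped. No fs lock is ever taken while the bindlock stand-in is held.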

Let me try and play with that a bit, maybe it can be massaged to something
relatively sane...