Re: [PATCH v2 1/2] fcntl: fix potential deadlocks for &fown_struct.lock

From: Desmond Cheong Zhi Xi
Date: Wed Jul 07 2021 - 02:54:25 EST


On 7/7/21 2:05 pm, Greg KH wrote:
On Wed, Jul 07, 2021 at 10:35:47AM +0800, Desmond Cheong Zhi Xi wrote:
Syzbot reports a potential deadlock in do_fcntl:

========================================================
WARNING: possible irq lock inversion dependency detected
5.12.0-syzkaller #0 Not tainted
--------------------------------------------------------
syz-executor132/8391 just changed the state of lock:
ffff888015967bf8 (&f->f_owner.lock){.+..}-{2:2}, at: f_getown_ex fs/fcntl.c:211 [inline]
ffff888015967bf8 (&f->f_owner.lock){.+..}-{2:2}, at: do_fcntl+0x8b4/0x1200 fs/fcntl.c:395
but this lock was taken by another, HARDIRQ-safe lock in the past:
(&dev->event_lock){-...}-{2:2}

and interrupts could create inverse lock ordering between them.

other info that might help us debug this:
Chain exists of:
&dev->event_lock --> &new->fa_lock --> &f->f_owner.lock

Possible interrupt unsafe locking scenario:

CPU0 CPU1
---- ----
lock(&f->f_owner.lock);
local_irq_disable();
lock(&dev->event_lock);
lock(&new->fa_lock);
<Interrupt>
lock(&dev->event_lock);

*** DEADLOCK ***

This happens because there is a lock hierarchy of
&dev->event_lock --> &new->fa_lock --> &f->f_owner.lock
from the following call chain:

input_inject_event():
spin_lock_irqsave(&dev->event_lock,...);
input_handle_event():
input_pass_values():
input_to_handler():
evdev_events():
evdev_pass_values():
spin_lock(&client->buffer_lock);
__pass_event():
kill_fasync():
kill_fasync_rcu():
read_lock(&fa->fa_lock);
send_sigio():
read_lock_irqsave(&fown->lock,...);

However, since &dev->event_lock is HARDIRQ-safe, interrupts have to be
disabled while grabbing &f->f_owner.lock, otherwise we invert the lock
hierarchy.

Hence, we replace calls to read_lock/read_unlock on &f->f_owner.lock,
with read_lock_irq/read_unlock_irq.

Here read_lock_irq/read_unlock_irq should be safe to use because the
functions f_getown_ex and f_getowner_uids are only called from
do_fcntl, and f_getown is only called from do_fnctl and
sock_ioctl. do_fnctl itself is only called from syscalls.

For sock_ioctl, the chain is
compat_sock_ioctl():
compat_sock_ioctl_trans():
sock_ioctl()

And interrupts are not disabled on either path. We assert this
assumption with WARN_ON_ONCE(irqs_disabled()). This check is also
inserted into another use of write_lock_irq in f_modown.

Reported-and-tested-by: syzbot+e6d5398a02c516ce5e70@xxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@xxxxxxxxx>
---
fs/fcntl.c | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/fs/fcntl.c b/fs/fcntl.c
index dfc72f15be7f..262235e02c4b 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -88,6 +88,7 @@ static int setfl(int fd, struct file * filp, unsigned long arg)
static void f_modown(struct file *filp, struct pid *pid, enum pid_type type,
int force)
{
+ WARN_ON_ONCE(irqs_disabled());

If this triggers, you just rebooted the box :(

Please never do this, either properly handle the problem and return an
error, or do not check for this. It is not any type of "fix" at all,
and at most, a debugging aid while you work on the root problem.

thanks,

greg k-h


Hi Greg,

Thanks for the feedback. My bad, I was under the impression that WARN_ON_ONCE could be used to document assumptions for other developers, but I'll stick to using it for debugging in the future.

I think then in this case it would be best to keep the reasoning for why the *_irq() locks are safe to use in the commit message. I'll update the patch accordingly.

Best wishes,
Desmond