Re: possible deadlock in sidtab_sid2str_put

From: Ondrej Mosnacek
Date: Tue Jan 28 2020 - 11:31:17 EST


On Tue, Jan 28, 2020 at 4:45 PM Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
> On Tue, Jan 28, 2020 at 9:27 AM Ondrej Mosnacek <omosnace@xxxxxxxxxx> wrote:
> > On Tue, Jan 28, 2020 at 2:44 PM Stephen Smalley <sds@xxxxxxxxxxxxx> wrote:
> > > On 1/28/20 8:39 AM, Paul Moore wrote:
> > > > On Tue, Jan 28, 2020 at 7:50 AM syzbot
> > > > <syzbot+61cba5033e2072d61806@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> > > >>
> > > >> syzbot has found a reproducer for the following crash on:
> > > >>
> > > >> HEAD commit: b0be0eff Merge tag 'x86-pti-2020-01-28' of git://git.kerne..
> > > >> git tree: upstream
> > > >> console output: https://syzkaller.appspot.com/x/log.txt?x=1432aebee00000
> > > >> kernel config: https://syzkaller.appspot.com/x/.config?x=9784e57c96a92f20
> > > >> dashboard link: https://syzkaller.appspot.com/bug?extid=61cba5033e2072d61806
> > > >> compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > > >> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=10088e95e00000
> > > >> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=13fa605ee00000
> > > >>
> > > >> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > >> Reported-by: syzbot+61cba5033e2072d61806@xxxxxxxxxxxxxxxxxxxxxxxxx
> > > >>
> > > >> =====================================================
> > > >> WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
> > > >> 5.5.0-syzkaller #0 Not tainted
> > > >> -----------------------------------------------------
> > > >> syz-executor305/10624 [HC0[0]:SC0[2]:HE1:SE0] is trying to acquire:
> > > >> ffff888098c14098 (&(&s->cache_lock)->rlock){+.+.}, at: spin_lock include/linux/spinlock.h:338 [inline]
> > > >> ffff888098c14098 (&(&s->cache_lock)->rlock){+.+.}, at: sidtab_sid2str_put.part.0+0x36/0x880 security/selinux/ss/sidtab.c:533
> > > >>
> > > >> and this task is already holding:
> > > >> ffffffff89865770 (&(&nf_conntrack_locks[i])->rlock){+.-.}, at: spin_lock include/linux/spinlock.h:338 [inline]
> > > >> ffffffff89865770 (&(&nf_conntrack_locks[i])->rlock){+.-.}, at: nf_conntrack_lock+0x17/0x70 net/netfilter/nf_conntrack_core.c:91
> > > >> which would create a new lock dependency:
> > > >> (&(&nf_conntrack_locks[i])->rlock){+.-.} -> (&(&s->cache_lock)->rlock){+.+.}
> > > >>
> > > >> but this new dependency connects a SOFTIRQ-irq-safe lock:
> > > >> (&(&nf_conntrack_locks[i])->rlock){+.-.}
> > > >>
> > > >> ... which became SOFTIRQ-irq-safe at:
> > > >> lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484
> > > >> __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
> > > >> _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
> > > >> spin_lock include/linux/spinlock.h:338 [inline]
> > > >> nf_conntrack_lock+0x17/0x70 net/netfilter/nf_conntrack_core.c:91
> > > >
> > > > ...
> > > >
> > > >> to a SOFTIRQ-irq-unsafe lock:
> > > >> (&(&s->cache_lock)->rlock){+.+.}
> > > >>
> > > >> ... which became SOFTIRQ-irq-unsafe at:
> > > >> ...
> > > >> lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484
> > > >> __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
> > > >> _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
> > > >> spin_lock include/linux/spinlock.h:338 [inline]
> > > >> sidtab_sid2str_put.part.0+0x36/0x880 security/selinux/ss/sidtab.c:533
> > > >> sidtab_sid2str_put+0xa0/0xc0 security/selinux/ss/sidtab.c:566
> > > >> sidtab_entry_to_string security/selinux/ss/services.c:1279 [inline]
> > > >> sidtab_entry_to_string+0xf2/0x110 security/selinux/ss/services.c:1266
> > > >> security_sid_to_context_core+0x2c6/0x3c0 security/selinux/ss/services.c:1361
> > > >> security_sid_to_context+0x34/0x40 security/selinux/ss/services.c:1384
> > > >> avc_audit_post_callback+0x102/0x790 security/selinux/avc.c:709
> > > >> common_lsm_audit+0x5ac/0x1e00 security/lsm_audit.c:466
> > > >> slow_avc_audit+0x16a/0x1f0 security/selinux/avc.c:782
> > > >> avc_audit security/selinux/include/avc.h:140 [inline]
> > > >> avc_has_perm+0x543/0x610 security/selinux/avc.c:1185
> > > >> inode_has_perm+0x1a8/0x230 security/selinux/hooks.c:1631
> > > >> selinux_mmap_file+0x10a/0x1d0 security/selinux/hooks.c:3701
> > > >> security_mmap_file+0xa4/0x1e0 security/security.c:1482
> > > >> vm_mmap_pgoff+0xf0/0x230 mm/util.c:502
> > > >
> > > > ...
> > > >
> > > >> other info that might help us debug this:
> > > >>
> > > >> Possible interrupt unsafe locking scenario:
> > > >>
> > > >>         CPU0                    CPU1
> > > >>         ----                    ----
> > > >>    lock(&(&s->cache_lock)->rlock);
> > > >>                                 local_irq_disable();
> > > >>                                 lock(&(&nf_conntrack_locks[i])->rlock);
> > > >>                                 lock(&(&s->cache_lock)->rlock);
> > > >>    <Interrupt>
> > > >>      lock(&(&nf_conntrack_locks[i])->rlock);
> > > >>
> > > >> *** DEADLOCK ***
> > > >>
> > > >> 4 locks held by syz-executor305/10624:
> > > >> #0: ffffffff8c1acc68 (&table[i].mutex){+.+.}, at: nfnl_lock net/netfilter/nfnetlink.c:62 [inline]
> > > >> #0: ffffffff8c1acc68 (&table[i].mutex){+.+.}, at: nfnetlink_rcv_msg+0x9ee/0xfb0 net/netfilter/nfnetlink.c:224
> > > >> #1: ffff8880836415d8 (nlk_cb_mutex-NETFILTER){+.+.}, at: netlink_dump+0xe7/0xfb0 net/netlink/af_netlink.c:2199
> > > >> #2: ffffffff89865770 (&(&nf_conntrack_locks[i])->rlock){+.-.}, at: spin_lock include/linux/spinlock.h:338 [inline]
> > > >> #2: ffffffff89865770 (&(&nf_conntrack_locks[i])->rlock){+.-.}, at: nf_conntrack_lock+0x17/0x70 net/netfilter/nf_conntrack_core.c:91
> > > >> #3: ffffffff8b7df008 (&selinux_ss.policy_rwlock){.+.?}, at: security_sid_to_context_core+0x1ca/0x3c0 security/selinux/ss/services.c:1344
> > > >
> > > > I think this is going to be tricky to fix due to the differing
> > > > contexts from which sidtab_sid2str_put() may be called. We already
> > > > have a check for !in_task() in sidtab_sid2str_put(), do we want to add
> > > > a check for !in_serving_softirq() too?
> > >
> > > No, we should just use spin_lock_irqsave/unlock_irqrestore() IMHO, but
> > > that then means we need to re-evaluate the performance gain of this change.
> >
> > I just tested a patch that switches to the IRQ-disabling locking under
> > the systemd-journald scenario from [1] and the impact seems not that
> > bad - the security_secid_to_secctx() function only takes up 3.18% of
> > total run time vs. ~2% I reported in the original patch
> > (d97bd23c2d7d). I'll post the patch shortly.
> >
> > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1733259, i.e.:
> > cat /dev/urandom | base64 | logger &
> > timeout 30s perf record -p $(pidof systemd-journald) -a -g
>
> I'm skeptical of the real world impact of disabling IRQs here,
> although I'm not sure off the top of my head how we could manage a
> cache (I'm thinking more about adding to, and evicting entries, from
> the cache) without some form of locking. Perhaps we look into per-cpu
> caching?
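
Per-CPU caching might be worth exploring, yes. Just to illustrate the
rough idea (a sketch only - the names are made up and it is not meant
as a real implementation): each CPU would keep its own small
sid -> string slot, so there is no shared lock to contend on, but we
would still have to disable local IRQs, because the cache can be
reached from softirq context on the same CPU (which is exactly the
path lockdep complained about):

#include <linux/percpu.h>
#include <linux/slab.h>
#include <linux/string.h>
#include <linux/types.h>

struct pcpu_str_slot {
	u32 sid;	/* 0 == slot empty */
	u32 len;
	char *str;
};

static DEFINE_PER_CPU(struct pcpu_str_slot, sid2str_slot);

/* Return a kmemdup()'ed copy of the cached string, or NULL on a miss. */
static char *pcpu_sid2str_get(u32 sid, u32 *len)
{
	struct pcpu_str_slot *slot;
	unsigned long flags;
	char *res = NULL;

	local_irq_save(flags);
	slot = this_cpu_ptr(&sid2str_slot);
	if (slot->sid == sid && slot->str) {
		res = kmemdup(slot->str, slot->len, GFP_ATOMIC);
		if (res)
			*len = slot->len;
	}
	local_irq_restore(flags);
	return res;
}

/* Cache (or replace) the string for the given SID on this CPU. */
static void pcpu_sid2str_put(u32 sid, const char *str, u32 len)
{
	char *copy = kmemdup(str, len, GFP_ATOMIC);
	struct pcpu_str_slot *slot;
	unsigned long flags;

	if (!copy)
		return;

	local_irq_save(flags);
	slot = this_cpu_ptr(&sid2str_slot);
	kfree(slot->str);
	slot->str = copy;
	slot->len = len;
	slot->sid = sid;
	local_irq_restore(flags);
}

The obvious downsides are that entries get duplicated across CPUs and
that every CPU's slot would have to be invalidated on policy reload,
so I'm not convinced it would beat the simple irqsave conversion
discussed below.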

(I hope the ML won't mind an SVG attachment...) This is a flamegraph
of the scenario above, with the spin_[un]lock_irq{save|restore}()
patch applied. You can see that the overhead of the locking is not so
bad - the kmemdup() (which is unavoidable here) still takes up most of
the computation time. We can try to find a more clever cache
implementation later, but this solution should still range from much
better than no caching (in the case of dense MCS contexts) to
comparable to no caching (in the case of no MCS categories).
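
For reference, the patch itself is essentially just the standard
irqsave conversion of the cache_lock critical sections, roughly like
this (simplified sketch, not the exact hunk; the helper name and the
elided arguments are only illustrative):

#include <linux/spinlock.h>

static void sid2str_cache_update(struct sidtab *s /* , ... */)
{
	unsigned long flags;

	/*
	 * Take s->cache_lock with the IRQ-saving variants so the cache
	 * can also be safely used from softirq context
	 * (nf_conntrack -> audit -> security_sid_to_context()).
	 */
	spin_lock_irqsave(&s->cache_lock, flags);
	/* ... look up / refresh the cached context string as before ... */
	spin_unlock_irqrestore(&s->cache_lock, flags);
}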

BTW, I'm trying to do some final testing of the patch on a multi-CPU
system, but the current Rawhide 5.5.0-1 kernel seems to have a bug
that prevents it from booting on anything with more than one core.
I'll see if I can work around it somehow...

--
Ondrej Mosnacek <omosnace at redhat dot com>
Software Engineer, Security Technologies
Red Hat, Inc.

Attachment: BEAKERLIB_134310003_STORED_flamegraph.svg
Description: image/svg