Re: [PATCH v2] kasan: fix deadlock in start_report()

From: Dmitry Vyukov
Date: Thu Feb 09 2023 - 05:45:45 EST


aOn Thu, 9 Feb 2023 at 10:19, 袁帅(Shuai Yuan) <yuanshuai@xxxxxxxx> wrote:
>
> Hi Dmitry Vyukov
>
> Thanks, I see that your means.
>
> Currently, report_suppressed() seem not work in Kasan-HW mode, it always return false.
> Do you think should change the report_suppressed function?
> I don't know why CONFIG_KASAN_HW_TAGS was blocked separately before.

That logic was added by Andrey in:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c068664c97c7cf

Andrey, can we make report_enabled() check current->kasan_depth and
remove report_suppressed()?

Then we can also remove the comment in kasan_report_invalid_free().

It looks like kasan_disable_current() in kmemleak needs to affect
HW_TAGS mode as well:
https://elixir.bootlin.com/linux/v6.2-rc7/source/mm/kmemleak.c#L301

So overall it looks like simplifications and it will fix what Shuai reported.




> -----邮件原件-----
> 发件人: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
> 发送时间: 2023年2月9日 16:56
> 收件人: 欧阳炜钊(Weizhao Ouyang) <ouyangweizhao@xxxxxxxx>
> 抄送: Andrey Ryabinin <ryabinin.a.a@xxxxxxxxx>; Alexander Potapenko <glider@xxxxxxxxxx>; Andrey Konovalov <andreyknvl@xxxxxxxxx>; Vincenzo Frascino <vincenzo.frascino@xxxxxxx>; Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>; kasan-dev@xxxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Weizhao Ouyang <o451686892@xxxxxxxxx>; 袁帅(Shuai Yuan) <yuanshuai@xxxxxxxx>; 任立鹏(Peng Ren) <renlipeng@xxxxxxxx>
> 主题: Re: [PATCH v2] kasan: fix deadlock in start_report()
>
> On Thu, 9 Feb 2023 at 04:27, Weizhao Ouyang <ouyangweizhao@xxxxxxxx> wrote:
> >
> > From: Weizhao Ouyang <o451686892@xxxxxxxxx>
> >
> > From: Shuai Yuan <yuanshuai@xxxxxxxx>
> >
> > Calling start_report() again between start_report() and end_report()
> > will result in a race issue for the report_lock. In extreme cases this
> > problem arose in Kunit tests in the hardware tag-based Kasan mode.
> >
> > For example, when an invalid memory release problem is found,
> > kasan_report_invalid_free() will print error log, but if an MTE
> > exception is raised during the output log, the kasan_report() is
> > called, resulting in a deadlock problem. The kasan_depth not protect
> > it in hardware tag-based Kasan mode.
>
> I think checking report_suppressed() would be cleaner and simpler than ignoring all trylock failures. If trylock fails, it does not mean that the current thread is holding it. We of course could do a custom lock which stores current->tid in the lock word, but it looks effectively equivalent to checking report_suppressed().
>
>
>
> > Signed-off-by: Shuai Yuan <yuanshuai@xxxxxxxx>
> > Reviewed-by: Weizhao Ouyang <ouyangweizhao@xxxxxxxx>
> > Reviewed-by: Peng Ren <renlipeng@xxxxxxxx>
> > ---
> > Changes in v2:
> > -- remove redundant log
> >
> > mm/kasan/report.c | 25 ++++++++++++++++++++-----
> > 1 file changed, 20 insertions(+), 5 deletions(-)
> >
> > diff --git a/mm/kasan/report.c b/mm/kasan/report.c index
> > 22598b20c7b7..aa39aa8b1855 100644
> > --- a/mm/kasan/report.c
> > +++ b/mm/kasan/report.c
> > @@ -166,7 +166,7 @@ static inline void fail_non_kasan_kunit_test(void)
> > { }
> >
> > static DEFINE_SPINLOCK(report_lock);
> >
> > -static void start_report(unsigned long *flags, bool sync)
> > +static bool start_report(unsigned long *flags, bool sync)
> > {
> > fail_non_kasan_kunit_test();
> > /* Respect the /proc/sys/kernel/traceoff_on_warning interface.
> > */ @@ -175,8 +175,13 @@ static void start_report(unsigned long *flags, bool sync)
> > lockdep_off();
> > /* Make sure we don't end up in loop. */
> > kasan_disable_current();
> > - spin_lock_irqsave(&report_lock, *flags);
> > + if (!spin_trylock_irqsave(&report_lock, *flags)) {
> > + lockdep_on();
> > + kasan_enable_current();
> > + return false;
> > + }
> >
> > pr_err("==============================================================
> > ====\n");
> > + return true;
> > }
> >
> > static void end_report(unsigned long *flags, void *addr) @@ -468,7
> > +473,10 @@ void kasan_report_invalid_free(void *ptr, unsigned long ip, enum kasan_report_ty
> > if (unlikely(!report_enabled()))
> > return;
> >
> > - start_report(&flags, true);
> > + if (!start_report(&flags, true)) {
> > + pr_err("%s: report ignore\n", __func__);
> > + return;
> > + }
> >
> > memset(&info, 0, sizeof(info));
> > info.type = type;
> > @@ -503,7 +511,11 @@ bool kasan_report(unsigned long addr, size_t size, bool is_write,
> > goto out;
> > }
> >
> > - start_report(&irq_flags, true);
> > + if (!start_report(&irq_flags, true)) {
> > + ret = false;
> > + pr_err("%s: report ignore\n", __func__);
> > + goto out;
> > + }
> >
> > memset(&info, 0, sizeof(info));
> > info.type = KASAN_REPORT_ACCESS; @@ -536,7 +548,10 @@ void
> > kasan_report_async(void)
> > if (unlikely(!report_enabled()))
> > return;
> >
> > - start_report(&flags, false);
> > + if (!start_report(&flags, false)) {
> > + pr_err("%s: report ignore\n", __func__);
> > + return;
> > + }
> > pr_err("BUG: KASAN: invalid-access\n");
> > pr_err("Asynchronous fault: no details available\n");
> > pr_err("\n");
> > --
> > 2.25.1
> >
> > --
> > You received this message because you are subscribed to the Google Groups "kasan-dev" group.
> > To unsubscribe from this group and stop receiving emails from it, send an email to kasan-dev+unsubscribe@xxxxxxxxxxxxxxxx.
> > To view this discussion on the web visit https://groups.google.com/d/msgid/kasan-dev/20230209031159.2337445-1-ouyangweizhao%40zeku.com.
> ZEKU
> 信息安全声明:本邮件包含信息归发件人所在组织ZEKU所有。 禁止任何人在未经授权的情况下以任何形式(包括但不限于全部或部分披露、复制或传播)使用包含的信息。若您错收了本邮件,请立即电话或邮件通知发件人,并删除本邮件及附件。
> Information Security Notice: The information contained in this mail is solely property of the sender's organization ZEKU. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this email in error, please notify the sender by phone or email immediately and delete it.