Re: [PATCH v2] kasan: fix deadlock in start_report()

From: Dmitry Vyukov
Date: Thu Feb 09 2023 - 03:55:58 EST


On Thu, 9 Feb 2023 at 04:27, Weizhao Ouyang <ouyangweizhao@xxxxxxxx> wrote:
>
> From: Weizhao Ouyang <o451686892@xxxxxxxxx>
>
> From: Shuai Yuan <yuanshuai@xxxxxxxx>
>
> Calling start_report() again between start_report() and end_report()
> will result in a race issue for the report_lock. In extreme cases this
> problem arose in Kunit tests in the hardware tag-based Kasan mode.
>
> For example, when an invalid memory release problem is found,
> kasan_report_invalid_free() will print error log, but if an MTE exception
> is raised during the output log, the kasan_report() is called, resulting
> in a deadlock problem. The kasan_depth not protect it in hardware
> tag-based Kasan mode.

I think checking report_suppressed() would be cleaner and simpler than
ignoring all trylock failures. If trylock fails, it does not mean that
the current thread is holding it. We of course could do a custom lock
which stores current->tid in the lock word, but it looks effectively
equivalent to checking report_suppressed().



> Signed-off-by: Shuai Yuan <yuanshuai@xxxxxxxx>
> Reviewed-by: Weizhao Ouyang <ouyangweizhao@xxxxxxxx>
> Reviewed-by: Peng Ren <renlipeng@xxxxxxxx>
> ---
> Changes in v2:
> -- remove redundant log
>
> mm/kasan/report.c | 25 ++++++++++++++++++++-----
> 1 file changed, 20 insertions(+), 5 deletions(-)
>
> diff --git a/mm/kasan/report.c b/mm/kasan/report.c
> index 22598b20c7b7..aa39aa8b1855 100644
> --- a/mm/kasan/report.c
> +++ b/mm/kasan/report.c
> @@ -166,7 +166,7 @@ static inline void fail_non_kasan_kunit_test(void) { }
>
> static DEFINE_SPINLOCK(report_lock);
>
> -static void start_report(unsigned long *flags, bool sync)
> +static bool start_report(unsigned long *flags, bool sync)
> {
> fail_non_kasan_kunit_test();
> /* Respect the /proc/sys/kernel/traceoff_on_warning interface. */
> @@ -175,8 +175,13 @@ static void start_report(unsigned long *flags, bool sync)
> lockdep_off();
> /* Make sure we don't end up in loop. */
> kasan_disable_current();
> - spin_lock_irqsave(&report_lock, *flags);
> + if (!spin_trylock_irqsave(&report_lock, *flags)) {
> + lockdep_on();
> + kasan_enable_current();
> + return false;
> + }
> pr_err("==================================================================\n");
> + return true;
> }
>
> static void end_report(unsigned long *flags, void *addr)
> @@ -468,7 +473,10 @@ void kasan_report_invalid_free(void *ptr, unsigned long ip, enum kasan_report_ty
> if (unlikely(!report_enabled()))
> return;
>
> - start_report(&flags, true);
> + if (!start_report(&flags, true)) {
> + pr_err("%s: report ignore\n", __func__);
> + return;
> + }
>
> memset(&info, 0, sizeof(info));
> info.type = type;
> @@ -503,7 +511,11 @@ bool kasan_report(unsigned long addr, size_t size, bool is_write,
> goto out;
> }
>
> - start_report(&irq_flags, true);
> + if (!start_report(&irq_flags, true)) {
> + ret = false;
> + pr_err("%s: report ignore\n", __func__);
> + goto out;
> + }
>
> memset(&info, 0, sizeof(info));
> info.type = KASAN_REPORT_ACCESS;
> @@ -536,7 +548,10 @@ void kasan_report_async(void)
> if (unlikely(!report_enabled()))
> return;
>
> - start_report(&flags, false);
> + if (!start_report(&flags, false)) {
> + pr_err("%s: report ignore\n", __func__);
> + return;
> + }
> pr_err("BUG: KASAN: invalid-access\n");
> pr_err("Asynchronous fault: no details available\n");
> pr_err("\n");
> --
> 2.25.1
>
> --
> You received this message because you are subscribed to the Google Groups "kasan-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to kasan-dev+unsubscribe@xxxxxxxxxxxxxxxx.
> To view this discussion on the web visit https://groups.google.com/d/msgid/kasan-dev/20230209031159.2337445-1-ouyangweizhao%40zeku.com.