RE: [PATCH] kasan: Prevent cpu_quarantine corruption when CPU offline and cache shrink occur at same time

From: Zhang, Qiang1
Date: Mon Apr 25 2022 - 10:25:26 EST


On Thu, 14 Apr 2022 10:59:25 +0800 Zqiang <qiang1.zhang@xxxxxxxxx> wrote:

> The kasan_quarantine_remove_cache() is called in kmem_cache_shrink()/
> destroy(), the kasan_quarantine_remove_cache() call is protected by
> cpuslock in kmem_cache_destroy(), can ensure serialization with
> kasan_cpu_offline(). however the kasan_quarantine_remove_cache() call
> is not protected by cpuslock in kmem_cache_shrink(), when CPU going
> offline and cache shrink occur at same time, the cpu_quarantine may be
> corrupted by interrupt(per_cpu_remove_cache operation). so add
> cpu_quarantine offline flags check in per_cpu_remove_cache().
>
> ...
>

>Could we please have some reviewer input here?

> --- a/mm/kasan/quarantine.c
> +++ b/mm/kasan/quarantine.c
> @@ -330,6 +330,8 @@ static void per_cpu_remove_cache(void *arg)
> struct cpu_shrink_qlist *sq;
> #endif
> q = this_cpu_ptr(&cpu_quarantine);
> + if (READ_ONCE(q->offline))
> + return;
> #ifndef CONFIG_PREEMPT_RT
> qlist_move_cache(q, &to_free, cache);
> qlist_free_all(&to_free, cache);

>It might be helpful to have a little comment which explains why we're doing this?

Sorry for late reply, may be add some comment:

Ensure the ordering between the writing to q->offline and per_cpu_remove_cache.
prevent cpu_quarantine be corrupted by interrupt.

Is this OK ?

Thanks
Zqiang