Re: [BUG] mm: backing-dev: a possible sleep-in-atomic-context bug in cgwb_create()

From: Matthew Wilcox
Date: Wed Jun 20 2018 - 23:35:36 EST


On Thu, Jun 21, 2018 at 11:02:58AM +0800, Jia-Ju Bai wrote:
> The kernel may sleep while holding a spinlock.
> The function call path (from bottom to top) in Linux-4.16.7 is:
>
> [FUNC] schedule
> lib/percpu-refcount.c, 222:
> schedule in __percpu_ref_switch_mode
> lib/percpu-refcount.c, 339:
> __percpu_ref_switch_mode in percpu_ref_kill_and_confirm
> ./include/linux/percpu-refcount.h, 127:
> percpu_ref_kill_and_confirm in percpu_ref_kill
> mm/backing-dev.c, 545:
> percpu_ref_kill in cgwb_kill
> mm/backing-dev.c, 576:
> cgwb_kill in cgwb_create
> mm/backing-dev.c, 573:
> _raw_spin_lock_irqsave in cgwb_create
>
> This bug was found by my static analysis tool (DSAC-2) and confirmed by my
> code review.

I disagree with your code review. The comment above the wait in
__percpu_ref_switch_mode() says:

* If the previous ATOMIC switching hasn't finished yet, wait for
* its completion. If the caller ensures that ATOMIC switching
* isn't in progress, this function can be called from any context.

I believe cgwb_kill() is always called under cgwb_lock, so we will never
sleep because the percpu ref will never be switching to atomic mode.
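
To make the argument concrete, here is a rough userspace model of the
condition that wait depends on. It is only a sketch, not kernel code:
struct model_ref and every model_* function are hypothetical names made up
for illustration, and the only thing taken from the real code is the idea
that the wait happens while a previous ATOMIC switch is still awaiting
confirmation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct percpu_ref; not the kernel type. */
struct model_ref {
	/* Non-NULL only while an ATOMIC switch is waiting for confirmation. */
	void (*confirm_switch)(struct model_ref *ref);
	bool dead;
};

/* Models the wait condition: the real code sleeps until no switch is pending. */
static bool model_switch_would_sleep(const struct model_ref *ref)
{
	return ref->confirm_switch != NULL;
}

/* Models an explicit switch to atomic mode that has not been confirmed yet. */
static void model_start_atomic_switch(struct model_ref *ref,
				      void (*confirm)(struct model_ref *ref))
{
	ref->confirm_switch = confirm;
}

/* Models the confirmation path clearing the pending switch. */
static void model_finish_atomic_switch(struct model_ref *ref)
{
	ref->confirm_switch = NULL;
}

/*
 * Models percpu_ref_kill(): reports whether the kill would have had to
 * sleep, then marks the ref dead.  Killing twice is a caller bug.
 */
static bool model_kill(struct model_ref *ref)
{
	bool would_sleep = model_switch_would_sleep(ref);

	assert(!ref->dead);
	ref->dead = true;
	return would_sleep;
}

/* Placeholder confirmation callback for the model. */
static void model_noop_confirm(struct model_ref *ref)
{
	(void)ref;
}
```

In this model, model_kill() on a ref that nobody has called
model_start_atomic_switch() on returns false, which corresponds to the
cgwb_create() case as I understand it. It returns true only if a switch was
started and not yet finished, and that is the case the caller has to rule
out.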

This is complex and subtle, so I could be wrong.