Re: WARNING: bad unlock balance detected! - mkfs.ext4/426 is trying to release lock (rcu_read_lock)

From: Matthew Wilcox
Date: Mon Dec 07 2020 - 01:08:46 EST


On Mon, Dec 07, 2020 at 11:17:29AM +0530, Naresh Kamboju wrote:
> While running "mkfs -t ext4" on arm64 juno-r2 device connected with SSD drive
> the following kernel warning reported on stable rc 5.9.13-rc1 kernel.
>
> Steps to reproduce:
> ------------------
> # boot arm64 Juno-r2 device with stable-rc 5.9.13-rc1.
> # Connect SSD drive
> # Format the file system ext4 type
> mkfs -t ext4 <SSD-drive>
> # you will notice this warning

Does it happen easily? Can you bisect?

> Crash log:
> --------------
> Writing superblocks and filesystem accounting information: 0/895
> [ 86.131095]
> [ 86.132592] =====================================
> [ 86.137300] WARNING: bad unlock balance detected!
> [ 86.142012] 5.9.13-rc1 #1 Not tainted
> [ 86.145675] -------------------------------------
> [ 86.150384] mkfs.ext4/426 is trying to release lock (rcu_read_lock) at:
> [ 86.157020] [<ffff80001063478c>] blk_queue_exit+0xcc/0x1b0
> [ 86.162511] but there are no more locks to release!

This really doesn't make much sense. blk_queue_exit() in 5.9.12 does:

	percpu_ref_put(&q->q_usage_counter);
(literally, that's the entire function)

percpu_ref_put() (by way of percpu_ref_put_many(), with nr == 1) does:

	rcu_read_lock();

	if (__ref_is_percpu(ref, &percpu_count))
		this_cpu_sub(*percpu_count, nr);
	else if (unlikely(atomic_long_sub_and_test(nr, &ref->count)))
		ref->release(ref);

	rcu_read_unlock();

Unless ->release() has an unbalanced rcu_read_unlock(), there definitely
is a lock to release! Some archaeology says that ->release is
blk_queue_usage_counter_release(), which calls
	wake_up_all(&q->mq_freeze_wq);

which doesn't appear to use RCU at all. So this trace makes no sense,
and all I can do is ask you to bisect it.
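
To spell out the reasoning above, here is a small user-space sketch of the
balance argument. It is not kernel code: every toy_* name is made up for
illustration, and the depth counter is only a rough stand-in for what
lockdep tracks per task. It assumes the only way to hit "no more locks to
release" inside the lock/release/unlock sequence is for the release
callback itself to drop a lock it never took.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for per-task tracking of rcu_read_lock nesting (hypothetical). */
struct toy_task {
	int rcu_depth;		/* current read-side nesting depth */
	bool bad_unlock;	/* set once an unlock is seen at depth 0 */
};

static void toy_rcu_read_lock(struct toy_task *t)
{
	t->rcu_depth++;
}

static void toy_rcu_read_unlock(struct toy_task *t)
{
	if (t->rcu_depth == 0) {
		/* The condition the warning describes: nothing left to release. */
		t->bad_unlock = true;
		fprintf(stderr, "bad unlock balance: no more locks to release\n");
		return;
	}
	t->rcu_depth--;
}

typedef void (*toy_release_fn)(struct toy_task *t);

/* A release callback that leaves the nesting depth alone. */
static void toy_release_balanced(struct toy_task *t)
{
	(void)t;
}

/* A hypothetical buggy callback containing a stray unlock. */
static void toy_release_stray_unlock(struct toy_task *t)
{
	toy_rcu_read_unlock(t);
}

/*
 * Same shape as the slow path quoted above: lock, call release, unlock.
 * Returns true if an unbalanced unlock was observed along the way.
 */
static bool toy_ref_put(toy_release_fn release)
{
	struct toy_task t = { 0, false };

	assert(release != NULL);

	toy_rcu_read_lock(&t);
	release(&t);
	toy_rcu_read_unlock(&t);

	return t.bad_unlock;
}
```

Calling toy_ref_put(toy_release_balanced) reports no imbalance, while
toy_ref_put(toy_release_stray_unlock) does, because the final unlock then
runs at depth zero. That is the only shape of bug this model can produce,
which is why a ->release() that never touches RCU leaves the trace
unexplained.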