Re: bcache: oops when writing to writeback_percent without a cache device

From: Coly Li
Date: Tue Jun 04 2019 - 11:45:48 EST


On 2019/6/4 10:59 äå, Coly Li wrote:
> On 2019/6/4 7:00 äå, BjÃrn Forsman wrote:
>> Hi all,
>>
>> I get a kernel oops from bcache when writing to
>> /sys/block/bcache0/bcache/writeback_percent and there is no attached
>> cache device. See the oops itself below my signature.
>>
>> This is on Linux 4.19.46. I looked in git and see many commits to
>> bcache lately, but none seem to address this particular issue.
>>
>> Background: I'm writing to .../writeback_percent with
>> systemd-tmpfiles. I'd rather not replace it with a script that figures
>> out whether or not the kernel will oops if writing to the sysfs file
>> -- the kernel should not oops in the first place.
>
> Hi Bjorn,
>
> Thank you for the reporting. I believe this is a case we missed in
> testings. When a bcache device is not attached, it does not make sense
> to update the writeback rate in period by the changing of writeback_percent.
>
> I will post a patch for your testing soon.

Hi Bjorn,

Could you please to try this patch ? Hope it may help a bit.

Thanks.

Coly Li


>>
>> Jun 04 12:35:42 kernel: BUG: unable to handle kernel NULL pointer
>> dereference at 0000000000000340
>> Jun 04 12:35:42 kernel: PGD 0 P4D 0
>> Jun 04 12:35:42 kernel: Oops: 0000 [#1] SMP PTI
>> Jun 04 12:35:42 kernel: CPU: 6 PID: 20266 Comm: kworker/6:220 Not
>> tainted 4.19.46 #1-NixOS
>> Jun 04 12:35:42 kernel: Hardware name: To Be Filled By O.E.M. To Be
>> Filled By O.E.M./X99 Extreme4/3.1, BIOS P3.60 04/06/2018
>> Jun 04 12:35:42 kernel: Workqueue: events update_writeback_rate [bcache]
>> Jun 04 12:35:42 kernel: RIP: 0010:update_writeback_rate+0x2f/0x2e0 [bcache]
[snipped]
From 9dea1e6ad0e6c40e3bc6ee6cf4ad243c07ca6ca9 Mon Sep 17 00:00:00 2001
From: Coly Li <colyli@xxxxxxx>
Date: Tue, 4 Jun 2019 23:24:09 +0800
Subject: [PATCH] bcache: only set BCACHE_DEV_WB_RUNNING when cached device
attached

When people set a writeback percent via sysfs file,
/sys/block/bcache<N>/bcache/writeback_percent
current code directly sets BCACHE_DEV_WB_RUNNING to dc->disk.flags
and schedules kworker dc->writeback_rate_update.

If there is no cache set attached to, the writebac kernel thread is
not running indeed, running dc->writeback_rate_update does not make
sense and may cause NULL pointer deference when reference cache set
pointer inside update_writeback_rate().

This patch checks whether the cache set point (dc->disk.c) is NULL in
sysfs interface handler, and only set BCACHE_DEV_WB_RUNNING and
schedule dc->writeback_rate_update when dc->disk.c is not NULL (it
means the cache device is attached to a cache set).

Reported-by: BjÃrn Forsman <bjorn.forsman@xxxxxxxxx>
Signed-off-by: Coly Li <colyli@xxxxxxx>
---
drivers/md/bcache/sysfs.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index 129031663cc8..eb678e43ac00 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -430,8 +430,13 @@ STORE(bch_cached_dev)
bch_writeback_queue(dc);
}

+ /*
+ * Only set BCACHE_DEV_WB_RUNNING when cached device attached to
+ * a cache set, otherwise it doesn't make sense.
+ */
if (attr == &sysfs_writeback_percent)
- if (!test_and_set_bit(BCACHE_DEV_WB_RUNNING, &dc->disk.flags))
+ if ((dc->disk.c != NULL) &&
+ (!test_and_set_bit(BCACHE_DEV_WB_RUNNING, &dc->disk.flags)))
schedule_delayed_work(&dc->writeback_rate_update,
dc->writeback_rate_update_seconds * HZ);

--
2.16.4