Re: [PATCHv3 9/9] zram: add dynamic device add/remove functionality

From: Minchan Kim
Date: Sun May 03 2015 - 22:28:37 EST


On Mon, May 04, 2015 at 11:20:08AM +0900, Minchan Kim wrote:
> Hello Sergey,
>
> On Thu, Apr 30, 2015 at 03:51:12PM +0900, Sergey Senozhatsky wrote:
> > On (04/30/15 15:44), Minchan Kim wrote:
> > > > > I think the problem of deadlock is that you are trying to remove sysfs file
> > > > > in sysfs handler.
> > > > >
> > > > > #> echo 1 > /sys/xxx/zram_remove
> > > > >
> > > > > kernfs_fop_write - hold s_active
> > > > > -> zram_remove_store
> > > > > -> zram_remove
> > > > > -> sysfs_remove_group - hold s_active *again*
> > > > >
> > > > > Right?
> > > > >
> > > >
> > > > are those same s_active locks?
> > > >
> > > >
> > > > we hold (s_active#163) and (&bdev->bd_mutex) and want to acquire (s_active#162)
> > >
> > > Thanks for sharing the message.
> > > You're right. It's another lock so it shouldn't be a reason.
> > > Okay, I will review it. Please give me time.
> > >
> >
> > sure, no problem and no rush. thanks!
>
> I had time to think it over.
>
> I think your patch is rather tricky: someone who has already opened
> /dev/zram cannot see the sysfs files at first, but can see them after
> a while. That's weird.
>
> I want to fix it in a more generic way. Otherwise, we might run into
> locking problems again at some point. We already experienced that with
> init_lock, although we finally fixed it.
>
> I think we can fix it with the patch below. I hope it's a more general
> and correct approach. It's based on your
> [zram: return zram device_id from zram_add()]
>
> What do you think?
>
> From e943df5407b880f9262ef959b270226fdc81bc9f Mon Sep 17 00:00:00 2001
> From: Minchan Kim <minchan@xxxxxxxxxx>
> Date: Mon, 4 May 2015 08:36:07 +0900
> Subject: [PATCH 1/2] zram: close race by open overriding
>
> [1] introduced bdev->bd_mutex to protect against a race between mount
> and reset. At that time, we didn't have the dynamic zram-add/remove
> feature, so it was okay.
>
> However, as we introduce the dynamic device feature, bd_mutex becomes
> a problem.
>
> CPU 0
>
> echo 1 > /sys/block/zram<id>/reset
> -> kernfs->s_active(A)
> -> zram:reset_store->bd_mutex(B)
>
> CPU 1
>
> echo <id> > /sys/class/zram/zram-remove
> -> zram:zram_remove: bd_mutex(B)
> -> sysfs_remove_group
> -> kernfs->s_active(A)
>
> IOW, AB -> BA deadlock
>
> The reason we hold bd_mutex in zram_remove is to prevent any
> incoming open of /dev/zram[0-9]. Otherwise, we could remove a zram
> device that others have already opened. But it causes the deadlock above.
>
> To fix the problem, this patch overrides block_device.open so that
> it returns -EBUSY once zram has claimed the device for reset. Any
> incoming open then fails, so we no longer need to hold bd_mutex
> in zram_remove.
>
> This patch prepares for the zram-add/remove feature.
>
> [1] ba6b17: zram: fix umount-reset_store-mount race condition
> Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx>

If the above has no problems, we could apply your last patch on top of it.
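
For anyone following along, here is a rough userspace model of the idea in the
commit message above: open fails with -EBUSY once the device is claimed, so the
remove path does not need to keep the mutex held while it tears down sysfs.
This is only a sketch, not the patch itself. The names struct zram_model,
zram_model_open(), zram_model_release() and zram_model_claim() are made up for
illustration, and a pthread mutex stands in for bd_mutex.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a zram device; not the kernel struct. */
struct zram_model {
	pthread_mutex_t lock;	/* stands in for bd_mutex, held only briefly */
	int openers;		/* number of current openers */
	bool claim;		/* true while a reset/remove owns the device */
};

/* Model of the overridden open: refuse new openers once claimed. */
static int zram_model_open(struct zram_model *z)
{
	int ret = 0;

	pthread_mutex_lock(&z->lock);
	if (z->claim)
		ret = -EBUSY;
	else
		z->openers++;
	pthread_mutex_unlock(&z->lock);
	return ret;
}

/* Model of release: drop one opener. */
static void zram_model_release(struct zram_model *z)
{
	pthread_mutex_lock(&z->lock);
	z->openers--;
	pthread_mutex_unlock(&z->lock);
}

/*
 * Claim the device for reset/remove. Fails with -EBUSY if it is open
 * or already claimed. On success the lock is dropped again, so the
 * caller can go on to the sysfs teardown without holding it, and the
 * A->B versus B->A ordering never arises.
 */
static int zram_model_claim(struct zram_model *z)
{
	int ret = 0;

	pthread_mutex_lock(&z->lock);
	if (z->openers > 0 || z->claim)
		ret = -EBUSY;
	else
		z->claim = true;
	pthread_mutex_unlock(&z->lock);
	return ret;
}
```

The point of the model is that the claim flag, not the mutex, is what keeps
late openers out during removal, so the mutex is only ever held for the short
check-and-set sections.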