Re: [PATCH] block: make gendisk hold a reference to its queue

From: Tejun Heo
Date: Mon Oct 17 2011 - 22:55:23 EST


On Mon, Oct 17, 2011 at 01:44:09PM +0200, Jens Axboe wrote:
> On 2011-10-17 04:43, Tejun Heo wrote:
> > The following command sequence triggers an oops.
> >
> > # mount /dev/sdb1 /mnt
> > # echo 1 > /sys/class/scsi_device/0\:0\:1\:0/device/delete
> > # umount /mnt
> >
> > general protection fault: 0000 [#1] PREEMPT SMP
> > CPU 2
> > Modules linked in:
> >
> > Pid: 791, comm: umount Not tainted 3.1.0-rc3-work+ #8 Bochs Bochs
> > RIP: 0010:[<ffffffff810d0879>] [<ffffffff810d0879>] __lock_acquire+0x389/0x1d60
> > ...
> > Call Trace:
> > [<ffffffff810d2845>] lock_acquire+0x95/0x140
> > [<ffffffff81aed87b>] _raw_spin_lock+0x3b/0x50
> > [<ffffffff811573bc>] bdi_lock_two+0x5c/0x70
> > [<ffffffff811c2f6c>] bdev_inode_switch_bdi+0x4c/0xf0
> > [<ffffffff811c3fcb>] __blkdev_put+0x11b/0x1d0
> > [<ffffffff811c4010>] __blkdev_put+0x160/0x1d0
> > [<ffffffff811c40df>] blkdev_put+0x5f/0x190
> > [<ffffffff8118f18d>] kill_block_super+0x4d/0x80
> > [<ffffffff8118f4a5>] deactivate_locked_super+0x45/0x70
> > [<ffffffff8119003a>] deactivate_super+0x4a/0x70
> > [<ffffffff811ac4ad>] mntput_no_expire+0xed/0x130
> > [<ffffffff811acf2e>] sys_umount+0x7e/0x3a0
> > [<ffffffff81aeeeab>] system_call_fastpath+0x16/0x1b
> >
> > This is because bdev holds on to disk but disk doesn't pin the
> > associated queue. If a SCSI device is removed while the device is
> > still open, the sdev puts the base reference to the queue on release.
> > When the bdev is finally released, the associated queue is already
> > gone along with the bdi and bdev_inode_switch_bdi() ends up
> > dereferencing already freed bdi.
> >
> > Even if it were not for this bug, disk not holding onto the associated
> > queue is very unusual and error-prone.
> >
> > Fix it by making add_disk() take an extra reference to its queue and
> > put it on disk_release() and ensuring that disk and its fops owner are
> > put in that order after all accesses to the disk and queue are
> > complete.
>
> Thanks, applied.

Ooh, I had the WARN_ON_ONCE() condition wrong in add_disk() as
blk_get_queue() returns 1 on failure and 0 on success. Can you please
flip that?
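
To make the return convention and the reference flow concrete, here is
a minimal user-space sketch, not the kernel code. Every toy_* name and
the dead/freed flags are hypothetical stand-ins invented for
illustration. The only parts taken from this thread are that the get
helper returns 0 on success and 1 on failure, that add_disk() takes an
extra queue reference, and that disk_release() puts it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the queue and the disk; NOT the kernel structures. */
struct toy_queue {
	int  refcount;	/* number of holders */
	bool dead;	/* assumed flag: set once the device is torn down */
	bool freed;	/* set when the last reference is dropped */
};

struct toy_disk {
	struct toy_queue *queue;
};

/* Same convention as described above: 0 on success, 1 on failure. */
static int toy_get_queue(struct toy_queue *q)
{
	if (q->dead)
		return 1;
	q->refcount++;
	return 0;
}

static void toy_put_queue(struct toy_queue *q)
{
	assert(q->refcount > 0);
	if (--q->refcount == 0)
		q->freed = true;
}

/*
 * Model of add_disk() pinning the queue. The return value of
 * toy_get_queue() is passed through unchanged, so a NONZERO result
 * is the failure case and is the one a warning should fire on.
 */
static int toy_add_disk(struct toy_disk *disk, struct toy_queue *q)
{
	disk->queue = q;
	return toy_get_queue(q);
}

/* Model of disk_release() dropping the reference taken above. */
static void toy_disk_release(struct toy_disk *disk)
{
	toy_put_queue(disk->queue);
	disk->queue = NULL;
}
```

In this model, dropping the base reference while the disk is still
around leaves the queue alive, and it is only freed once
toy_disk_release() runs. A check written as "warn if the get helper
returns zero" would fire on every successful toy_add_disk(), which is
the inversion that needs flipping.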

Thanks.

--
tejun