Re: [PATCH 1/2] Disk hot removal causing oopses and fixes

From: Jarkko Lavinen
Date: Tue Nov 03 2009 - 06:55:23 EST

Hi Steven

Sorry for the late reply.

> It has to reference-count its objects so that they are not freed as long
> as they are used by upper layers,

The block layer and device removal seem to be designed with a
top-down approach. Although the disk is referenced from
__blkdev_get(), the disk's request queue is not. Also,
blk_cleanup_queue() calls elevator_exit() without caring whether
anyone still uses the elevator.

A remedy would be to take a reference to the request queue in
__blkdev_get() and move elevator_exit() from blk_cleanup_queue()
to blk_release_queue().
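
To illustrate the idea, here is a rough user-space model of it. This
is not kernel code and not the patch itself: every toy_* name is made
up for this sketch, and the structs only mimic the parts of the
request queue that matter for the lifetime argument (a reference
count, a dead flag, and an elevator).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the elevator attached to a queue. */
struct toy_elevator {
    bool alive;
};

/* Hypothetical stand-in for a request queue. */
struct toy_queue {
    int refcount;             /* models the queue's reference count */
    bool dead;                /* set on cleanup; blocks new users */
    bool released;            /* set once the last reference is gone */
    struct toy_elevator elv;
};

static void toy_queue_init(struct toy_queue *q)
{
    q->refcount = 1;          /* the driver's own reference */
    q->dead = false;
    q->released = false;
    q->elv.alive = true;
}

/* Models the release callback: the elevator is torn down here,
 * i.e. only after the last reference has been dropped. */
static void toy_release_queue(struct toy_queue *q)
{
    q->elv.alive = false;
    q->released = true;
}

static void toy_get_queue(struct toy_queue *q)
{
    assert(q->refcount > 0);
    q->refcount++;
}

static void toy_put_queue(struct toy_queue *q)
{
    assert(q->refcount > 0);
    if (--q->refcount == 0)
        toy_release_queue(q);
}

/* Models the cleanup path after the proposed change: mark the queue
 * dead and drop the driver's reference, but leave the elevator alone. */
static void toy_cleanup_queue(struct toy_queue *q)
{
    q->dead = true;
    toy_put_queue(q);
}

/* Models the open path: an opener pins the queue, not just the disk. */
static bool toy_open(struct toy_queue *q)
{
    if (q->dead)
        return false;
    toy_get_queue(q);
    return true;
}

static void toy_close(struct toy_queue *q)
{
    toy_put_queue(q);
}
```

In this model, if one opener still holds the device when
toy_cleanup_queue() runs, the queue is marked dead but the elevator
stays alive until the final toy_close().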

I've tried this and it works fine. I am unable to cause an oops in
the elevator or anywhere else with hot card removal. This is how
it should be: the queue is first marked dead, so no new requests
are added to it; old requests are flushed by elevator_exit() before
the queue is released; and the dead status ensures they never
reach the request issue function.

The caveat is that when a request queue is initialized with
blk_init_queue(), the caller provides a pointer to a spinlock. When
blk_cleanup_queue() is called, drivers release the lock before
elevator_exit() has finished. In the mmc driver, the struct containing
queue_lock is released immediately after blk_cleanup_queue() returns.
Later, when the request queue is released, elevator_exit() tries to
use the released spinlock.

A somewhat hacky workaround is to reset the queue_lock pointer to
the request queue's internal lock in blk_cleanup_queue().
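
The lock lifetime problem and the workaround can be modelled in user
space roughly as follows. Again, this is only a sketch: the lk_*
names are invented, and a "valid" flag stands in for whether the
memory holding the lock is still allocated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock; 'valid' models "memory not yet freed". */
struct lk_lock {
    bool valid;
};

/* Hypothetical queue with its own embedded lock plus the pointer
 * to the lock that is actually used. */
struct lk_queue {
    struct lk_lock internal_lock;
    struct lk_lock *queue_lock;
};

/* Hypothetical driver-private struct that owns the supplied lock. */
struct lk_driver {
    struct lk_lock lock;
};

/* Models queue init: the caller hands in a pointer to its own lock. */
static void lk_init_queue(struct lk_queue *q, struct lk_lock *driver_lock)
{
    q->internal_lock.valid = true;
    q->queue_lock = driver_lock ? driver_lock : &q->internal_lock;
}

/* Models the workaround: on cleanup, stop pointing at the driver's
 * lock and fall back to the lock embedded in the queue. */
static void lk_cleanup_queue(struct lk_queue *q)
{
    q->queue_lock = &q->internal_lock;
}

/* Models the driver freeing its private struct right after cleanup. */
static void lk_driver_free(struct lk_driver *d)
{
    d->lock.valid = false;
}

/* Models the late release path: reports whether the lock the queue
 * would take at that point still exists. */
static bool lk_release_is_safe(const struct lk_queue *q)
{
    return q->queue_lock->valid;
}
```

Without the pointer reset in lk_cleanup_queue(), lk_release_is_safe()
would report false once the driver struct is gone, which corresponds
to the use of the released spinlock described above.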

Jarkko Lavinen