Re: Reference counting of MMC host driver modules

From: Enrik Berkhan
Date: Mon Jan 12 2009 - 11:32:19 EST


Hi,

Pierre Ossman wrote:
> Stefan Richter <stefanr@xxxxxxxxxxxxxxxxx> wrote:
> > So, if somebody asked me to copy my ieee1394/sbp2 safeguard into
> > firewire/fw-sbp2, I would reject that on the grounds that killing the
> > connection to the FireWire disk is the *expected result* of
> > # modprobe -r firewire-ohci
> I have to agree with Stefan's reasoning here. The reference counting is
> about protecting kernel integrity, not about saving the user's foot.

Yes, meanwhile, I have to admit you all are right and I have been
looking in the wrong direction.

My original problem was the following:

- MD raid0 made of some MMC/SD devices
- raid0 activated
- rmmod mmc_host_driver (user error, not noticing the raid is active)
- insmod mmc_host_driver (user error, still not noticing the raid is active)
- write to raid0 device
- kernel crash

I have debugged this a little and found the following reason for the
crash:

When removing the mmc_host_driver, everything seems to be fine; the
MMC/SD block device has been deactivated by mmc_blk_remove(), that in
turn has stopped the queue via mmc_cleanup_queue(). mmc_cleanup_queue()
calls blk_cleanup_queue() on the underlying struct request_queue. By
this, the reference count of the struct request_queues kboj drops to
zero. The MD driver still has the block device open and, actually,
things work fine unless the memory of the struct request_queue isn't
touched, because it is marked dead. Of course, accessing the MD device
returns EIO, but that's fine.

When the mmc_host_driver is reloaded, new struct request_queues will be
allocated and with some probability, the old memory will be re-used for
them or the old memory locations will be re-used for something else. The
key point is that the queues still in use by the MD layer will
effectively no longer be marked dead or completely corrupted.

I don't know if this is a problem of the MD layer, the MMC/SD block
driver, a more general problem or even no problem at all :)

How can this be fixed?

Enrik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/