Re: [ANNOUNCE] block device interfaces changes

From: Andries.Brouwer@cwi.nl
Date: Sat Jan 08 2000 - 20:47:47 EST


> From viro@math.psu.edu Sat Jan 8 21:10:14 2000

Ha Alexander,

1) ioctl.
This has nothing to do with the present topic.

        First of all, I _don't_ _want_ ioctl().

We have it, and this time we are not going to get rid of it.

2) character and block devices.
You still talk about these as if there exists a dichotomy.
But that is not a productive point of view, I think. There is
a class: all devices, and a subclass: all block devices.
Some things need to be handled for all devices - and there is
no need to distinguish - like stat, mknod, open in the userspace
interface, and register_device() and getname() kernel-internally.
We will have lots of devices, both block and non-block devices,
identified by 64-bit numbers at first, by arbitrary strings later.
Thus there is a cache, an associative memory, to find them in.
This is common to all devices. There is no reason to implement
this twice, first for block devices and then for non-block devices.
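
To make the point concrete, here is a rough userspace sketch of one
cache serving both classes. It is not kernel code and not what either
of us has implemented; struct device, dev_lookup(), dev_register() and
the hash size are hypothetical names made up for the illustration.
The only idea taken from the text above is that a single associative
lookup, keyed by class plus number, is enough.

```c
#include <stdint.h>
#include <stdlib.h>

/* All names here are hypothetical; this is not the kernel's API. */
enum dev_class { DEV_CHAR, DEV_BLOCK };

struct device {
    enum dev_class cls;    /* the b/c bit is part of the key */
    uint64_t number;       /* 64-bit device number */
    struct device *next;   /* hash chain */
};

#define DEV_HASH_SIZE 64
static struct device *dev_hash[DEV_HASH_SIZE];

static unsigned dev_hashfn(enum dev_class cls, uint64_t number)
{
    return (unsigned)((number ^ (number >> 32) ^ (uint64_t)cls)
                      % DEV_HASH_SIZE);
}

/* Find the device for (cls, number); NULL if none is registered. */
struct device *dev_lookup(enum dev_class cls, uint64_t number)
{
    struct device *d = dev_hash[dev_hashfn(cls, number)];

    while (d && !(d->cls == cls && d->number == number))
        d = d->next;
    return d;
}

/* Register a device; returns the existing entry if already present. */
struct device *dev_register(enum dev_class cls, uint64_t number)
{
    struct device *d = dev_lookup(cls, number);
    unsigned h;

    if (d)
        return d;
    d = malloc(sizeof *d);
    if (!d)
        return NULL;
    d->cls = cls;
    d->number = number;
    h = dev_hashfn(cls, number);
    d->next = dev_hash[h];
    dev_hash[h] = d;
    return d;
}
```

A block device and a character device with the same number get two
distinct entries, and nothing in the lookup path is written twice.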

3) dev_t
> * New type (struct block_device) is defined.
> We have a cache of such objects, indexed by dev_t.
> struct block_device * is going to replace kdev_t
> for block devices.
>
> Good!
> What do you mean by dev_t?

        That cache is populated upon foo_read_inode() (by
        init_special_inode()) and the search key in this cache
        is (major+minor). Notice that such searches should be done
        _only_ upon the interaction with user / filesystem.
        Internally kernel should just use the pointer. Period.

Yes, that is how my code works. The pointer type is called kdev_t.
But it works like that both for character and for block devices, and
the search key is b/c+major+minor. And my timing is different -
foo_read_inode() is the wrong moment to do this - see below.

When you set up the data structures, keep in mind that the kernel
often derives one dev_t from another. If you want to avoid lookups
then you need to be able to find the entire disk from a partition,
and all partitions from a disk. Check all cases of MKDEV().
I found eliminating all occurrences of MKDEV one of the more
time-consuming parts of the change. Many driver structures
need to change a little.
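
As an illustration of "find the entire disk from a partition, and all
partitions from a disk" without recomputing a number, a sketch could
look like the following. The types and function names (struct disk,
struct partition, disk_init(), partition_to_disk(), disk_partition())
and the limit of 16 partitions are assumptions for the example, not
existing kernel structures.

```c
#include <stddef.h>

#define MAX_PARTS 16   /* arbitrary limit for the sketch */

struct disk;

/* Hypothetical types: each partition points back at its disk. */
struct partition {
    struct disk *disk;   /* the whole disk containing this partition */
    int index;           /* 1-based partition number */
};

struct disk {
    const char *name;
    struct partition parts[MAX_PARTS];
    int nparts;
};

/* Set up the back pointers once, when the disk is registered. */
void disk_init(struct disk *d, const char *name, int nparts)
{
    int i;

    if (nparts < 0)
        nparts = 0;
    if (nparts > MAX_PARTS)
        nparts = MAX_PARTS;
    d->name = name;
    d->nparts = nparts;
    for (i = 0; i < nparts; i++) {
        d->parts[i].disk = d;
        d->parts[i].index = i + 1;
    }
}

/* Partition to whole disk: follow a pointer, no MKDEV(), no lookup. */
struct disk *partition_to_disk(struct partition *p)
{
    return p->disk;
}

/* Whole disk to partition number 'index'; NULL if out of range. */
struct partition *disk_partition(struct disk *d, int index)
{
    if (index < 1 || index > d->nparts)
        return NULL;
    return &d->parts[index - 1];
}
```

Every place that used to build a number with MKDEV() and look it up
would instead follow one of these pointers.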

Another thing to watch out for is that there may be no device
corresponding to a dev_t. If I read a CDROM then it may contain
special device nodes for many devices that are not represented
in the kernel. Thus, the init_special_inode() in isofs_read_inode()
must not yet create the device_struct; it is too early. Only when
the inode is opened should you try to construct the device struct.
(Maybe you saw my October '99 post on this stuff. I needed MKBDEV,
MKCDEV and MKXDEV, where the latter occurs in a
"dev_t number only, no device" context.)
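
The timing can be sketched as follows, again in plain userspace C.
The names read_special_inode(), open_special_inode() and
driver_register(), and the tiny fixed-size table, are hypothetical;
they only stand in for "record the number when the inode is read,
attach a device when it is opened".

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical types for the sketch. */
struct device {
    uint64_t number;
    int open_count;
};

struct inode {
    uint64_t rdev;        /* number only, as found on the filesystem */
    struct device *dev;   /* stays NULL until a successful open */
};

#define MAX_DEVS 8
static struct device known_devs[MAX_DEVS];
static int n_known;

/* A driver announces a device it really has. */
int driver_register(uint64_t number)
{
    if (n_known == MAX_DEVS)
        return -1;
    known_devs[n_known].number = number;
    known_devs[n_known].open_count = 0;
    n_known++;
    return 0;
}

static struct device *find_device(uint64_t number)
{
    int i;

    for (i = 0; i < n_known; i++)
        if (known_devs[i].number == number)
            return &known_devs[i];
    return NULL;
}

/* Reading the inode records the number and nothing else. */
void read_special_inode(struct inode *ino, uint64_t rdev)
{
    ino->rdev = rdev;
    ino->dev = NULL;
}

/* Only open tries to attach a device; -1 means "no such device". */
int open_special_inode(struct inode *ino)
{
    if (!ino->dev) {
        ino->dev = find_device(ino->rdev);
        if (!ino->dev)
            return -1;
    }
    ino->dev->open_count++;
    return 0;
}
```

A CDROM full of device nodes for hardware that is not present then
costs nothing: the inodes carry numbers, and no device structures
are created for them.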

> Don't forget to include the bdev/cdev bit in devno.]

        No, thanks. Character devices _are_ different; by coincidence we
        are using 16bit integers to refer both to block and character
        devices, but that's about it.

If you forget to take the b/c bit into account then you take away the
possibility of simplifying the kernel later.

4) Names.

> I needed a little more stuff.
> One thing is that the kernel does not need names for files,
> but it needs names for devices since it uses device names in
> printk. Thus you need something like
> char (*devname)(struct block_device *bdev);

        Umm... I'ld rather had that done statically, in register_...()
        time. No need to breed complexity...

That is what I did first. But it didn't suffice.
We want to generate the familiar names like hda6 and fd0h1200 and initrd,
and only the driver itself knows the algorithm to convert minors
to strings. But of course there is a generic routine that works
in most cases.
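
Roughly, the arrangement is a per-driver naming method with a generic
fallback. The sketch below is an illustration only: struct
block_device_sketch, get_devname() and both naming routines are
made-up names, the method writes into a caller-supplied buffer rather
than using the signature quoted above, and the IDE-like routine
assumes 64 minors per unit with minor 0 of each unit being the
whole disk.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for a block device object. */
struct block_device_sketch {
    const char *driver_name;   /* e.g. "hd" */
    int minor;
    /* Optional driver-supplied naming; NULL selects the generic one. */
    void (*devname)(const struct block_device_sketch *bdev,
                    char *buf, size_t len);
};

/* Generic routine: driver name followed by the minor number. */
static void generic_devname(const struct block_device_sketch *bdev,
                            char *buf, size_t len)
{
    snprintf(buf, len, "%s%d", bdev->driver_name, bdev->minor);
}

/* IDE-like routine, assuming 64 minors per unit: hda, hda6, hdb, ... */
static void ide_like_devname(const struct block_device_sketch *bdev,
                             char *buf, size_t len)
{
    int unit = bdev->minor / 64;
    int part = bdev->minor % 64;

    if (part == 0)
        snprintf(buf, len, "%s%c", bdev->driver_name, 'a' + unit);
    else
        snprintf(buf, len, "%s%c%d", bdev->driver_name, 'a' + unit, part);
}

/* What printk callers would use: the driver's method if it has one. */
void get_devname(const struct block_device_sketch *bdev,
                 char *buf, size_t len)
{
    if (bdev->devname)
        bdev->devname(bdev, buf, len);
    else
        generic_devname(bdev, buf, len);
}
```

Drivers with odd conventions such as fd0h1200 would supply their own
routine; everybody else leaves the pointer NULL.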

5) Devices and subdevices.

> Another thing is that I used two types, let us call them here
> major_struct and minor_struct, where a kdev_t was a pointer to
> a minor_struct, and the major_struct had a method
> kdev_t getdev(major_struct *dev, int minor);
> to retrieve a minor if it existed or create it if not.
> Probably this is part of your open, but it is not clear to me
> how you handle the difference between major_struct and minor_struct.
> They are rather different.

        major_struct is completely bogus. We don't have 1-1 correspondence
        between majors and drivers. We don't have 1-1 correspondence between
        majors and physical units (disks).

If major_struct displeases you so much that you don't understand
what I meant, please read driver_struct or device_struct instead.
But note that there are three levels:
partition: hda6, disk: hda, ide disk driver: hda, hdb, hdc, ...
When I say major_struct then I mean the disk level, not the partition
or the driver level. Things like media_change and read_partition_table
happen at this level. The request queue is at this level.

My kdev_t is the subdevice, that is, the b/c+major+minor thing.
I am trying to understand whether your struct block_device
represents the subdevice or the entire device.

Instead of subdevice I should perhaps say "logical device".
The examples of md and loop show relationships other than that
of partition to disk.

Andries



