Re: Volume Managers in Linux

Aaron Wrasman (awrasman@enchanted.net)
Tue, 3 Nov 1998 18:11:57 -0600


On Tue, Nov 03, 1998 at 03:01:11PM -0500, Theodore Y. Ts'o wrote:
> Date: Mon, 2 Nov 1998 17:31:19 -0600 (CST)
> From: Shawn Leas <sleas@ixion.honeywell.com>
>
> All well and good. BUT what of simple multi-volume block devices?
> Requiring userland to use EXT2 to get RAID or multivolume is broken.
> I just think MD is not the way to go about it.
>
> I've never claimed that the ext2 is the best way to do RAID; I think MD
> is the way to do that. However, allowing ext2 to be able to support
> filesystems which span multiple block devices is a good thing to do, and
> a cleaner way of supporting multivolume support. Examples of
> filesystems which do this include the UDF filesystem used by DVD-ROM's,
> and Digital Unix's Advanced Filesystem.
>
> The reason why I think the LVM approach is more complex than it needs to
> be is its approach of taking a disk, and chopping it into thousands of
> 4MB pieces, as if LVM were a Ginsu knife, and then having to stitch those
> pieces back together into LVM partitions.
>
> In fact, if you look at the AIX LVM implementation, yes, it does chop
> the disks into little-bitty 4M pieces, but the filesystem is also
> involved in the picture. So (typically of AIX) it has all of the
> complexity of both worlds. :-)
>
> Does the 4M ginsu-knife approach buy you something? Yes, it allows you
> to have infinitely configurable partitions, which can be scattered
> across the entire disk in a non-contiguous fashion. Whether or not this
> is a good thing or not can be debated. I will say that if the
> filesystem isn't involved, some of its optimizations to reduce seek
> times get thrown out the door since there are no guarantees whether
> adjacent 4M blocks are anywhere near each other or not. Then again,
> some people may prefer the ability to create partitions without needing
> any kind of advance planning as being more important than performance.
> (I don't, but clearly some people do.)
>
> - Ted
>
>

The filesystem has nothing to do with it. All an LV gives you on AIX is a way
to lay things out on multiple physical disks. LVs are just part of a volume
group. (A volume group is a group of up to 32 disks.) You can then lay the
space out however you want as logical volumes. For example, I want a 16 gig
logical volume across 16 disks, and I want it in the middle of each disk
because it is going to be the index for my database, which takes up the outer
and inner parts of the disks. Or better yet, I tell it to do the same but make
a mirrored copy on the other 16 disks in my volume group. I then create the
other 8 logical volumes that contain the data for my relational database. They
are on the inner and outer parts of the physical disks. I use 64 meg pieces
(or PPs, as AIX calls them) to do all this, not 4 megs. 4 megs is the default
if you use smaller disks (i.e. physical disks of less than 9 gig) and/or just
a few disks.
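
To put rough numbers on that layout, here is a small Python sketch of the PP
arithmetic. It is not AIX code: the function names are made up for
illustration, and it assumes the PPs are spread evenly across the disks.

```python
# Illustration only: PP arithmetic for the 16 gig LV across 16 disks.
# pps_needed() and spread_evenly() are hypothetical helpers, not AIX calls.
MB_PER_GB = 1024


def pps_needed(lv_size_mb, pp_size_mb):
    """Number of PPs needed to hold an LV of the given size, rounded up."""
    return -(-lv_size_mb // pp_size_mb)


def spread_evenly(total_pps, disk_count):
    """Split PPs across disks as evenly as possible; returns a per-disk list."""
    base, extra = divmod(total_pps, disk_count)
    return [base + (1 if i < extra else 0) for i in range(disk_count)]


if __name__ == "__main__":
    lv_mb = 16 * MB_PER_GB
    for pp_mb in (4, 64):
        total = pps_needed(lv_mb, pp_mb)
        per_disk = spread_evenly(total, 16)
        print(f"{pp_mb} MB PPs: {total} PPs total, {per_disk[0]} per disk")
```

With 64 meg PPs the LV is 256 PPs, or 16 per disk; with 4 meg PPs it would be
4096 PPs, or 256 per disk. The mirrored copy on the other 16 disks needs the
same number of PPs again.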

After I do this I then tell Informix or Oracle or what have you about the raw
partitions I just created.

Filesystems have nothing to do with it. But most people who use AIX never get
past the admin tool called smit. It tends to hide the options you have, and
the commands behind them, unless you look for them.

The filesystem does come into this when you grow a filesystem by adding more
PPs to its LV. But that makes sense: if you add 32 megs or 500 megs to an LV
that holds a filesystem, you want the filesystem to be able to use the extra
space as well.

Other things that LVs give you on AIX: you can make a mirror after the fact,
while the LV is in use. You can re-org the LVs as long as you have enough
unallocated PPs. For example, say you are tight on space and you have
everything mirrored. You need some extra space to export a database and do
a new import because you are restructuring the database. You don't have enough
unallocated disk space, so you break a few mirrors. You now have enough
unallocated space. You export your data and re-import it. You re-make your
mirrors. This is much faster than tape backups.
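
The break-the-mirror trick is really just PP bookkeeping. A toy Python model
of it follows; the VolumeGroup class, the LV names and the PP counts are all
invented for the example, and it only counts PPs. It says nothing about where
they sit on the disks or how AIX tracks them.

```python
# Toy model: count allocated and unallocated PPs in one volume group.
# Everything here (class, names, numbers) is hypothetical, not an AIX API.
class VolumeGroup:
    def __init__(self, total_pps):
        self.total_pps = total_pps
        self.lvs = {}  # LV name -> (logical size in PPs, number of copies)

    def free_pps(self):
        used = sum(size * copies for size, copies in self.lvs.values())
        return self.total_pps - used

    def create_lv(self, name, size, copies=1):
        if size * copies > self.free_pps():
            raise ValueError("not enough unallocated PPs")
        self.lvs[name] = (size, copies)

    def set_copies(self, name, copies):
        """Add or drop mirror copies; adding needs unallocated PPs."""
        size, old = self.lvs[name]
        if size * (copies - old) > self.free_pps():
            raise ValueError("not enough unallocated PPs")
        self.lvs[name] = (size, copies)

    def remove_lv(self, name):
        del self.lvs[name]


if __name__ == "__main__":
    vg = VolumeGroup(total_pps=1000)
    vg.create_lv("data", 400, copies=2)   # mirrored: uses 800 PPs
    print(vg.free_pps())                  # too little for a 500 PP export
    vg.set_copies("data", 1)              # break the mirror
    vg.create_lv("scratch", 500)          # room for the export now
    print(vg.free_pps())
    vg.remove_lv("scratch")               # export and re-import are done
    vg.set_copies("data", 2)              # re-make the mirror
    print(vg.free_pps())
```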

Things it doesn't give you: moving things from one volume group (VG) to
another.

Problems I have seen with it (rarely):

It keeps track of the LVs and the VG both on the disks and in the ODM (or
registry, if you like).

There is a good bit of overhead for the VG and LVs. A copy is kept on each
physical disk. If your ODM gets corrupted, you have to tell the system to
completely forget about your VG and then tell it to re-import the VG from the
disks.
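
As a rough picture of that recovery, here is a Python sketch in which the
host-side record is thrown away and rebuilt from a copy kept on the disks.
The data layout, the import_vg() helper and the names in it are assumptions
made up for the illustration; this is not how the ODM or the on-disk VG
information is actually structured.

```python
# Illustration only: rebuild a corrupted host-side record from on-disk copies.
# The dict layout, import_vg() and all names are hypothetical.
import copy


def import_vg(disks):
    """Return a fresh record taken from the first disk with a readable copy."""
    for disk in disks:
        descriptor = disk.get("descriptor")
        if descriptor is not None:
            return copy.deepcopy(descriptor)
    raise LookupError("no readable descriptor on any disk")


if __name__ == "__main__":
    descriptor = {"vg": "datavg", "lvs": {"index_lv": 256, "data_lv": 512}}
    # Every physical disk in the VG carries its own copy of the description.
    disks = [{"name": f"disk{i}", "descriptor": copy.deepcopy(descriptor)}
             for i in range(4)]

    registry = {"datavg": {"vg": "datavg", "lvs": {}}}  # corrupted record
    del registry["datavg"]                   # forget the VG completely
    registry["datavg"] = import_vg(disks)    # re-import it from the disks
    print(registry["datavg"]["lvs"])
```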

Most of my examples involve databases, and that is where we have found it to
be useful: where we have some idea of how the data is going to behave, so we
can reduce the time it takes to access the data.

Aaron
