Re: 2.2.16 md botch; please revert

From: Neil Brown (neilb@cse.unsw.edu.au)
Date: Mon Jun 12 2000 - 00:05:47 EST


On Friday June 9, Holger.Smolinski@de.ibm.com wrote:
>
>
> Folks,
> Sorry for any inconvenience with the md patch.
> The problem on S/390 systems is, that we have
> blocksizes deviating from the nice 1024 at PCs,
> which caused md not to work with 2k or 4k devices
> underlying. Hence reverting the patch will cause
> problems on S/390...
> btw.: has anybody ever tried md on a MO device
> or something similar with blocksize of 2k and above?

I can see that that is a real problem that needs to be addressed, but
what I cannot see is how the patch in question was supposed to address
it, except by accident.

The patch added the line:
+ md_blocksizes[minor] <<= FACTOR_SHIFT(FACTOR((md_dev+minor)));
fairly early in do_md_run.
Before this runs, md_blocksizes[minor] will be 1024, thanks to
md_geninit, and FACTOR((md_dev+minor)) will be
  0 if you asked for 4K cluster size
  1 if you asked for 8K cluster size
  2 if you asked for 16K cluster size
etc. This assumes 4K PAGE sizes, which is true except on some SPARC and
Alpha.

FACTOR_SHIFT(x) is basically x+2 unless you have a different PAGE_SIZE, in
which case it will be x+3.

So the effect of this patch is to change md_blocksizes[minor] from
1024, to atleast 4096, and possibly much more.
With your extra three lines:
> + if ( md_blocksizes[minor] > PAGE_SIZE ) {
> + md_blocksizes[minor] = PAGE_SIZE;
> + }
it becomes exactly
    md_blocksizes[minor] = PAGE_SIZE
as the first change makes it atleast PAGE_SIZE.

Now md_blocksizes is the array used for blksize_size[MD_MAJOR], and so
contains the current blocksize of the device. Setting it at do_md_run
time sets a default. It can be changed later by filesystems if they
want a different block size. So I can see that setting it to PAGE_SIZE
is useful for you, as PAGE_SIZE is greater than your block size, but I
don't think that it is what you really want.

What I suspect you *really* wanted to do is to set md_blocksizes[minor]
to be the maximum of blksize_size[major][minor] for each underlying
device, where this defaults to 1024 if blksize_size[major] is NULL.

Something like:
  blksz=0;
  for (i=0; i<md_dev[minor].nb_dev; i++) {
    int dev = md_dev[minor].devices[i].dev;
    int bs = 1024;
    if (blksize_size[MAJOR(dev)] && blksize_size[MAJOR(dev)][MINOR(dev)])
      bs = blksize_size[MAJOR(dev)][MINOR(dev)];
    if (bs > blksz) blksz = bs;
  }
  if (bs == 0) blksz = 1024;
  md_blocksizes[minor]=bs;

though I haven't actually tried this (but then I don't have any
devices with have block sizes > 1024).

NeilBrown

>
> The patch below limits the md_blocksize
> to PAGE SIZE, which should at least keep the buffer
> cache working...
>
> diff -u -r1.2 md.c
> --- drivers/block/md.c 2000/05/03 14:37:31 1.2
> +++ drivers/block/md.c 2000/06/09 13:18:51
> @@ -447,6 +447,9 @@
> factor = min = 1 << FACTOR_SHIFT(FACTOR((md_dev+minor)));
>
> md_blocksizes[minor] <<= FACTOR_SHIFT(FACTOR((md_dev+minor)));
> + if ( md_blocksizes[minor] > PAGE_SIZE ) {
> + md_blocksizes[minor] = PAGE_SIZE;
> + }
>
> for (i=0; i<md_dev[minor].nb_dev; i++)
> if (md_dev[minor].devices[i].size<min)
>
> Gruesse / Regards
>
> Dr. Holger Smolinski
> Linux for S/390 Design & Development,
> IBM Deutschland Entwicklung GmbH, Schoenaicher Str. 220, D-71032 Boeblingen
> Phone/FAX: +49 7031 16 (x902) - 4652/3456
>
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.rutgers.edu
> Please read the FAQ at http://www.tux.org/lkml/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Jun 15 2000 - 21:00:24 EST