Re: [PATCH 16/16] dm-zoned: ensure only power of 2 zone sizes are allowed

From: Luis Chamberlain
Date: Thu Apr 28 2022 - 13:34:28 EST


On Thu, Apr 28, 2022 at 08:42:41AM +0900, Damien Le Moal wrote:
> On 4/28/22 01:02, Pankaj Raghav wrote:
> > From: Luis Chamberlain <mcgrof@xxxxxxxxxx>
> >
> > Today dm-zoned relies on the assumption that you have a zone size
> > with a power of 2. Even though the block layer today enforces this
> > requirement, these devices do exist and so provide a stop-gap measure
> > to ensure these devices cannot be used by mistake.
> >
> > Signed-off-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>
> > Signed-off-by: Pankaj Raghav <p.raghav@xxxxxxxxxxx>
> > ---
> > drivers/md/dm-zone.c | 12 ++++++++++++
> > 1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/md/dm-zone.c b/drivers/md/dm-zone.c
> > index 57daa86c19cf..221e0aa0f1a7 100644
> > --- a/drivers/md/dm-zone.c
> > +++ b/drivers/md/dm-zone.c
> > @@ -231,6 +231,18 @@ static int dm_revalidate_zones(struct mapped_device *md, struct dm_table *t)
> > struct request_queue *q = md->queue;
> > unsigned int noio_flag;
> > int ret;
> > + struct block_device *bdev = md->disk->part0;
> > + sector_t zone_sectors;
> > + char bname[BDEVNAME_SIZE];
> > +
> > + zone_sectors = bdev_zone_sectors(bdev);
> > +
> > + if (!is_power_of_2(zone_sectors)) {
> > + DMWARN("%s: %s only power of two zone size supported\n",
> > + dm_device_name(md),
> > + bdevname(bdev, bname));
> > + return 1;
> > + }
>
> Why ?
>
> See my previous email about still allowing ZC < ZS for non power of 2 zone
> size drives. dm-zoned can easily support non power of 2 zone size as long
> as ZC == ZS for all zones.

Great, thanks for the heads up.

> The problem with dm-zoned is ZC < ZS *AND* potentially variable ZC per
> zone. That cannot be supported easily (still not impossible, but
> definitely a lot more complex).

I see, thanks.

Testing would still be required to ensure this all works well with npo2
(non power of 2) zone sizes, so I'd prefer to do that as a separate
effort, even if it is easy. For now I think it makes sense to reject
such devices in dm-zoned, as this is not yet well tested.
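
To make the testing concern concrete, here is a rough userspace sketch
(not dm-zoned code; sector_t is a stand-in typedef and every helper name
below is made up for illustration) of how shift/mask arithmetic that
assumes a power of 2 zone size differs from the generic division form:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sector_t; /* stand-in for the kernel type (assumption) */

/* Same idea as the kernel's is_power_of_2(): exactly one bit set. */
static bool is_po2(sector_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Zone index via shift: only valid if zone size == 1 << zone_shift. */
static uint64_t zone_no_po2(sector_t sector, unsigned int zone_shift)
{
	return sector >> zone_shift;
}

/* Zone index via division: valid for any non-zero zone size. */
static uint64_t zone_no_generic(sector_t sector, sector_t zone_sectors)
{
	return sector / zone_sectors;
}

/* Offset within the zone via mask: silently wrong for npo2 sizes. */
static sector_t zone_offset_mask(sector_t sector, sector_t zone_sectors)
{
	return sector & (zone_sectors - 1);
}

/* Offset within the zone via modulo: valid for any non-zero zone size. */
static sector_t zone_offset_generic(sector_t sector, sector_t zone_sectors)
{
	return sector % zone_sectors;
}
```

With a hypothetical 96 MiB zone (196608 sectors of 512 bytes), sector
400000 sits at offset 6784 of zone 2, but the mask form returns a
different offset without any error. Every such shift or mask would have
to be audited and tested before npo2 devices are let through.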

As with filesystem support, we've even gotten hints that support
for npo2 should be easy, but without proper testing it would not be
prudent to enable support for users yet.

One step at a time.

Luis