Re: [PATCH 16/16] dm-zoned: ensure only power of 2 zone sizes are allowed

From: Damien Le Moal
Date: Thu Apr 28 2022 - 17:44:16 EST


On 4/29/22 02:34, Luis Chamberlain wrote:
> On Thu, Apr 28, 2022 at 08:42:41AM +0900, Damien Le Moal wrote:
>> On 4/28/22 01:02, Pankaj Raghav wrote:
>>> From: Luis Chamberlain <mcgrof@xxxxxxxxxx>
>>>
>>> Today dm-zoned relies on the assumption that the zone size is a
>>> power of 2. Even though the block layer currently enforces this
>>> requirement, devices with non power of 2 zone sizes do exist, so provide
>>> a stop-gap measure to ensure these devices cannot be used by mistake.
>>>
>>> Signed-off-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>
>>> Signed-off-by: Pankaj Raghav <p.raghav@xxxxxxxxxxx>
>>> ---
>>> drivers/md/dm-zone.c | 12 ++++++++++++
>>> 1 file changed, 12 insertions(+)
>>>
>>> diff --git a/drivers/md/dm-zone.c b/drivers/md/dm-zone.c
>>> index 57daa86c19cf..221e0aa0f1a7 100644
>>> --- a/drivers/md/dm-zone.c
>>> +++ b/drivers/md/dm-zone.c
>>> @@ -231,6 +231,18 @@ static int dm_revalidate_zones(struct mapped_device *md, struct dm_table *t)
>>>  	struct request_queue *q = md->queue;
>>>  	unsigned int noio_flag;
>>>  	int ret;
>>> +	struct block_device *bdev = md->disk->part0;
>>> +	sector_t zone_sectors;
>>> +	char bname[BDEVNAME_SIZE];
>>> +
>>> +	zone_sectors = bdev_zone_sectors(bdev);
>>> +
>>> +	if (!is_power_of_2(zone_sectors)) {
>>> +		DMWARN("%s: %s only power of two zone size supported\n",
>>> +		       dm_device_name(md),
>>> +		       bdevname(bdev, bname));
>>> +		return 1;
>>> +	}
>>
>> Why ?
>>
>> See my previous email about still allowing ZC < ZS for non power of 2 zone
>> size drives. dm-zoned can easily support non power of 2 zone size as long
>> as ZC == ZS for all zones.
>
> Great, thanks for the heads up.
>
>> The problem with dm-zoned is ZC < ZS *AND* potentially variable ZC per
>> zone. That cannot be supported easily (still not impossible, but
>> definitely a lot more complex).
>
> I see, thanks.
>
> Testing would still be required to ensure this all works well with npo2.
> So I'd prefer to do that as a separate effort, even if it is easy. So
> for now I think it makes sense to avoid this as this is not yet well
> tested.
>
> As with filesystem support, we have even gotten hints that support
> for npo2 should be easy, but without proper testing it would not be
> prudent to enable support for users yet.
>
> One step at a time.

Yes, in general, I agree. But in this case, that will create kernel
versions that end up having partial support for zoned drives. Not ideal, to
say the least. So if the patches are not that big, I would rather see
everything go into a single release.
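
For illustration, the condition discussed above (ZC == ZS for every zone,
with a zone size that need not be a power of 2) could be sketched roughly as
follows. This is a standalone userspace sketch, not dm-zoned code: struct
zone_desc and all three helper names are hypothetical, and the real kernel
zone descriptor (struct blk_zone) carries more fields. The power of 2 test
mirrors the usual "exactly one bit set" check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified zone descriptor (not the kernel's struct blk_zone). */
struct zone_desc {
	uint64_t start;    /* first sector of the zone */
	uint64_t len;      /* zone size (ZS) in sectors */
	uint64_t capacity; /* zone capacity (ZC) in sectors, ZC <= ZS */
};

/* True when zone_sectors has exactly one bit set, i.e. is a power of 2. */
static bool zone_size_is_po2(uint64_t zone_sectors)
{
	return zone_sectors != 0 && (zone_sectors & (zone_sectors - 1)) == 0;
}

/*
 * Zone index of a sector. A plain division works for any zone size;
 * a shift-based lookup is only valid when the zone size is a power of 2.
 */
static uint64_t zone_index(uint64_t sector, uint64_t zone_sectors)
{
	assert(zone_sectors != 0);
	return sector / zone_sectors;
}

/*
 * True when every zone has ZC == ZS, the case described above as easy
 * to support. Any zone with ZC < ZS makes this return false.
 */
static bool all_zones_fully_usable(const struct zone_desc *zones,
				   size_t nr_zones)
{
	for (size_t i = 0; i < nr_zones; i++) {
		if (zones[i].capacity != zones[i].len)
			return false;
	}
	return true;
}
```

With helpers like these, a target could accept a device whose zone size is
not a power of 2 when all_zones_fully_usable() holds, and reject only the
ZC < ZS case, instead of rejecting on zone_size_is_po2() alone.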

--
Damien Le Moal
Western Digital Research