Re: [PATCH] block: don't set GD_NEED_PART_SCAN if scan partition failed

From: Ming Lei
Date: Wed Mar 22 2023 - 03:59:40 EST


On Wed, Mar 22, 2023 at 11:59:26AM +0800, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@xxxxxxxxxx>
>
> Currently if disk_scan_partitions() failed, GD_NEED_PART_SCAN will still
> set, and partition scan will be proceed again when blkdev_get_by_dev()
> is called. However, this will cause a problem that re-assemble partitioned
> raid device will creat partition for underlying disk.
>
> Test procedure:
>
> mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0
> sgdisk -n 0:0:+100MiB /dev/md0
> blockdev --rereadpt /dev/sda
> blockdev --rereadpt /dev/sdb
> mdadm -S /dev/md0
> mdadm -A /dev/md0 /dev/sda /dev/sdb
>
> Test result: underlying disk partition and raid partition can be
> observed at the same time
>
> Note that this can still happen in come corner cases that
> GD_NEED_PART_SCAN can be set for underlying disk while re-assemble raid
> device.
>
> Fixes: e5cfefa97bcc ("block: fix scan partition for exclusively open device again")
> Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>

The issue still can't be avoided completely, such as, after rebooting,
/dev/sda1 & /dev/md0p1 can be observed at the same time. And this one
should be underlying partitions scanned before re-assembling raid, I
guess it may not be easy to avoid.

Also seems the following change added in e5cfefa97bcc isn't necessary:

/* Make sure the first partition scan will be proceed */
if (get_capacity(disk) && !(disk->flags & GENHD_FL_NO_PART) &&
!test_bit(GD_SUPPRESS_PART_SCAN, &disk->state))
set_bit(GD_NEED_PART_SCAN, &disk->state);

since the following disk_scan_partitions() in device_add_disk() should cover
partitions scan.

> ---
> block/genhd.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/block/genhd.c b/block/genhd.c
> index 08bb1a9ec22c..a72e27d6779d 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -368,7 +368,6 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
> if (disk->open_partitions)
> return -EBUSY;
>
> - set_bit(GD_NEED_PART_SCAN, &disk->state);
> /*
> * If the device is opened exclusively by current thread already, it's
> * safe to scan partitons, otherwise, use bd_prepare_to_claim() to
> @@ -381,12 +380,19 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
> return ret;
> }
>
> + set_bit(GD_NEED_PART_SCAN, &disk->state);
> bdev = blkdev_get_by_dev(disk_devt(disk), mode & ~FMODE_EXCL, NULL);
> if (IS_ERR(bdev))
> ret = PTR_ERR(bdev);
> else
> blkdev_put(bdev, mode & ~FMODE_EXCL);
>
> + /*
> + * If blkdev_get_by_dev() failed early, GD_NEED_PART_SCAN is still set,
> + * and this will cause that re-assemble partitioned raid device will
> + * creat partition for underlying disk.
> + */
> + clear_bit(GD_NEED_PART_SCAN, &disk->state);

I feel GD_NEED_PART_SCAN becomes a bit hard to follow.

So far, it is only consumed by blkdev_get_whole(), and cleared in
bdev_disk_changed(). That means partition scan can be retried
if bdev_disk_changed() fails.

Another mess is that more drivers start to touch this flag, such as
nbd/sd, probably it is better to change them into one API of
blk_disk_need_partition_scan(), and hide implementation detail
to drivers.


thanks,
Ming