On Thu, Jul 17, 2025 at 05:11:50PM +0800, Zizhi Wo wrote:
Currently, we have the following test scenario:
disk_container=$(
${docker} run...kata-runtime...io.kubernetes.docker.type=container...
)
docker_id=$(
${docker} run...kata-runtime...io.kubernetes.docker.type=container...
io.katacontainers.disk_share='{"src":"/dev/sdb","dest":"/dev/test"}'...
)
${docker} stop "$disk_container"
${docker} exec "$docker_id" mount /dev/test /tmp   # --> success!
When "disk_container" is started, a series of block devices is
created. During the startup of "docker_id", /dev/test is created with
mknod as a node for one of those devices. After "disk_container" is
stopped, the sda/sdb/sdc disks it created are deleted, yet mounting
/dev/test still succeeds.
The reason is that runc calls unshare, which triggers clone_mnt() and
bumps the sb->s_active reference count. As long as "docker_id" has not
exited, the superblock keeps that active reference. So on the next
mount, the old superblock is found and reused in sget_fc(), and the
mount succeeds even though the underlying device no longer exists. The
whole process can be reduced to the following steps:
mkfs.ext4 -F /dev/sdb
mount /dev/sdb /mnt                      # pins the superblock
mknod /dev/test b 8 16                   # second node for sdb [8:16]
echo 1 > /sys/block/sdb/device/delete    # delete the underlying disk
mount /dev/test /mnt1                    # -> mount still succeeds
This behavior was introduced by commit aca740cecbe5 ("fs: open block
device after superblock creation"). Previously, the block device was
opened before the superblock lookup, so a deleted device made the
mount fail. Now the device is only opened when a new superblock is set
up; if an existing superblock can be reused, the block device is never
opened again.
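For context, the relevant path in fs/super.c now looks roughly like
this (a condensed paraphrase with error handling and locking elided,
so treat it as a sketch rather than the verbatim kernel source):

/* Superblock reuse is keyed purely on the device number: */
static int super_s_dev_test(struct super_block *s, struct fs_context *fc)
{
	return !(s->s_iflags & SB_I_RETIRED) &&
	       s->s_dev == *(dev_t *)fc->sget_key;
}

int get_tree_bdev(struct fs_context *fc, ...)
{
	dev_t dev;
	...
	/* Resolves the path to a dev_t only; the bdev is not opened. */
	error = lookup_bdev(fc->source, &dev);
	...
	fc->sget_key = &dev;
	s = sget_fc(fc, super_s_dev_test, super_s_dev_set);
	if (s->s_root) {
		/* Existing superblock matched on dev_t: reused without
		 * the block device ever being opened again. */
		...
	} else {
		/* Only a brand-new superblock opens the device. */
		error = setup_bdev_super(s, fc->sb_flags, fc);
		...
	}
	...
}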
Would it be possible to additionally open the block device in
read-only mode in super_s_dev_test() for verification? Or is there a
better way to avoid this issue?
As long as you use the new mount API you should pass
FSCONFIG_CMD_CREATE_EXCL, which will refuse to mount if a superblock
for the device already exists. IOW, it ensures that you cannot
silently reuse a superblock.
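For illustration, a minimal userspace sketch of that approach; this is
untested, assumes glibc >= 2.36 for the fsopen()/fsconfig()/fsmount()/
move_mount() wrappers and a >= 6.6 kernel for FSCONFIG_CMD_CREATE_EXCL,
and reuses /dev/test and /mnt1 from the reproducer above:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mount.h>

#ifndef FSCONFIG_CMD_CREATE_EXCL
#define FSCONFIG_CMD_CREATE_EXCL 8	/* linux/mount.h, kernel >= 6.6 */
#endif

static void die(const char *msg)
{
	perror(msg);
	exit(EXIT_FAILURE);
}

int main(void)
{
	int fsfd, mntfd;

	fsfd = fsopen("ext4", FSOPEN_CLOEXEC);
	if (fsfd < 0)
		die("fsopen");

	if (fsconfig(fsfd, FSCONFIG_SET_STRING, "source", "/dev/test", 0))
		die("fsconfig(source)");

	/*
	 * CREATE_EXCL: fail with EBUSY if a superblock for this device
	 * already exists, instead of silently reusing it.
	 */
	if (fsconfig(fsfd, FSCONFIG_CMD_CREATE_EXCL, NULL, NULL, 0))
		die("fsconfig(create)");

	mntfd = fsmount(fsfd, FSMOUNT_CLOEXEC, 0);
	if (mntfd < 0)
		die("fsmount");

	if (move_mount(mntfd, "", AT_FDCWD, "/mnt1", MOVE_MOUNT_F_EMPTY_PATH))
		die("move_mount");

	return 0;
}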
Other than that, I think a blkdev_get_no_open(dev, false) after
lookup_bdev() should sort the issue out. Christoph?
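Roughly along these lines, i.e. an untested sketch against
get_tree_bdev() in fs/super.c, not an actual patch:

	/*
	 * Untested sketch: verify the device still exists before we go
	 * looking for a reusable superblock, by taking (and immediately
	 * dropping) a bdev reference.
	 */
	error = lookup_bdev(fc->source, &dev);
	if (error)
		return error;

	bdev = blkdev_get_no_open(dev, false);
	if (!bdev)
		return -ENXIO;	/* underlying device was deleted */
	blkdev_put_no_open(bdev);

	fc->sget_key = &dev;
	s = sget_fc(fc, super_s_dev_test, super_s_dev_set);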