On Wed, 14 Apr 2021 13:27:43 +0200 (CEST)
Miroslav Benes <mbenes@xxxxxxx> wrote:
Hi, an I/O error occurs on the nbd device when I use a livepatch of the
On Wed, 14 Apr 2021, xiaojun.zhao141@xxxxxxxxx wrote:
I found that the qemu-nbd process (started with qemu-nbd -t -c /dev/nbd0
nbd.qcow2) exits automatically when I patch nbd functions with
livepatch.
The relevant nbd source:
static int nbd_start_device_ioctl(struct nbd_device *nbd,
				  struct block_device *bdev)
{
	struct nbd_config *config = nbd->config;
	int ret;

	ret = nbd_start_device(nbd);
	if (ret)
		return ret;

	if (max_part)
		bdev->bd_invalidated = 1;
	mutex_unlock(&nbd->config_lock);
	ret = wait_event_interruptible(config->recv_wq,
			atomic_read(&config->recv_threads) == 0);
	if (ret)
		sock_shutdown(nbd);
	flush_workqueue(nbd->recv_workq);

	mutex_lock(&nbd->config_lock);
	nbd_bdev_reset(bdev);
	/* user requested, ignore socket errors */
	if (test_bit(NBD_RT_DISCONNECT_REQUESTED, &config->runtime_flags))
		ret = 0;
	if (test_bit(NBD_RT_TIMEDOUT, &config->runtime_flags))
		ret = -ETIMEDOUT;
	return ret;
}
So my understanding is that nbd spawns a number
(config->recv_threads) of workqueue jobs and then waits for them to
finish. The wait is interruptible, so any signal makes
wait_event_interruptible() return -ERESTARTSYS. The livepatch fake
signal is no exception. The error is then propagated back to
userspace, unless the user requested a disconnection or the timeout
flag is set. How does userspace then react to it? Is the wait
_interruptible because userspace sends a signal when
NBD_RT_DISCONNECT_REQUESTED is set? How does userspace handle
ordinary signals? This all sounds a bit strange, but I may easily be
missing something.
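The propagation described above can be sketched in plain userspace C.
This is only a model of the return-code handling at the end of
nbd_start_device_ioctl(), not kernel code: the function names
model_ioctl_result() and model_check_examples() are hypothetical, and
ERESTARTSYS is defined locally because it is a kernel-internal value
(512) that userspace headers do not export.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Kernel-internal errno value; defined here only for the model. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/*
 * Hypothetical model of the tail of nbd_start_device_ioctl():
 * wait_ret stands in for the result of wait_event_interruptible(),
 * the two booleans stand in for the NBD_RT_DISCONNECT_REQUESTED and
 * NBD_RT_TIMEDOUT runtime flags.
 */
int model_ioctl_result(int wait_ret, bool disconnect_requested,
		       bool timed_out)
{
	int ret = wait_ret;

	/* user requested, ignore socket errors */
	if (disconnect_requested)
		ret = 0;
	if (timed_out)
		ret = -ETIMEDOUT;
	return ret;
}

/* A few cases, including the one a fake signal would hit. */
void model_check_examples(void)
{
	/* Signal during the wait, no flags set: the error escapes. */
	assert(model_ioctl_result(-ERESTARTSYS, false, false) == -ERESTARTSYS);
	/* Signal during the wait, disconnect requested: error is masked. */
	assert(model_ioctl_result(-ERESTARTSYS, true, false) == 0);
	/* The timeout flag overrides everything else. */
	assert(model_ioctl_result(-ERESTARTSYS, true, true) == -ETIMEDOUT);
	/* Normal completion of the wait. */
	assert(model_ioctl_result(0, false, false) == 0);
}
```

In this model, the first case is the one of interest: with neither flag
set, an interrupted wait leaves ret at -ERESTARTSYS, which is what the
ioctl path then returns.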
While nbd waits for atomic_read(&config->recv_threads) == 0, klp
sends a fake signal to it, and then the qemu-nbd process exits.
The sysfs signal attribute that controlled this behavior was removed
in commit 10b3d52790e ('livepatch: Remove signal sysfs attribute').
Are there other ways to control this behavior? How?
No, there is no way currently. We send a fake signal automatically.
Regards
Miroslav
nbd, and I guess that a livepatch of any other kernel source may cause
the I/O error as well. For now, I have decided to work around this
problem by adding a livepatch for klp that disables the automatic fake
signal.