Re: the qemu-nbd process automatically exit with the commit 43347d56c 'livepatch: send a fake signal to all blocking tasks'

From: xiaojun . zhao141
Date: Wed Apr 14 2021 - 11:21:26 EST


On Wed, 14 Apr 2021 13:27:43 +0200 (CEST)
Miroslav Benes <mbenes@xxxxxxx> wrote:

> Hi,
>
> On Wed, 14 Apr 2021, xiaojun.zhao141@xxxxxxxxx wrote:
>
> > I found that the qemu-nbd process (started with qemu-nbd -t -c /dev/nbd0
> > nbd.qcow2) exits automatically when I apply a livepatch to nbd
> > functions.
> >
> > The relevant nbd source:
> > static int nbd_start_device_ioctl(struct nbd_device *nbd,
> >                                   struct block_device *bdev)
> > {
> >         struct nbd_config *config = nbd->config;
> >         int ret;
> >
> >         ret = nbd_start_device(nbd);
> >         if (ret)
> >                 return ret;
> >
> >         if (max_part)
> >                 bdev->bd_invalidated = 1;
> >         mutex_unlock(&nbd->config_lock);
> >         ret = wait_event_interruptible(config->recv_wq,
> >                         atomic_read(&config->recv_threads) == 0);
> >         if (ret)
> >                 sock_shutdown(nbd);
> >         flush_workqueue(nbd->recv_workq);
> >
> >         mutex_lock(&nbd->config_lock);
> >         nbd_bdev_reset(bdev);
> >         /* user requested, ignore socket errors */
> >         if (test_bit(NBD_RT_DISCONNECT_REQUESTED, &config->runtime_flags))
> >                 ret = 0;
> >         if (test_bit(NBD_RT_TIMEDOUT, &config->runtime_flags))
> >                 ret = -ETIMEDOUT;
> >         return ret;
> > }
>
> So my understanding is that nbd spawns a number
> (config->recv_threads) of workqueue jobs and then waits for them to
> finish. It waits interruptibly. Now, any signal would make
> wait_event_interruptible() return -ERESTARTSYS. The livepatch fake
> signal is no exception there. The error is then propagated back to
> userspace, unless the user requested a disconnection or a timeout
> was hit. How does userspace then react to it? Is
> _interruptible there because userspace sends a signal when
> NBD_RT_DISCONNECT_REQUESTED is set? How does userspace handle
> ordinary signals? This all sounds a bit strange, but I may easily be
> missing something.
>
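
The return-value handling described above can be modelled in plain
userspace C. This is only a sketch of the control flow at the end of the
quoted nbd_start_device_ioctl(), not kernel code: nbd_ioctl_result() and
MODEL_ERESTARTSYS are hypothetical names introduced for illustration, and
the two booleans stand in for the NBD_RT_DISCONNECT_REQUESTED and
NBD_RT_TIMEDOUT runtime flags. The value 512 is what the kernel's
include/linux/errno.h defines for ERESTARTSYS; it is not exported by the
userspace <errno.h>.

```c
#include <errno.h>
#include <stdbool.h>

/* Assumption: mirrors the kernel-internal ERESTARTSYS (512). */
#define MODEL_ERESTARTSYS 512

/*
 * Hypothetical model of how the quoted function picks its return value
 * once the interruptible wait has finished.
 *
 * wait_ret:             result of the wait (0, or -MODEL_ERESTARTSYS if
 *                       a signal, real or fake, interrupted it)
 * disconnect_requested: stands in for NBD_RT_DISCONNECT_REQUESTED
 * timed_out:            stands in for NBD_RT_TIMEDOUT
 */
int nbd_ioctl_result(int wait_ret, bool disconnect_requested, bool timed_out)
{
        int ret = wait_ret;

        /* user requested, ignore socket errors */
        if (disconnect_requested)
                ret = 0;
        /* checked last, so a timeout overrides the disconnect case */
        if (timed_out)
                ret = -ETIMEDOUT;
        return ret;
}
```

With neither flag set, an interrupted wait is passed through unchanged:
nbd_ioctl_result(-MODEL_ERESTARTSYS, false, false) returns -512, which is
the case the fake signal appears to trigger here.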
> > While nbd waits for atomic_read(&config->recv_threads) == 0, klp
> > sends a fake signal to it, and the qemu-nbd process then exits.
> > The sysfs attribute that controlled this behavior was removed in
> > commit 10b3d52790e 'livepatch: Remove signal sysfs attribute'. Is
> > there another way to control this behavior? If so, how?
>
> No, there is no way currently. We send a fake signal automatically.
>
> Regards
> Miroslav

Applying a livepatch to nbd causes I/O errors on the nbd device, and I
suspect that livepatching any other kernel code may cause the same I/O
errors. For now I have decided to work around the problem by adding a
livepatch to klp itself that disables the automatic fake signal.

Regards.