Re: [PATCH V6 01/10] Use copy_process in vhost layer

From: Christian Brauner
Date: Thu Dec 09 2021 - 04:32:35 EST


On Wed, Dec 08, 2021 at 04:13:27PM -0600, michael.christie@xxxxxxxxxx wrote:
> On 12/8/21 2:34 PM, Michael S. Tsirkin wrote:
> > On Mon, Nov 29, 2021 at 01:46:57PM -0600, Mike Christie wrote:
> >> The following patches made over Linus's tree, allow the vhost layer to do
> >> a copy_process on the thread that does the VHOST_SET_OWNER ioctl like how
> >> io_uring does a copy_process against its userspace app. This allows the
> >> vhost layer's worker threads to inherit cgroups, namespaces, address
> >> space, etc and this worker thread will also be accounted for against that
> >> owner/parent process's RLIMIT_NPROC limit.
> >>
> >> If you are not familiar with qemu and vhost here is more detailed
> >> problem description:
> >>
> >> Qemu will create vhost devices in the kernel which perform network, SCSI,
> >> etc IO and management operations from worker threads created by the
> >> kthread API. Because the kthread API does a copy_process on the kthreadd
> >> thread, the vhost layer has to use kthread_use_mm to access the Qemu
> >> thread's memory and cgroup_attach_task_all to add itself to the Qemu
> >> thread's cgroups.
> >>
> >> The problem with this approach is that we then have to add new functions/
> >> args/functionality for every thing we want to inherit. I started doing
> >> that here:
> >>
> >> https://urldefense.com/v3/__https://lkml.org/lkml/2021/6/23/1233__;!!ACWV5N9M2RV99hQ!ceUHd4m4MTJFOGccB9N5r7WonxVoYYT2XPiYwWt2-Vt1Y-DmQirRN8OqKozFLN1h73N6$
> >>
> >> for the RLIMIT_NPROC check, but it seems it might be easier to just
> >> inherit everything from the beginning, becuase I'd need to do something
> >> like that patch several times.
> >
> >
> > So who's merging this? Me? Did all patches get acked by appropriate
> > maintainers?
> >
>
> Not yet.
>
> Jens, The last open review comment was from you for the name change
> and additional patch description info.
>
> In this posting, I changed the name from:
>
> kernel_worker/kernel_worker_start
>
> to
>
> user_worker_create/user_worker_start
>
> I didn't do the start/create_user_worker format originally discussed
> because a while back Christoph had given me a review comment about trying
> to tie everything together into an API. Everything having the user_worker
> prefix felt nicer in that it was easy to tell the functions and flags went
> together, and so I thought it would handle his comment too.
>
> And patch:
>
> [PATCH V6 07/10] io_uring: switch to user_worker
>
> should better explain the reason for the switch.

I was waiting on Jens to give his review since io_uring is currently the
biggest users of this before the vhost switch.

Christian