Re: [PATCH 2/2] time/namespace: Forbid timens page faults under kthread_use_mm()

From: Jann Horn
Date: Tue Nov 29 2022 - 17:29:01 EST


On Tue, Nov 29, 2022 at 10:18 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> On Tue, Nov 29 2022 at 20:18, Jann Horn wrote:
>
> > find_timens_vvar_page() doesn't work when current's timens does not match
> > the timens associated with current->mm.
> > v6 of the series adding this code [1] had some complicated code to deal
> > with this case, but v7 [2] removed that.
> >
> > Since the vvar region is designed to only be accessed by vDSO code, and
> > vDSO code can't run in kthread context, it should be fine to error out in
> > this case.
>
> Should? Either it is correct or not.
>
> But the way more interesting question is:
>
> > struct page *find_timens_vvar_page(struct vm_area_struct *vma)
> > {
> > + /*
> > + * We can't handle faults where current's timens does not match the
> > + * timens associated with the mm_struct. This can happen if a page fault
> > + * occurs in a kthread that is using kthread_use_mm().
> > + */
>
> How does a kthread, which obvioulsy did kthread_use_mm(), end up trying to
> fault in the time namespace vvar page?

By doing copy_from_user()? That's what kthread_use_mm() is for, right?
If you look through the users of kthread_use_mm(), most of them use it
to be able to use the normal usercopy functions. See the users in usb
gadget code, and the VFIO code, and the AMD GPU code. And if you're
doing usercopy on userspace addresses, then you can basically always
hit a vvar page - even if you had somehow checked beforehand what the
address points to, userspace could have moved a vvar region in that
spot in the meantime.

That said, I haven't actually tried it. But I don't think there's
anything in the page fault handling path that distinguishes between
copy_from_user() faults in kthread context and other userspace faults
in a relevant way?

> It's probably something nasty, but the changelog has a big information
> void.
>
> It neither answers the obvious question why this is a problem of the
> time namespace vvar page and not a general issue versus a kthread, which
> borrowed a user mm, ending up in vdso_fault() in the first place?

Is it a problem if a kthread ends up in the other parts of
vdso_fault() or vvar_fault()? From what I can tell, nothing in there
except for the timens stuff is going to care whether it's hit from a
userspace fault or from a kthread.

Though, looking at it again now, I guess the `sym_offset ==
image->sym_vvar_page` path is also going to misbehave, so I guess we
could try to instead make the vdso/vvar fault handlers bail out in
kthread context for all the architectures, since we're only going to
hit that if userspace is deliberately doing something bad...

> None of those VDSO (user space) addresses are subject to be faulted in
> by anything else than the associated user space task(s).

Are you saying that it's not possible or that it doesn't happen when
userspace is well-behaved?