Re: [PATCH 03/27] drm/i915/gvt: Incorporate KVM memslot info into check for 2MiB GTT entry

From: Yan Zhao
Date: Thu Jan 19 2023 - 23:33:49 EST


On Thu, Jan 19, 2023 at 10:58:42AM +0800, Zhenyu Wang wrote:
> On 2023.01.11 17:55:04 +0000, Sean Christopherson wrote:
> > On Mon, Jan 09, 2023, Yan Zhao wrote:
> > > On Fri, Jan 06, 2023 at 11:01:53PM +0000, Sean Christopherson wrote:
> > > > On Fri, Jan 06, 2023, Yan Zhao wrote:
> > > > > On Thu, Jan 05, 2023 at 05:40:32PM +0000, Sean Christopherson wrote:
> > > > > > On Thu, Jan 05, 2023, Yan Zhao wrote:
> > > > > > I'm totally fine if KVMGT's ABI is that VFIO is the source of truth for mappings
> > > > > > and permissions, and that the only requirement for KVM memslots is that GTT page
> > > > > > tables need to be visible in KVM's memslots. But if that's the ABI, then
> > > > > > intel_gvt_is_valid_gfn() should be probing VFIO, not KVM (commit cc753fbe1ac4
> > > > > > ("drm/i915/gvt: validate gfn before set shadow page entry").
> > > > > >
> > > > > > In other words, pick either VFIO or KVM. Checking that X is valid according to
> > > > > > KVM and then mapping X through VFIO is confusing and makes assumptions about how
> > > > > > userspace configures KVM and VFIO. It works because QEMU always configures KVM
> > > > > > and VFIO as expected, but IMO it's unnecessarily fragile and again confusing for
> > > > > > unaware readers because the code is technically flawed.
> > > > > >
> > > > > Agreed.
> > > > > Then after some further thought, I think maybe we can just remove
> > > > > intel_gvt_is_valid_gfn() in KVMGT, because
> > > > >
> > > > > (1) both intel_gvt_is_valid_gfn() in emulate_ggtt_mmio_write() and
> > > > > ppgtt_populate_spt() are not for page track purpose, but to validate bogus
> > > > > GFN.
> > > > > (2) gvt_pin_guest_page() with gfn and size can do the validity checking,
> > > > > which is called in intel_gvt_dma_map_guest_page(). So, we can move the
> > > > > mapping of scratch page to the error path after intel_gvt_dma_map_guest_page().
> > > >
> > > > IIUC, that will re-introduce the problem commit cc753fbe1ac4 ("drm/i915/gvt: validate
> > > > gfn before set shadow page entry") solved by poking into KVM. Lack of pre-validation
> > > > means that bogus GFNs will trigger error messages, e.g.
> > > >
> > > > gvt_vgpu_err("vfio_pin_pages failed for iova %pad, ret %d\n",
> > > > &cur_iova, ret);
> > > >
> > > > and
> > > >
> > > > gvt_vgpu_err("fail to populate guest ggtt entry\n");
> > >
> > > Thanks for pointing it out.
> > > I checked this commit message and found below original intentions to introduce
> > > pre-validation:
> >
> > ...
> >
> > > (I actually found that the original code will print "invalid entry type"
> > > warning which indicates it's broken for a while due to lack of test in
> > > this invalid gfn path)
> > >
> > >
> > > > One thought would be to turn those printks into tracepoints to eliminate unwanted
> > > > noise, and to prevent the guest from spamming the host kernel log by programming
> > > > garbage into the GTT (gvt_vgpu_err() isn't ratelimited).
> > > As those printks would not happen in normal conditions and printks may have
> > > some advantages to discover the attack or bug, could we just convert
> > > gvt_vgpu_err() to be ratelimited ?
> >
> > That's ultimately a decision that needs to be made by the GVT maintainers, as the
> > answer depends on the use case. E.g. if most users of KVMGT run a single VM and
> > the guest user is also the host admin, then pr_err_ratelimited() is likely an
> > acceptable/preferable choice as there's a decent chance a human will see the errors
> > in the host kernel logs and be able to take action.
> >
> > But if there's unlikely to be a human monitoring the host logs, and/or the guest
> > user is unrelated to the host admin, then a ratelimited printk() is less useful.
> > E.g. if there's no one monitoring the logs, then losing messages due to
> > ratelimiting provides a worse debug experience overall than having to manually
> > enable tracepoints. And if there may be many tens of VMs (seems unlikely?), then
> > ratelimited printk() is even less useful because errors for a specific VM may be
> > lost, i.e. the printk() can't be relied upon in any way to detect issues.
> >
> > FWIW, in KVM proper, use of printk() to capture guest "errors" is strongly discourage
> > for exactly these reasons.
>
> Current KVMGT usage is mostly in controlled mode, either user is own host admin,
> or host admin would pre-configure specific limited number of VMs for KVMGT use.
> I think printk on error should be fine, we don't need rate limit, and adding
> extra trace monitor for admin might not be necessary. So I'm towards to keep to
> use current error message.
>

Thanks, Sean and Zhenyu.
So, could I just post the final fix as below?
And, Sean, would you like to include it in this series or should I send it out
first?