Re: [PATCH v7 27/28] iommu/tegra241-cmdqv: Add user-space use support
From: Nicolin Chen
Date: Tue Jul 01 2025 - 16:23:52 EST
On Tue, Jul 01, 2025 at 08:03:35PM +0000, Pranjal Shrivastava wrote:
> On Tue, Jul 01, 2025 at 12:42:32PM -0700, Nicolin Chen wrote:
> > On Tue, Jul 01, 2025 at 04:02:35PM +0000, Pranjal Shrivastava wrote:
> > > On Thu, Jun 26, 2025 at 12:34:58PM -0700, Nicolin Chen wrote:
> > > > +/**
> > > > + * struct tegra241_vintf_sid - Virtual Interface Stream ID Replacement
> > > > + * @core: Embedded iommufd_vdevice structure, holding virtual Stream ID
> > > > + * @vintf: Parent VINTF pointer
> > > > + * @sid: Physical Stream ID
> > > > + * @idx: Replacement index in the VINTF
> > > > + */
> > > > +struct tegra241_vintf_sid {
> > > > + struct iommufd_vdevice core;
> > > > + struct tegra241_vintf *vintf;
> > > > + u32 sid;
> > > > + u8 idx;
> > > > };
> > >
> > > AFAIU, This seems to be a handle for sid -> vintf mapping.. it yes, then
> > > I'm not sure if "Virtual Interface Stream ID Replacement" clarifies that?
> >
> > No. It's for vSID to pSID mappings. I had it explained in commit log:
> >
>
> I get that, it's for vSID -> pSID mapping which also "happens to" point
> to the vintf.. all I wanted to say was that the description is unclear..
> We could've described it as "Vintf SID map" or something, but I guess
> it's fine the way it is too.. your call.
The "replace" word is borrowed from the "SID_REPLACE" HW register.
But I think it's okay to call it just "mapping", if that makes it
clearer.
> > > > +static struct iommufd_viommu_ops tegra241_cmdqv_viommu_ops = {
> > > > + .destroy = tegra241_cmdqv_destroy_vintf_user,
> > > > + .alloc_domain_nested = arm_vsmmu_alloc_domain_nested,
> > > > + .cache_invalidate = arm_vsmmu_cache_invalidate,
> > >
> > > I see that we currently use the main cmdq to issue these cache
> > > invalidations (there's a FIXME in arm_vsmmu_cache_invalidate). I was
> > > hoping for this series to change that but I'm assuming there's another
> > > series coming for that?
> > >
> > > Meanwhile, I guess it'd be good to call that out for folks who have
> > > Grace and start trying out this feature.. I'm assuming they won't see
> > > as much perf improvement with this series alone since we're still using
> > > the main CMDQ in the upstream code?
> >
> > VCMDQ only accelerates invalidation commands.
> >
>
> I get that.. but I see we're using `arm_vsmmu_cache_invalidate` here
> from arm-smmu-v3-iommufd.c which seems to issue all commands to
> smmu->cmdq as of now (the code has a FIXME as well), per the code:
>
> /* FIXME always uses the main cmdq rather than trying to group by type */
> ret = arm_smmu_cmdq_issue_cmdlist(smmu, &smmu->cmdq, last->cmd,
> cur - last, true);
>
> I was hoping this FIXME to be addressed in this series..
Oh, that's not related.
The main goal of this series is to route all invalidation commands
to the VCMDQ HW. And this is where Grace users can see perf gains
mentioned in the cover letter or commit log, from eliminating the
VM Exits at those most frequently used commands.
Any non-invalidation commands will just reuse what we have with the
standard SMMU nesting. And even if we did something to that FIXME,
there is no significant perf gain as it's going down the trapping
pathway, i.e. the VM Exits are always there.
> > That is for non-invalidation commands that VCMDQ doesn't support,
> > so they still have to go in the standard nesting pathway.
> >
> > Let's add a line:
> > /* for non-invalidation commands use */
>
> Umm.. I was talking about the cache_invalidate op? I think there's some
> misunderstanding here? What am I missing?
That line is exactly for cache_invalidate. All the non-invalidation
commands will be sent to he arm_vsmmu_cache_invalidate() by VMM, as
it means.
Or perhaps calling them "non-accelerated commands" would be nicer.
Thanks
Nicolin