Re: [PATCH] iommu/vt-d: Fix a bug in intel_iommu_iova_to_phys() for huge page

From: Lu Baolu
Date: Wed Feb 19 2020 - 21:35:23 EST


Hi Yonghyun,

On 2020/2/20 8:12, Yonghyun Hwang wrote:
Hello Baolu,

Yes, pfn_to_dma_pte() doesn't put 0 for level. However, it's hard for
me to get that without comment, which is the reason why I put a dummy
"if" there. Anyway, what about the following code?

static inline unsigned long offset_mask(int level)
{
return BIT_MASK(level_to_offset_bits(level) + VTD_PAGE_SHIFT) - 1;
}

This helper is unnecessary since it's only for the iova_to_phys(). I'd
like to put it in the code directly. Other looks good to me.

By the way, can you please rebase your patch on the top of the mainline
kernel tree and add a patch version if you are willing to send a new
version of this patch?

Best regards,
baolu


static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
dma_addr_t iova)
{
struct dmar_domain *dmar_domain = to_dmar_domain(domain);
struct dma_pte *pte;
int level = 0;
u64 phys = 0;

if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN)
return 0;

pte = pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT, &level);
if (pte)
phys = dma_pte_addr(pte) + (iova & offset_mask(level));

return phys;
}

On Wed, Feb 19, 2020 at 3:33 PM Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx> wrote:

Hi,

On 2020/2/20 2:51, Yonghyun Hwang wrote:
Hello Lu and Moritz,

I think it's better to keep "if (level > 1)" because
level_to_offset_bits() returns (unsigned int) -1 * LEVEL_STRIDE if level
becomes 0.

@level will never be 0 in this case.

Best regards,
baolu


static inline unsigned int level_to_offset_bits(int level)
{
return (level - 1) * LEVEL_STRIDE;
}

On Tue, Feb 18, 2020 at 8:42 PM Moritz Fischer <mdf@xxxxxxxxxx
<mailto:mdf@xxxxxxxxxx>> wrote:

Hi Baolu, Yonghyun

On Wed, Feb 19, 2020 at 11:15:36AM +0800, Lu Baolu wrote:
> Hi Yonghyun,
>
> Thanks for the patch.
>
> On 2020/2/19 6:23, Yonghyun Hwang wrote:
> > intel_iommu_iova_to_phys() has a bug when it translates an IOVA
for a huge
> > page onto its corresponding physical address. This commit fixes
the bug by
> > accomodating the level of page entry for the IOVA and adds
IOVA's lower
> > address to the physical address. >
> > Signed-off-by: Yonghyun Hwang <yonghyun@xxxxxxxxxx
<mailto:yonghyun@xxxxxxxxxx>>
> > ---
> > drivers/iommu/intel-iommu.c | 11 +++++++++--
> > 1 file changed, 9 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/iommu/intel-iommu.c
b/drivers/iommu/intel-iommu.c
> > index 0c8d81f56a30..ed6e69adb578 100644
> > --- a/drivers/iommu/intel-iommu.c
> > +++ b/drivers/iommu/intel-iommu.c
> > @@ -5555,13 +5555,20 @@ static phys_addr_t
intel_iommu_iova_to_phys(struct iommu_domain *domain,
> > struct dma_pte *pte;
> > int level = 0;
> > u64 phys = 0;
> > + const unsigned long pfn = iova >> VTD_PAGE_SHIFT;
>
> Why do you need a "const unsigned long" here?



It's because pfn_to_dma_pte() takes pfn for its second parameter as
"unsigned int" type and pfn is not changed in intel_iommu_iova_to_phys().

>
> > if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN)
> > return 0;
> > - pte = pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT,
&level);
> > - if (pte)
> > + pte = pfn_to_dma_pte(dmar_domain, pfn, &level);
> > + if (pte) {
> > phys = dma_pte_addr(pte);
> > + if (level > 1)
> > + phys += (pfn &
> > + ((1UL <<
level_to_offset_bits(level)) - 1))
> > + << VTD_PAGE_SHIFT;
> > + phys += iova & (VTD_PAGE_SIZE - 1);
>
> How about

>
> diff --git a/drivers/iommu/intel-iommu.c
b/drivers/iommu/intel-iommu.c
> index 9dc37672bf89..bd17c2510bb2 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -5693,8 +5693,14 @@ static phys_addr_t
intel_iommu_iova_to_phys(struct
> iommu_domain *domain,
> u64 phys = 0;
>
> pte = pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT,
&level);
> - if (pte)
> + if (pte) {
> + unsigned long offset_mask;
> +
> + offset_mask = BIT_MASK(level_to_offset_bits(level) +
> + VTD_PAGE_SHIFT) - 1;
> phys = dma_pte_addr(pte);
> + phys += iova & (bitmask - 1);
Did you mean:

phys += iova & (offset_mask - 1);

> + }

At that point why not define a helper:

static inline unsigned long offset_mask(int level)
{
return BIT_MASK(level_to_offset_bits(level) +
VTD_PAGE_SHIFT) - 1;
}

At which point the above would reduce to:

if (pte)
phys = dma_pte_addr(pte) + iova & offset_mask(level) - 1;



Let me introduce the helper function and upstream the patch again. Thank
you Lu and Moritz.


I might've fatfingered something here, but you get the idea :)

Cheers,
Moritz


On Tue, Feb 18, 2020 at 8:42 PM Moritz Fischer <mdf@xxxxxxxxxx
<mailto:mdf@xxxxxxxxxx>> wrote:

Hi Baolu, Yonghyun

On Wed, Feb 19, 2020 at 11:15:36AM +0800, Lu Baolu wrote:
> Hi Yonghyun,
>
> Thanks for the patch.
>
> On 2020/2/19 6:23, Yonghyun Hwang wrote:
> > intel_iommu_iova_to_phys() has a bug when it translates an IOVA
for a huge
> > page onto its corresponding physical address. This commit fixes
the bug by
> > accomodating the level of page entry for the IOVA and adds
IOVA's lower
> > address to the physical address. >
> > Signed-off-by: Yonghyun Hwang <yonghyun@xxxxxxxxxx
<mailto:yonghyun@xxxxxxxxxx>>
> > ---
> > drivers/iommu/intel-iommu.c | 11 +++++++++--
> > 1 file changed, 9 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/iommu/intel-iommu.c
b/drivers/iommu/intel-iommu.c
> > index 0c8d81f56a30..ed6e69adb578 100644
> > --- a/drivers/iommu/intel-iommu.c
> > +++ b/drivers/iommu/intel-iommu.c
> > @@ -5555,13 +5555,20 @@ static phys_addr_t
intel_iommu_iova_to_phys(struct iommu_domain *domain,
> > struct dma_pte *pte;
> > int level = 0;
> > u64 phys = 0;
> > + const unsigned long pfn = iova >> VTD_PAGE_SHIFT;
>
> Why do you need a "const unsigned long" here?
>
> > if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN)
> > return 0;
> > - pte = pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT,
&level);
> > - if (pte)
> > + pte = pfn_to_dma_pte(dmar_domain, pfn, &level);
> > + if (pte) {
> > phys = dma_pte_addr(pte);
> > + if (level > 1)
> > + phys += (pfn &
> > + ((1UL <<
level_to_offset_bits(level)) - 1))
> > + << VTD_PAGE_SHIFT;
> > + phys += iova & (VTD_PAGE_SIZE - 1);
>
> How about

>
> diff --git a/drivers/iommu/intel-iommu.c
b/drivers/iommu/intel-iommu.c
> index 9dc37672bf89..bd17c2510bb2 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -5693,8 +5693,14 @@ static phys_addr_t
intel_iommu_iova_to_phys(struct
> iommu_domain *domain,
> u64 phys = 0;
>
> pte = pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT,
&level);
> - if (pte)
> + if (pte) {
> + unsigned long offset_mask;
> +
> + offset_mask = BIT_MASK(level_to_offset_bits(level) +
> + VTD_PAGE_SHIFT) - 1;
> phys = dma_pte_addr(pte);
> + phys += iova & (bitmask - 1);
Did you mean:

phys += iova & (offset_mask - 1);

> + }

At that point why not define a helper:

static inline unsigned long offset_mask(int level)
{
return BIT_MASK(level_to_offset_bits(level) +
VTD_PAGE_SHIFT) - 1;
}

At which point the above would reduce to:

if (pte)
phys = dma_pte_addr(pte) + iova & offset_mask(level) - 1;

I might've fatfingered something here, but you get the idea :)

Cheers,
Moritz