Re: [PATCH] kvm: arm: Fix handling of stage2 huge mappings

From: Marc Zyngier
Date: Wed Mar 20 2019 - 06:35:21 EST


On Wed, 20 Mar 2019 10:23:39 +0000
Suzuki K Poulose <suzuki.poulose@xxxxxxx> wrote:

Hi Suzuki,

> Marc,
>
> On 20/03/2019 10:11, Marc Zyngier wrote:
> > On Wed, 20 Mar 2019 09:44:38 +0000
> > Suzuki K Poulose <suzuki.poulose@xxxxxxx> wrote:
> >
> >> Hi Marc,
> >>
> >> On 20/03/2019 08:15, Marc Zyngier wrote:
> >>> Hi Suzuki,
> >>>
> >>> On Tue, 19 Mar 2019 14:11:08 +0000,
> >>> Suzuki K Poulose <suzuki.poulose@xxxxxxx> wrote:
> >>>>
> >>>> We rely on the mmu_notifier call backs to handle the split/merge
> >>>> of huge pages and thus we are guaranteed that, while creating a
> >>>> block mapping, either the entire block is unmapped at stage2 or it
> >>>> is missing permission.
> >>>>
> >>>> However, we miss a case where the block mapping is split for dirty
> >>>> logging and could later be made a block mapping again, if we cancel
> >>>> the dirty logging. This not only creates inconsistent TLB entries for
> >>>> the pages in the block, but also leaks the table pages for the
> >>>> PMD level.
> >>>>
> >>>> Handle this corner case for the huge mappings at stage2 by
> >>>> unmapping the non-huge mapping for the block. This could potentially
> >>>> release the upper level table. So we need to restart the table walk
> >>>> once we unmap the range.
> >>>>
> >>>> Fixes: ad361f093c1e31d ("KVM: ARM: Support hugetlbfs backed huge pages")
> >>>> Reported-by: Zheng Xiang <zhengxiang9@xxxxxxxxxx>
> >>>> Cc: Zheng Xiang <zhengxiang9@xxxxxxxxxx>
> >>>> Cc: Zhengui Yu <yuzenghui@xxxxxxxxxx>
> >>>> Cc: Marc Zyngier <marc.zyngier@xxxxxxx>
> >>>> Cc: Christoffer Dall <christoffer.dall@xxxxxxx>
> >>>> Signed-off-by: Suzuki K Poulose ...
>
>
> >>>> + if (!pmd_thp_or_huge(old_pmd)) {
> >>>> + unmap_stage2_range(kvm, addr & S2_PMD_MASK, S2_PMD_SIZE);
> >>>> + goto retry;
> >>>
>
> >>>> + if (!stage2_pud_huge(kvm, old_pud)) {
> >>>> + unmap_stage2_range(kvm, addr & S2_PUD_MASK, S2_PUD_SIZE);
> >>>
>
> >> We should really get rid of the S2_P{U/M}D_* definitions, as they are
> >> always the same as the host. The only thing that changes is the PGD size
> >> which varies according to the IPA and the concatenation.
> >>
>
> Also, what do you think about using P{M,U}D_* instead of S2_P{M,U}D_*
> above? I could make that change with the respin.

Given that this is a fix, I'd like it to be as small and obvious as
possible, making it easier to backport.

I'm happy to take another patch for 5.2 that will drop the whole S2_P*
if we still think that this should be the case (though what I'd really
like is to have architectural levels instead of these arbitrary
definitions).
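
For anyone following the thread, the unmap-and-retry idea in the quoted
hunks can be modelled in a few lines of userspace C. This is only a toy
sketch under stated assumptions, not the kernel code: every toy_* and
TOY_* name is hypothetical, the "page table" is a flat array of PMD
slots, and the block size assumes 2MiB PMDs (4K pages). It shows why a
PMD that still points to a table has to be unmapped (freeing the table)
before the block entry is installed, and why the walk restarts after
that.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: all names are hypothetical, none are kernel APIs. */
#define TOY_PMD_SHIFT 21                          /* assumes 2MiB blocks */
#define TOY_PMD_SIZE  ((uint64_t)1 << TOY_PMD_SHIFT)
#define TOY_PMD_MASK  (~(TOY_PMD_SIZE - 1))
#define TOY_NR_PMDS   4

enum toy_pmd_kind { TOY_PMD_NONE, TOY_PMD_TABLE, TOY_PMD_BLOCK };

struct toy_pmd {
    enum toy_pmd_kind kind;
    int live_ptes;            /* page mappings below a TABLE entry */
};

struct toy_s2 {
    struct toy_pmd pmd[TOY_NR_PMDS];
    int tables_allocated;
    int tables_freed;
};

static struct toy_pmd *toy_pmd_for(struct toy_s2 *s2, uint64_t addr)
{
    return &s2->pmd[(addr >> TOY_PMD_SHIFT) % TOY_NR_PMDS];
}

/* Map one page; models the state left behind after a block was split. */
static void toy_map_page(struct toy_s2 *s2, uint64_t addr)
{
    struct toy_pmd *pmd = toy_pmd_for(s2, addr);

    if (pmd->kind == TOY_PMD_BLOCK)
        return;               /* already covered by a block mapping */
    if (pmd->kind == TOY_PMD_NONE) {
        pmd->kind = TOY_PMD_TABLE;
        pmd->live_ptes = 0;
        s2->tables_allocated++;
    }
    pmd->live_ptes++;
}

/* Unmap [start, start + size); frees any table found in the range. */
static void toy_unmap_range(struct toy_s2 *s2, uint64_t start, uint64_t size)
{
    uint64_t addr;

    for (addr = start; addr < start + size; addr += TOY_PMD_SIZE) {
        struct toy_pmd *pmd = toy_pmd_for(s2, addr);

        if (pmd->kind == TOY_PMD_TABLE)
            s2->tables_freed++;
        pmd->kind = TOY_PMD_NONE;
        pmd->live_ptes = 0;
    }
}

/* Install a block mapping; returns how many times the walk restarted. */
static int toy_set_pmd_huge(struct toy_s2 *s2, uint64_t addr)
{
    struct toy_pmd *pmd;
    int restarts = 0;

retry:
    pmd = toy_pmd_for(s2, addr);
    if (pmd->kind == TOY_PMD_TABLE) {
        /*
         * Overwriting the entry here would leak the table and leave
         * stale page-level mappings behind, so unmap the whole block
         * and walk again from the top.
         */
        toy_unmap_range(s2, addr & TOY_PMD_MASK, TOY_PMD_SIZE);
        restarts++;
        goto retry;
    }
    pmd->kind = TOY_PMD_BLOCK;
    return restarts;
}
```

In this model, mapping a couple of pages inside one block and then
calling toy_set_pmd_huge() on an address in that block takes exactly
one restart and frees exactly one table; calling it again on the
resulting block entry takes none.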

Thanks,

M.
--
Without deviation from the norm, progress is not possible.