Re: [PATCH v5 5/5] x86/kvm: Avoid dynamic allocation of pvclock data when SEV is active

From: Sean Christopherson
Date: Thu Sep 06 2018 - 11:54:55 EST


On Thu, Sep 06, 2018 at 05:19:38PM +0200, Borislav Petkov wrote:
> On Thu, Sep 06, 2018 at 07:56:40AM -0700, Sean Christopherson wrote:
> > Wouldn't that result in @hv_clock_boot being incorrectly freed in the
> > !SEV case?
>
> Ok, maybe I'm missing something but why do we need 4K per CPU? Why can't
> we map all those pages which contain the clock variable, decrypted in
> all guests' page tables?
>
> Basically
>
> (NR_CPUS * sizeof(struct pvclock_vsyscall_time_info)) / 4096
>
> pages.
>
> For the !SEV case then nothing changes.

The 4k per CPU refers to the dynamic allocation in Brijesh's original
patch. Currently, @hv_clock_boot is a single 4k page to limit the
amount of unused memory when 'nr_cpu_ids < NR_CPUS'. In the SEV case,
dynamically allocating one page at a time for each
'cpu >= HVC_BOOT_ARRAY_SIZE' means that each CPU burns a full 4k page
to store a single 32-byte variable, since decrypted mappings can only
be done at page granularity. My thought was that we could simply
define a second array for the SEV case, statically sized for NR_CPUS,
since __decrypted has a big chunk of memory that would be unused
anyway[1]. And since the second array is only used for SEV, it can be
freed if !SEV.
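
Something like the following is what I have in mind (rough sketch,
compile-tested only in my head; 'hv_clock_aux' and the
'hv_clock_per_cpu' per-cpu pointer are illustrative names, and this
assumes NR_CPUS > HVC_BOOT_ARRAY_SIZE):

  #define HVC_BOOT_ARRAY_SIZE \
          (PAGE_SIZE / sizeof(struct pvclock_vsyscall_time_info))

  /* Page-sized boot array, used by all guests. */
  static struct pvclock_vsyscall_time_info
          hv_clock_boot[HVC_BOOT_ARRAY_SIZE] __decrypted __aligned(PAGE_SIZE);

  /* SEV-only overflow array, statically sized for NR_CPUS. */
  static struct pvclock_vsyscall_time_info
          hv_clock_aux[NR_CPUS - HVC_BOOT_ARRAY_SIZE] __decrypted __aligned(PAGE_SIZE);

  static int kvmclock_setup_percpu(unsigned int cpu)
  {
          struct pvclock_vsyscall_time_info *p;

          /* Fall back to the aux array once the boot page is full. */
          if (cpu < HVC_BOOT_ARRAY_SIZE)
                  p = &hv_clock_boot[cpu];
          else
                  p = &hv_clock_aux[cpu - HVC_BOOT_ARRAY_SIZE];

          per_cpu(hv_clock_per_cpu, cpu) = p;
          return 0;
  }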

If we free the array explicitly then we don't need a second section or
attribute. My comments about __decrypted_exclusive were meant to point
out that if we did want to go with a second section/attribute, e.g. to
have a generic solution that could be reused for other things, then
we'd have more corner cases to deal with. I agree that simpler is
better, i.e. I'd vote for explicitly freeing the second array.
Apologies for not making that clear from the get-go.
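
E.g. something like this for the freeing (again just a sketch; the
initcall level and the helper name are illustrative, sev_active() and
free_reserved_area() are the existing kernel helpers):

  static int __init kvm_free_unused_decrypted(void)
  {
          /* hv_clock_aux is only consumed by SEV guests. */
          if (!sev_active())
                  free_reserved_area(hv_clock_aux,
                                     hv_clock_aux + ARRAY_SIZE(hv_clock_aux),
                                     -1, "kvmclock");
          return 0;
  }
  late_initcall(kvm_free_unused_decrypted);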

[1] An alternative solution would be to batch the dynamic allocations,
e.g. carve up one page at a time instead of allocating per CPU, but
that would probably require locking and be more complex (rough sketch
below).
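
For completeness, the batching idea would look something like this
(untested sketch, all names made up; note the mutex, and that an SEV
guest would still need to decrypt each newly allocated page):

  static DEFINE_MUTEX(hv_clock_mutex);
  static struct pvclock_vsyscall_time_info *hv_clock_cur_page;
  static unsigned int hv_clock_next_slot = HVC_BOOT_ARRAY_SIZE;

  static struct pvclock_vsyscall_time_info *alloc_pvti(void)
  {
          struct pvclock_vsyscall_time_info *p;

          mutex_lock(&hv_clock_mutex);
          if (!hv_clock_cur_page ||
              hv_clock_next_slot == HVC_BOOT_ARRAY_SIZE) {
                  /* Grab a fresh page and carve it into pvclock entries. */
                  hv_clock_cur_page = (void *)get_zeroed_page(GFP_KERNEL);
                  hv_clock_next_slot = 0;
                  /* An SEV guest would call set_memory_decrypted() here. */
          }
          p = hv_clock_cur_page ?
                  &hv_clock_cur_page[hv_clock_next_slot++] : NULL;
          mutex_unlock(&hv_clock_mutex);
          return p;
  }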