Re: [PATCH v3 03/25] x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()

From: Kai Huang
Date: Mon Mar 15 2021 - 03:13:22 EST


On Sat, 13 Mar 2021 12:45:53 +0200 Jarkko Sakkinen wrote:
> On Fri, Mar 12, 2021 at 01:21:54PM -0800, Sean Christopherson wrote:
> > On Thu, Mar 11, 2021, Kai Huang wrote:
> > > From: Jarkko Sakkinen <jarkko@xxxxxxxxxx>
> > >
> > > EREMOVE takes a page and removes any association between that page and
> > > an enclave. It must be run on a page before it can be added into
> > > another enclave. Currently, EREMOVE is run as part of pages being freed
> > > into the SGX page allocator. It is not expected to fail.
> > >
> > > KVM does not track how guest pages are used, which means that SGX
> > > virtualization use of EREMOVE might fail.
> > >
> > > Break out the EREMOVE call from the SGX page allocator. This will allow
> > > the SGX virtualization code to use the allocator directly. (SGX/KVM
> > > will also introduce a more permissive EREMOVE helper).
> > >
> > > Implement original sgx_free_epc_page() as sgx_encl_free_epc_page() to be
> > > more specific that it is used to free EPC page assigned to one enclave.
> > > Print an error message when EREMOVE fails to explicitly call out EPC
> > > page is leaked, and requires machine reboot to get leaked pages back.
> > >
> > > Signed-off-by: Jarkko Sakkinen <jarkko@xxxxxxxxxx>
> > > Co-developed-by: Kai Huang <kai.huang@xxxxxxxxx>
> > > Acked-by: Jarkko Sakkinen <jarkko@xxxxxxxxxx>
> > > Signed-off-by: Kai Huang <kai.huang@xxxxxxxxx>
> > > ---
> > > v2->v3:
> > >
> > > - Fixed bug during copy/paste which results in SECS page and va pages are not
> > > correctly freed in sgx_encl_release() (sorry for the mistake).
> > > - Added Jarkko's Acked-by.
> >
> > That Acked-by should either be dropped or moved above Co-developed-by to make
> > checkpatch happy.
> >
> > Reviewed-by: Sean Christopherson <seanjc@xxxxxxxxxx>
>
> Oops, my bad. Yup, ack should be removed.
>
> /Jarkko

Hi Jarkko,

Your reply to the cover letter, where you raised your concern about this patch:

https://lore.kernel.org/lkml/YEkJXu262YDa8ZaK@xxxxxxxxxx/

reminded me to do more sanity checking of whether removing EREMOVE from
sgx_free_epc_page() impacts other code paths, and I think sgx_encl_release()
is not the only place that should be changed:

- sgx_encl_shrink() needs to call sgx_encl_free_epc_page(), since the VA page
can already be valid when it is called: other failures can also trigger
sgx_encl_shrink().

- sgx_encl_add_page() should call sgx_encl_free_epc_page() at the
"err_out_free:" label, since the EPC page can already be valid when the error
happens, i.e. when EEXTEND fails.

Other places should be OK per my check, but I'd prefer to just replace all
sgx_free_epc_page() call sites in the driver with sgx_encl_free_epc_page(),
with one exception: sgx_alloc_va_page(), which calls sgx_free_epc_page() when
EPA fails, in which case EREMOVE is definitely not required.
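
To make the distinction concrete, here is a rough userspace model of the two
free paths (a sketch only, not the actual kernel code). The struct layout, the
"busy" flag and mock_eremove() are hypothetical stand-ins invented for
illustration; only the names sgx_free_epc_page() and sgx_encl_free_epc_page()
come from this thread, and their real signatures may differ.

```c
/*
 * Userspace model only, NOT kernel code. struct epc_page and
 * mock_eremove() are hypothetical; they only mimic the behaviour
 * discussed in this thread.
 */
#include <stdbool.h>
#include <stdio.h>

struct epc_page {
	bool valid;	/* page is still associated with an enclave */
	bool busy;	/* hypothetical: forces the mock EREMOVE to fail */
	bool in_pool;	/* page has been returned to the allocator */
};

/* Hypothetical stand-in for EREMOVE: 0 on success, non-zero on failure. */
static int mock_eremove(struct epc_page *page)
{
	if (page->busy)
		return 1;

	page->valid = false;
	return 0;
}

/* Plain allocator free: no EREMOVE, caller guarantees the page is clean. */
static void sgx_free_epc_page(struct epc_page *page)
{
	page->in_pool = true;
}

/* Driver wrapper: EREMOVE first, and leak the page if EREMOVE fails. */
static void sgx_encl_free_epc_page(struct epc_page *page)
{
	int ret = mock_eremove(page);

	if (ret) {
		fprintf(stderr,
			"EREMOVE returned %d, EPC page leaked, reboot required\n",
			ret);
		return;
	}

	sgx_free_epc_page(page);
}
```

In this model, a page that never became valid (the EPA failure case in
sgx_alloc_va_page()) can go straight to sgx_free_epc_page(), while any page
that may already be valid has to go through sgx_encl_free_epc_page(), otherwise
it would land back in the pool still associated with an enclave.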

What do you think?

Btw, introducing a driver wrapper around sgx_free_epc_page() does make sense to
me, because virtualization has a counterpart in sgx/virt.c too.