Re: [RFC PATCH 1/1] x86/mm: Mark CoCo VM pages invalid while moving between private and shared

From: Edgecombe, Rick P
Date: Fri Sep 01 2023 - 12:34:34 EST


+Isaku

On Fri, 2023-09-01 at 14:44 +0000, Michael Kelley (LINUX) wrote:
> > Wait, since this does set_memory_np() as the first step for both
> > set_memory_encrypted() and set_memory_decrypted(), that pattern in
> > the callers wouldn't work. I wonder if it should try to rollback
> > itself if set_memory_np() fails (call set_memory_p() before
> > returning the error). At least that will handle failures that
> > happen on the guest side.
>
> Yes, I agree the error handling is very limited.  I'll try to make my
> patch cleanup properly if set_memory_np() fails as step 1.  In
> general, complete error cleanup on private <-> shared transitions
> looks to be pretty hard, and the original implementation obviously
> didn't deal with it.  For most of the steps in the sequence, a
> failure indicates something is pretty seriously broken with the CoCo
> aspects of the VM, and it's not clear that trying to clean up is
> likely to succeed or will make things any better.

Ah I see. Direct map split failures are not totally unexpected though,
so the kernel should be able to handle that somewhat, like it does in
other places where set_memory() is used. I also wonder if the VMM might
need to split the EPT/NPT and fail in the same way, which would be a
somewhat normal situation.
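
To make the rollback idea concrete, here is a rough userspace sketch, not
kernel code. Everything prefixed mock_ is a hypothetical stand-in that I made
up for illustration: the arrays model the present/shared state of a small
range, and the mock np step can be told to fail partway, the way a direct map
split could fail with -ENOMEM after some pages were already changed.

```c
#include <errno.h>
#include <stdbool.h>

#define NPAGES 4

/* Hypothetical model of a small range: not the real direct map. */
static bool page_present[NPAGES] = { true, true, true, true };
static bool page_shared[NPAGES];

/* Page index at which the mock "np" step fails; -1 means never fail. */
static int fail_np_at = -1;

/* Stand-in for set_memory_np(): may fail after clearing some pages. */
static int mock_set_memory_np(int numpages)
{
	for (int i = 0; i < numpages; i++) {
		if (i == fail_np_at)
			return -ENOMEM;	/* e.g. a page table split failed */
		page_present[i] = false;
	}
	return 0;
}

/* Stand-in for set_memory_p(): marks the whole range present again. */
static int mock_set_memory_p(int numpages)
{
	for (int i = 0; i < numpages; i++)
		page_present[i] = true;
	return 0;
}

/* Private -> shared transition that undoes step 1 if step 1 fails. */
static int mock_set_memory_decrypted(int numpages)
{
	int ret = mock_set_memory_np(numpages);

	if (ret) {
		/* Roll back so the caller still has usable private pages. */
		mock_set_memory_p(numpages);
		return ret;
	}

	/* Stands in for the hypervisor call and PTE attribute change. */
	for (int i = 0; i < numpages; i++)
		page_shared[i] = true;

	return mock_set_memory_p(numpages);
}
```

With this shape, a caller that sees an error from the first step gets back
memory that is still present and still private. The sketch only covers a
failure in step 1; it says nothing about failures in the later steps.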

And yes, I see that this is an existing problem, so I don't mean to
suggest it should hold up this improvement.

It seems there are three ongoing improvements on these operations:
- Handling load_unaligned_zeropad()
- Making it work with vmalloc
- Re-marking everything private when doing kexec

And now I'm adding a fourth, "lack of failure handling". The solutions
for each could affect the others, so I thought it might be worth
considering them together. I'm not very up to speed with the CoCo
specifics here though, so please take that part with a grain of salt.