Re: [PATCH v1.1 1/2] x86/sev: Use per-CPU PSC structure in prep for unaccepted memory support

From: Tom Lendacky
Date: Wed Aug 03 2022 - 17:35:05 EST


On 8/3/22 16:18, Dave Hansen wrote:
> On 8/3/22 14:03, Tom Lendacky wrote:
>>> This whole iteration does look good to me versus the per-cpu version, so
>>> I say go ahead with doing this for v2 once you wait a bit for any more
>>> feedback.
>>
>> I'm still concerned about the whole spinlock and performance. What if I
>> reduce the number of entries in the PSC structure to, say, 64, which
>> reduces the size of the struct to 520 bytes. Any issue if that is put on
>> the stack, instead? It definitely makes things less complicated and
>> feels like a good compromise on the size vs the number of PSC VMGEXIT
>> requests.
>
> That would be fine too.

Ok.
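
Roughly what I have in mind (just a sketch, assuming the existing
psc_hdr/psc_entry layout in arch/x86/include/asm/sev-common.h stays as it
is: an 8-byte header plus 8 bytes per entry, so 8 + 64 * 8 = 520 bytes):

/*
 * Sketch only: entry count dropped from 253 to 64.  Relies on
 * linux/types.h for u16/u64 and linux/compiler_attributes.h for __packed.
 */
#define VMGEXIT_PSC_MAX_ENTRY		64

struct psc_hdr {
	u16 cur_entry;
	u16 end_entry;
	u32 reserved;
} __packed;				/* 8 bytes */

struct psc_entry {
	u64 cur_page	: 12,
	    gfn		: 40,
	    operation	: 4,
	    pagesize	: 1,
	    reserved	: 7;
} __packed;				/* 8 bytes */

struct snp_psc_desc {
	struct psc_hdr hdr;
	struct psc_entry entries[VMGEXIT_PSC_MAX_ENTRY];
} __packed;				/* 8 + 64 * 8 = 520 bytes */

The descriptor would then just be a local variable in the page state
change path (e.g. set_pages_state()) instead of a per-CPU buffer that
needs locking around it.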


> But, I doubt there will be any real performance issues coming out of
> this. As bad as this MSR thing is, I suspect it's not half as
> disastrous as the global spinlock in Kirill's patches.
>
> Also, private<->shared page conversions are *NOT* common from what I can
> tell. There are a few pages converted at boot, but most of the
> guest<->host communications are through the swiotlb pages, which are
> static.

Generally, that's true. But, e.g., a dma_alloc_coherent() actually doesn't
go through SWIOTLB, but instead allocates the pages and makes them shared,
which results in a page state change. The NVMe driver was calling that API
a lot. In this case, though, the NVMe driver was running in IRQ context
and set_memory_decrypted() could sleep, so an unencrypted DMA memory pool
was created to work around the sleeping issue and reduce the page state
changes. It's just things like that that make me wary.
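
To illustrate, a condensed sketch of that dma-direct behavior (not the
actual kernel/dma/direct.c code, and pre_decrypted_pool_alloc() is a
made-up stand-in for the kernel's atomic DMA pool, which is decrypted
once at pool creation time):

#include <linux/gfp.h>
#include <linux/io.h>
#include <linux/mm.h>
#include <linux/set_memory.h>
#include <linux/types.h>

/* Hypothetical helper standing in for the pre-decrypted atomic DMA pool. */
void *pre_decrypted_pool_alloc(size_t size, dma_addr_t *dma_handle, gfp_t gfp);

static void *coherent_alloc_sketch(size_t size, dma_addr_t *dma_handle,
				   gfp_t gfp)
{
	unsigned int order = get_order(size);
	struct page *page;
	void *vaddr;

	/*
	 * In IRQ/atomic context set_memory_decrypted() can't be used
	 * because it may sleep, so hand out memory that was already made
	 * shared up front: no page state change per allocation.
	 */
	if (!gfpflags_allow_blocking(gfp))
		return pre_decrypted_pool_alloc(size, dma_handle, gfp);

	page = alloc_pages(gfp, order);
	if (!page)
		return NULL;
	vaddr = page_address(page);

	/* Private -> shared conversion: this is where the PSC VMGEXITs happen. */
	if (set_memory_decrypted((unsigned long)vaddr, 1 << order)) {
		__free_pages(page, order);
		return NULL;
	}

	*dma_handle = page_to_phys(page);
	return vaddr;
}

So every blocking allocation of this shape is at least one page state
change, and a driver that allocates often will take that path often; the
pre-decrypted pool is what keeps the IRQ-context allocations from needing
a conversion (or a sleep) at all.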

Thanks,
Tom


> Are there other things that SEV uses this structure for that I'm
> missing?