Re: [RFC PATCH V2 1/2] swiotlb: Add Child IO TLB mem support

From: hch@xxxxxx
Date: Tue May 31 2022 - 03:16:56 EST


On Mon, May 30, 2022 at 01:52:37AM +0000, Michael Kelley (LINUX) wrote:
> B) The contents of the memory buffer must transition between
> encrypted and not encrypted. The hardware doesn't provide
> any mechanism to do such a transition "in place". The only
> way to transition is for the CPU to copy the contents between
> an encrypted area and an unencrypted area of memory.
>
> Because of (B), we're stuck needing bounce buffers. There's no
> way to avoid them with the current hardware. Tianyu also pointed
> out not wanting to expose uninitialized guest memory to the host,
> so, for example, sharing a read buffer with the host requires that
> it first be initialized to zero.

Ok, B is a deal breaker. I just brought this in because I've received
review comments that state bouncing is just the easiest option for
now and we could map it into the hypervisor in the future. But at
least for SEV that does not seem like an option without hardware
changes.
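
To make (B) and the zeroing requirement concrete, here is a minimal
userspace sketch of the copy a bounce buffer performs. It is an
illustration under stated assumptions, not kernel code: struct
shared_slot, bounce_map() and bounce_unmap() are hypothetical names, and
an ordinary array stands in for the unencrypted, host-visible memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SLOT_SIZE 64 /* hypothetical slot size for this sketch */

enum bounce_dir { TO_DEVICE, FROM_DEVICE };

/* Stand-in for one slot of unencrypted memory that the host can see. */
struct shared_slot {
	unsigned char data[SLOT_SIZE];
};

/*
 * Prepare the shared slot before the host is allowed to touch it.
 * The slot is always zeroed first so that stale guest data is never
 * exposed; for TO_DEVICE the private payload is then copied in.
 */
static int bounce_map(struct shared_slot *slot, const void *priv,
		      size_t len, enum bounce_dir dir)
{
	assert(slot && priv);
	if (len > SLOT_SIZE)
		return -1;
	memset(slot->data, 0, SLOT_SIZE);
	if (dir == TO_DEVICE)
		memcpy(slot->data, priv, len);
	return 0;
}

/* After the host is done, copy results back into private memory. */
static int bounce_unmap(const struct shared_slot *slot, void *priv,
			size_t len, enum bounce_dir dir)
{
	assert(slot && priv);
	if (len > SLOT_SIZE)
		return -1;
	if (dir == FROM_DEVICE)
		memcpy(priv, slot->data, len);
	return 0;
}
```

The CPU copy in both directions is the cost that cannot be avoided as
long as the hardware cannot flip a page between encrypted and
unencrypted in place.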

> We should reset and make sure we agree on the top-level approach.
> 1) We want general scalability improvements to swiotlb. These
> improvements should scale to high CPU counts (> 100) and to
> multiple NUMA nodes.
> 2) Drivers should not require any special knowledge of swiotlb to
> benefit from the improvements. No new swiotlb APIs should
> need to be used by drivers -- the swiotlb scalability improvements
> should be transparent.
> 3) The scalability improvements should not be based on device
> boundaries since a single device may have multiple channels
> doing bounce buffering on multiple CPUs in parallel.

Agreed on all counts.

> The patch from Andi Kleen [3] (not submitted upstream, but referenced
> by Tianyu as the basis for his patches) seems like a good starting point
> for meeting the top-level approach.

Yes, I think doing per-cpu and/or per-node scaling sounds like the
right general approach. Why was this never sent out?
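
As a rough illustration of what per-cpu scaling could look like, the
sketch below splits one slot pool into areas, each with its own lock,
and picks the area from the allocating CPU rather than from the device.
This is not Andi's patch and not the kernel implementation: the names
(struct bounce_pool, pool_alloc_slot(), NAREAS) and the fallback policy
are assumptions made for the example, and a C11 atomic_flag stands in
for a kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NAREAS         4 /* hypothetical: e.g. one area per CPU or node */
#define SLOTS_PER_AREA 8

struct bounce_area {
	atomic_flag lock; /* per-area lock instead of one global lock */
	bool used[SLOTS_PER_AREA];
};

struct bounce_pool {
	struct bounce_area areas[NAREAS];
};

static void pool_init(struct bounce_pool *pool)
{
	for (size_t a = 0; a < NAREAS; a++) {
		atomic_flag_clear(&pool->areas[a].lock);
		for (size_t s = 0; s < SLOTS_PER_AREA; s++)
			pool->areas[a].used[s] = false;
	}
}

/* Take the first free slot in one area, or return -1 if it is full. */
static int area_alloc(struct bounce_area *area)
{
	int slot = -1;

	while (atomic_flag_test_and_set(&area->lock))
		; /* spin until the area lock is free */
	for (int s = 0; s < SLOTS_PER_AREA; s++) {
		if (!area->used[s]) {
			area->used[s] = true;
			slot = s;
			break;
		}
	}
	atomic_flag_clear(&area->lock);
	return slot;
}

/*
 * Allocate from the caller's home area first and fall back to the
 * other areas only when it is full. Returns a pool-wide slot index,
 * or -1 if every area is exhausted.
 */
static int pool_alloc_slot(struct bounce_pool *pool, unsigned int cpu)
{
	unsigned int home = cpu % NAREAS;

	for (unsigned int i = 0; i < NAREAS; i++) {
		unsigned int a = (home + i) % NAREAS;
		int slot = area_alloc(&pool->areas[a]);

		if (slot >= 0)
			return (int)(a * SLOTS_PER_AREA) + slot;
	}
	return -1;
}

static void pool_free_slot(struct bounce_pool *pool, int index)
{
	assert(index >= 0 && index < NAREAS * SLOTS_PER_AREA);
	struct bounce_area *area = &pool->areas[index / SLOTS_PER_AREA];

	while (atomic_flag_test_and_set(&area->lock))
		; /* spin until the area lock is free */
	area->used[index % SLOTS_PER_AREA] = false;
	atomic_flag_clear(&area->lock);
}
```

Because the area is chosen from the CPU, a single device with several
channels running on different CPUs spreads across areas on its own, and
drivers never see any of this.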

> Andi and Robin had some
> back-and-forth about Andi's patch that I haven't delved into yet, but
> getting that worked out seems like a better overall approach. I had
> an offline chat with Tianyu, and he would agree as well.

Where was this discussion?
