Re: [PATCH v14 1/3] virt/coco/sev-guest: Add throttling awareness

From: Tom Lendacky
Date: Mon Feb 13 2023 - 16:43:38 EST


On 2/13/23 11:25, Dionna Glaze wrote:
The host is permitted and encouraged to throttle guest requests to the
AMD-SP since it is a shared resource across all VMs. Without
throttling awareness, any error returned by the host immediately locks
out access to the VMPCK, which makes the VM less useful as it can no
longer attest itself. Since throttling is expected from a host protecting
itself against an uncooperative guest, a cooperative host can return a
VMM error code indicating that the request was throttled.

The driver interprets the upper 32 bits of exitinfo2 as a VMM error code.
For safety, since the encryption algorithm in GHCBv2 is AES_GCM, control
must remain in the kernel to complete the request with the current
sequence number. Returning without finishing the request would allow the
guest to issue another request with different message contents, but under
the same sequence number. That is IV reuse, which breaks the cryptographic
protections.
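
To make that concrete: the AEAD IV is derived from the per-VMPCK message
sequence number, so encrypting a second, different payload under an
unchanged seqno reuses the IV. Roughly (a simplified sketch, not the
driver's exact code):

	u8 iv[GCM_AES_IV_SIZE] = {};

	/* The GCM IV is just the message sequence number from the header. */
	memcpy(iv, &hdr->msg_seqno,
	       min_t(size_t, sizeof(iv), sizeof(hdr->msg_seqno)));
	aead_request_set_crypt(req, src, dst, len, iv);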

A quick fix is to retry for a while and then disable the VMPCK and
return to user space.

A guest request may not make it to the AMD-SP before the host returns to
the guest, so the err local variable in handle_guest_request must be
initialized the same way fw_err is. Similarly, snp_issue_guest_request
should set fw_err whether or not the value is non-zero, so that a zero
value properly clears any stale error.
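
In outline (both pieces are in the diff below):

	/* handle_guest_request(): don't let a stale zero read as success. */
	unsigned long err = 0xff;

	/* snp_issue_guest_request(): always report the firmware value,
	 * including zero, so a previous error does not linger in fw_err. */
	*fw_err = ghcb->save.sw_exit_info_2;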

The IV reuse fix for invalid certs_len needs modification to work with
throttling, since that single retry with a modified exit_code may itself
be throttled and, with no further retry, result in a locked-out VMPCK.
Instead, change the exit_code as before but jump to the same retry label,
and handle the error code fixup by checking whether the exit_code had to
be changed.

Another issue that must be fixed is how crypto results are written to
shared memory. The solution is to double-buffer messages.

This should really be a new, separate patch.


The encryption algorithms read and write directly to shared unencrypted
memory, which may leak information and also permits the host to tamper
with message integrity. Instead, copy whole messages in or out as needed
before doing any computation on them.
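
In outline (names as in the diff below), the request path becomes:

	/* Encrypt into the private copy, then publish it to shared memory. */
	rc = enc_payload(snp_dev, seqno, msg_ver, type, req_buf, req_sz);
	memcpy(snp_dev->request, &snp_dev->secret_request,
	       sizeof(snp_dev->secret_request));

	rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);

	/*
	 * verify_and_dec_payload() first copies the shared response into
	 * snp_dev->secret_response and only then authenticates and decrypts.
	 */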

Cc: Tom Lendacky <Thomas.Lendacky@xxxxxxx>
Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
Cc: Joerg Roedel <jroedel@xxxxxxx>
Cc: Peter Gonda <pgonda@xxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Borislav Petkov <Borislav.Petkov@xxxxxxx>
Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
Cc: Venu Busireddy <venu.busireddy@xxxxxxxxxx>
Cc: Michael Roth <michael.roth@xxxxxxx>
Cc: "Kirill A. Shutemov" <kirill@xxxxxxxxxxxxx>
Cc: Michael Sterritt <sterritt@xxxxxxxxxx>

Fixes: d5af44dde546 ("x86/sev: Provide support for SNP guest request
NAEs")

This shouldn't line wrap.


Signed-off-by: Dionna Glaze <dionnaglaze@xxxxxxxxxx>
---
arch/x86/include/asm/sev-common.h | 3 +-
arch/x86/kernel/sev.c | 3 +-
drivers/virt/coco/sev-guest/sev-guest.c | 54 +++++++++++++++++++++----
3 files changed, 50 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index b8357d6ecd47..b63be696b776 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -128,8 +128,9 @@ struct snp_psc_desc {
struct psc_entry entries[VMGEXIT_PSC_MAX_ENTRY];
} __packed;
-/* Guest message request error code */
+/* Guest message request error codes */
#define SNP_GUEST_REQ_INVALID_LEN BIT_ULL(32)
+#define SNP_GUEST_REQ_ERR_BUSY BIT_ULL(33)
#define GHCB_MSR_TERM_REQ 0x100
#define GHCB_MSR_TERM_REASON_SET_POS 12
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 679026a640ef..a908ffc2dfba 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -2212,14 +2212,13 @@ int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, unsigned
if (ret)
goto e_put;
+ *fw_err = ghcb->save.sw_exit_info_2;
if (ghcb->save.sw_exit_info_2) {
/* Number of expected pages are returned in RBX */
if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST &&
ghcb->save.sw_exit_info_2 == SNP_GUEST_REQ_INVALID_LEN)
input->data_npages = ghcb_get_rbx(ghcb);
- *fw_err = ghcb->save.sw_exit_info_2;
-
ret = -EIO;
}
diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index 4ec4174e05a3..4945f2dd97a2 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -30,6 +30,7 @@
#define DEVICE_NAME "sev-guest"
#define AAD_LEN 48
#define MSG_HDR_VER 1
+#define ACCEPTABLE_REQUEST_RETRY_DURATION (60*HZ)
struct snp_guest_crypto {
struct crypto_aead *tfm;
@@ -43,7 +44,13 @@ struct snp_guest_dev {
void *certs_data;
struct snp_guest_crypto *crypto;
+ /* request and response are in unencrypted memory */
struct snp_guest_msg *request, *response;
+ /*
+ * Avoid information leakage by double-buffering shared messages
+ * in fields that are in regular encrypted memory.
+ */
+ struct snp_guest_msg secret_request, secret_response;
struct snp_secrets_page_layout *layout;
struct snp_req_data input;
u32 *os_area_msg_seqno;
@@ -263,14 +270,17 @@ static int dec_payload(struct snp_guest_dev *snp_dev, struct snp_guest_msg *msg,
static int verify_and_dec_payload(struct snp_guest_dev *snp_dev, void *payload, u32 sz)
{
struct snp_guest_crypto *crypto = snp_dev->crypto;
- struct snp_guest_msg *resp = snp_dev->response;
- struct snp_guest_msg *req = snp_dev->request;
+ struct snp_guest_msg *resp = &snp_dev->secret_response;
+ struct snp_guest_msg *req = &snp_dev->secret_request;
struct snp_guest_msg_hdr *req_hdr = &req->hdr;
struct snp_guest_msg_hdr *resp_hdr = &resp->hdr;
dev_dbg(snp_dev->dev, "response [seqno %lld type %d version %d sz %d]\n",
resp_hdr->msg_seqno, resp_hdr->msg_type, resp_hdr->msg_version, resp_hdr->msg_sz);
+ /* Copy response from shared memory to encrypted memory. */
+ memcpy(resp, snp_dev->response, sizeof(*resp));
+
/* Verify that the sequence counter is incremented by 1 */
if (unlikely(resp_hdr->msg_seqno != (req_hdr->msg_seqno + 1)))
return -EBADMSG;
@@ -294,7 +304,7 @@ static int verify_and_dec_payload(struct snp_guest_dev *snp_dev, void *payload,
static int enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, int version, u8 type,
void *payload, size_t sz)
{
- struct snp_guest_msg *req = snp_dev->request;
+ struct snp_guest_msg *req = &snp_dev->secret_request;
struct snp_guest_msg_hdr *hdr = &req->hdr;
memset(req, 0, sizeof(*req));
@@ -322,22 +332,34 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, in
u8 type, void *req_buf, size_t req_sz, void *resp_buf,
u32 resp_sz, __u64 *fw_err)
{
- unsigned long err;
+ unsigned long err = 0xff;
+ unsigned long start_time = jiffies;
+ u64 orig_exit_code = exit_code;
u64 seqno;
int rc;
+ unsigned int certs_npages = 0;
/* Get message sequence and verify that its a non-zero */
seqno = snp_get_msg_seqno(snp_dev);
if (!seqno)
return -EIO;
+ /* Clear shared memory's response for the host to populate. */
memset(snp_dev->response, 0, sizeof(struct snp_guest_msg));
- /* Encrypt the userspace provided payload */
+ /* Encrypt the userspace provided payload in snp_dev->secret_request. */
rc = enc_payload(snp_dev, seqno, msg_ver, type, req_buf, req_sz);
if (rc)
return rc;
+ /*
+ * Write the fully encrypted request to the shared unencrypted
+ * request page.
+ */
+ memcpy(snp_dev->request, &snp_dev->secret_request,
+ sizeof(snp_dev->secret_request));
+
+retry:
/*
* Call firmware to process the request. In this function the encrypted
* message enters shared memory with the host. So after this call the
@@ -346,6 +368,20 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, in
*/
rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
+ /*
+ * The host may return SNP_GUEST_REQ_ERR_BUSY if the request has been
+ * throttled. Retry in the driver to avoid returning and reusing the
+ * message sequence number on a different message.
+ */
+ if (err == SNP_GUEST_REQ_ERR_BUSY) {
+ if (jiffies - start_time > ACCEPTABLE_REQUEST_RETRY_DURATION) {
+ rc = -ETIMEDOUT;
+ goto disable_vmpck;
+ }
+ cond_resched();
+ goto retry;

It looks like you handle throttling by continually calling the hypervisor for up to 60 seconds; shouldn't there be a delay here?
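
Something along these lines, for example (purely illustrative; the sleep
interval is an arbitrary placeholder):

	if (err == SNP_GUEST_REQ_ERR_BUSY) {
		if (time_after(jiffies,
			       start_time + ACCEPTABLE_REQUEST_RETRY_DURATION)) {
			rc = -ETIMEDOUT;
			goto disable_vmpck;
		}
		/* Back off instead of immediately re-issuing the VMGEXIT. */
		schedule_timeout_interruptible(msecs_to_jiffies(100));
		goto retry;
	}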

+ }
+
/*
* If the extended guest request fails due to having too small of a
* certificate data buffer, retry the same guest request without the
@@ -354,7 +390,7 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, in
*/
if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST &&
err == SNP_GUEST_REQ_INVALID_LEN) {
- const unsigned int certs_npages = snp_dev->input.data_npages;
+ certs_npages = snp_dev->input.data_npages;
exit_code = SVM_VMGEXIT_GUEST_REQUEST;
@@ -366,8 +402,12 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, in
* of the VMPCK and the error code being propagated back to the
* user as an ioctl() return code.
*/
- rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
+ cond_resched();
+ goto retry;
+ }

Nit, add a blank line here.

Thanks,
Tom

+ if (orig_exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST &&
+ exit_code != orig_exit_code) {
/*
* Override the error to inform callers the given extended
* request buffer size was too small and give the caller the