Re: [PATCH] s390/vfio-ap: handle response code 01 on queue reset
From: Tony Krowiak
Date: Mon Dec 04 2023 - 09:54:25 EST
On 11/29/23 12:12, Christian Borntraeger wrote:
Am 29.11.23 um 15:35 schrieb Tony Krowiak:
In the current implementation, response code 01 (AP queue number not
valid)
is handled as a default case along with other response codes returned
from
a queue reset operation that are not handled specifically. Barring a bug,
response code 01 will occur only when a queue has been externally removed
from the host's AP configuration; nn this case, the queue must
be reset by the machine in order to avoid leaking crypto data if/when the
queue is returned to the host's configuration. The response code 01 case
will be handled specifically by logging a WARN message followed by
cleaning
up the IRQ resources.
To me it looks like this can be triggered by the LPAR admin, correct? So it
is not desireable but possible.
In that case I prefer to not use WARN, maybe use dev_warn or dev_err
instead.
WARN can be a disruptive event if panic_on_warn is set.
Yes, it can be triggered by the LPAR admin. I can't use dev_warn here
because we don't have a reference to any device, but I can use pr_warn
if that suffices.
Signed-off-by: Tony Krowiak <akrowiak@xxxxxxxxxxxxx>
---
drivers/s390/crypto/vfio_ap_ops.c | 31 +++++++++++++++++++++++++++++++
1 file changed, 31 insertions(+)
diff --git a/drivers/s390/crypto/vfio_ap_ops.c
b/drivers/s390/crypto/vfio_ap_ops.c
index 4db538a55192..91d6334574d8 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -1652,6 +1652,21 @@ static int apq_status_check(int apqn, struct
ap_queue_status *status)
* a value indicating a reset needs to be performed again.
*/
return -EAGAIN;
+ case AP_RESPONSE_Q_NOT_AVAIL:
+ /*
+ * This response code indicates the queue is not available.
+ * Barring a bug, response code 01 will occur only when a queue
+ * has been externally removed from the host's AP configuration;
+ * in which case, the queue must be reset by the machine in
+ * order to avoid leaking crypto data if/when the queue is
+ * returned to the host's configuration. In this case, let's go
+ * ahead and log a warning message and return 0 so the AQIC
+ * resources get cleaned up by the caller.
+ */
+ WARN(true,
+ "Unable to reset queue %02x.%04x: not in host AP
configuration\n",
+ AP_QID_CARD(apqn), AP_QID_QUEUE(apqn));
+ return 0;
default:
WARN(true,
"failed to verify reset of queue %02x.%04x: TAPQ rc=%u\n",
@@ -1736,6 +1751,22 @@ static void vfio_ap_mdev_reset_queue(struct
vfio_ap_queue *q)
q->reset_status.response_code = 0;
vfio_ap_free_aqic_resources(q);
break;
+ case AP_RESPONSE_Q_NOT_AVAIL:
+ /*
+ * This response code indicates the queue is not available.
+ * Barring a bug, response code 01 will occur only when a queue
+ * has been externally removed from the host's AP configuration;
+ * in which case, the queue must be reset by the machine in
+ * order to avoid leaking crypto data if/when the queue is
+ * returned to the host's configuration. In this case, let's go
+ * ahead and log a warning message then clean up the AQIC
+ * resources.
+ */
+ WARN(true,
+ "Unable to reset queue %02x.%04x: not in host AP
configuration\n",
+ AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn));
+ vfio_ap_free_aqic_resources(q);
+ break;
default:
WARN(true,
"PQAP/ZAPQ for %02x.%04x failed with invalid rc=%u\n",