Re: [PATCH] ocxl: Allow contexts to be attached with a NULL mm

From: Frederic Barrat
Date: Wed Jun 19 2019 - 06:41:33 EST




Le 18/06/2019 Ã 03:50, Andrew Donnellan a ÃcritÂ:
On 17/6/19 2:41 pm, Alastair D'Silva wrote:
From: Alastair D'Silva <alastair@xxxxxxxxxxx>

If an OpenCAPI context is to be used directly by a kernel driver, there
may not be a suitable mm to use.

The patch makes the mm parameter to ocxl_context_attach optional.

Signed-off-by: Alastair D'Silva <alastair@xxxxxxxxxxx>

The one issue I can see here is that using mm == NULL bypasses our method of enabling/disabling global TLBIs in mm_context_add_copro().

Discussing this privately with Alastair and Fred - this should be fine, but perhaps we should document that.


So indeed we should be fine. I confirmed with Nick that kernel space invalidations are already global today.
Nick mentioned that we should still be fine tomorrow, but in the distant future, we could imagine local usage of some part of the kernel space. It will require some work, but it would be best to add a comment in one of the kernel invalidation function (for example radix__flush_tlb_kernel_range()) that if a kernel invalidation ever becomes local, then clients of the nest MMU may need some work.

A few more comments below.


---
 drivers/misc/ocxl/context.c | 9 ++++++---
 drivers/misc/ocxl/link.c | 12 ++++++++----
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/misc/ocxl/context.c b/drivers/misc/ocxl/context.c
index bab9c9364184..994563a078eb 100644
--- a/drivers/misc/ocxl/context.c
+++ b/drivers/misc/ocxl/context.c
@@ -69,6 +69,7 @@ static void xsl_fault_error(void *data, u64 addr, u64 dsisr)
 int ocxl_context_attach(struct ocxl_context *ctx, u64 amr, struct mm_struct *mm)
 {
ÂÂÂÂÂ int rc;
+ÂÂÂ unsigned long pidr = 0;
ÂÂÂÂÂ // Locks both status & tidr
ÂÂÂÂÂ mutex_lock(&ctx->status_mutex);
@@ -77,9 +78,11 @@ int ocxl_context_attach(struct ocxl_context *ctx, u64 amr, struct mm_struct *mm)
ÂÂÂÂÂÂÂÂÂ goto out;
ÂÂÂÂÂ }
-ÂÂÂ rc = ocxl_link_add_pe(ctx->afu->fn->link, ctx->pasid,
-ÂÂÂÂÂÂÂÂÂÂÂ mm->context.id, ctx->tidr, amr, mm,
-ÂÂÂÂÂÂÂÂÂÂÂ xsl_fault_error, ctx);
+ÂÂÂ if (mm)
+ÂÂÂÂÂÂÂ pidr = mm->context.id;
+
+ÂÂÂ rc = ocxl_link_add_pe(ctx->afu->fn->link, ctx->pasid, pidr, ctx->tidr,
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ amr, mm, xsl_fault_error, ctx);
ÂÂÂÂÂ if (rc)
ÂÂÂÂÂÂÂÂÂ goto out;
diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
index cce5b0d64505..43542f124807 100644
--- a/drivers/misc/ocxl/link.c
+++ b/drivers/misc/ocxl/link.c
@@ -523,7 +523,8 @@ int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr,
ÂÂÂÂÂ pe->amr = cpu_to_be64(amr);
ÂÂÂÂÂ pe->software_state = cpu_to_be32(SPA_PE_VALID);
-ÂÂÂ mm_context_add_copro(mm);
+ÂÂÂ if (mm)
+ÂÂÂÂÂÂÂ mm_context_add_copro(mm);


Same as above, we should add a comment here in the driver code that a kernel context is ok because invalidations are global.


We also need a new check in xsl_fault_handler(). A valid kernel address shouldn't fault, but it's still possible for the FPGA to try accessing a bogus kernel address. In which case, xsl_fault_handler() would be entered, with a valid fault context. We'll find pe_data in the tree based on the valid pe_handle, but pe_data->mm will be NULL. In that, we can return early, acknowledging the interrupt with ADDRESS_ERROR value (like we do if pe_data is not found in the tree).

Fred


ÂÂÂÂÂ /*
ÂÂÂÂÂÂ * Barrier is to make sure PE is visible in the SPA before it
ÂÂÂÂÂÂ * is used by the device. It also helps with the global TLBI
@@ -546,7 +547,8 @@ int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr,
ÂÂÂÂÂÂ * have a reference on mm_users. Incrementing mm_count solves
ÂÂÂÂÂÂ * the problem.
ÂÂÂÂÂÂ */
-ÂÂÂ mmgrab(mm);
+ÂÂÂ if (mm)
+ÂÂÂÂÂÂÂ mmgrab(mm);
ÂÂÂÂÂ trace_ocxl_context_add(current->pid, spa->spa_mem, pasid, pidr, tidr);
 unlock:
ÂÂÂÂÂ mutex_unlock(&spa->spa_lock);
@@ -652,8 +654,10 @@ int ocxl_link_remove_pe(void *link_handle, int pasid)
ÂÂÂÂÂ if (!pe_data) {
ÂÂÂÂÂÂÂÂÂ WARN(1, "Couldn't find pe data when removing PE\n");
ÂÂÂÂÂ } else {
-ÂÂÂÂÂÂÂ mm_context_remove_copro(pe_data->mm);
-ÂÂÂÂÂÂÂ mmdrop(pe_data->mm);
+ÂÂÂÂÂÂÂ if (pe_data->mm) {
+ÂÂÂÂÂÂÂÂÂÂÂ mm_context_remove_copro(pe_data->mm);
+ÂÂÂÂÂÂÂÂÂÂÂ mmdrop(pe_data->mm);
+ÂÂÂÂÂÂÂ }
ÂÂÂÂÂÂÂÂÂ kfree_rcu(pe_data, rcu);
ÂÂÂÂÂ }
 unlock: