Re: [PATCH 5/5] bus: mhi: host: Remove redundant dma_wmb() before ctx wp update

From: Hemant Kumar
Date: Fri May 06 2022 - 14:02:27 EST

Next message: kernel test robot: "Re: [PATCH v9] platform/chrome: Add ChromeOS ACPI device driver"
Previous message: Mike Kravetz: "Re: [PATCH 2/3] mm: rmap: Fix CONT-PTE/PMD size hugetlb issue when migration"
In reply to: Manivannan Sadhasivam: "Re: [PATCH 5/5] bus: mhi: host: Remove redundant dma_wmb() before ctx wp update"
Next in thread: Manivannan Sadhasivam: "[PATCH 4/5] bus: mhi: host: Rename parse_{rsc/xfer}_event to process{rsc/xfer}_event"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hi Loic,

On 5/6/2022 10:41 AM, Hemant Kumar wrote:

Hi Loic,

On 5/4/2022 8:58 AM, Manivannan Sadhasivam wrote:

On Wed, May 04, 2022 at 11:25:33AM +0200, Loic Poulain wrote:

On Wed, 4 May 2022 at 10:17, Manivannan Sadhasivam
<manivannan.sadhasivam@xxxxxxxxxx> wrote:

Hi Loic,

On Wed, May 04, 2022 at 09:21:20AM +0200, Loic Poulain wrote:

Hi Mani,

On Mon, 2 May 2022 at 12:42, Manivannan Sadhasivam
<manivannan.sadhasivam@xxxxxxxxxx> wrote:

The endpoint device will only read the context wp when the host rings
the doorbell.

Are we sure about this statement? what if we update ctxt_wp while the
device is still processing the previous ring? is it going to continue
processing the new ctxt_wp or wait for a new doorbell interrupt? what
about burst mode in which we don't ring at all (ring_db is no-op)?

Good point. I think my statement was misleading. But still this scenario won't
happen as per my undestanding. Please see below.

And moreover the doorbell write is using writel(). This
guarantess that the prior writes will be completed before ringing
doorbell.

Yes but the barrier is to ensure that descriptor/ring content is
updated before we actually pass it to device ownership, it's not about
ordering with the doorbell write, but the memory coherent ones.

I see a clear data dependency between writing the ring element and updating the
context pointer. For instance,

```
struct mhi_ring_element *mhi_tre;

mhi_tre = ring->wp;
/* Populate mhi_tre */
...

/* Increment wp */
ring->wp += el_size;

/* Update ctx wp */
ring->ctx_wp = ring->iommu_base + (ring->wp - ring->base);
```

This is analogous to:

```
Read PTR A;
Update PTR A;
Increment PTR A;
Write PTR A to PTR B;
```

Interesting point, but shouldn't it be more correct to translate it as:

1. Write PTR A to PTR B (mhi_tre);
2. Update PTR B DATA;
3. Increment PTR A;
4. Write PTR A to PTR C;

In that case, it looks like line 2. has no ordering constraint with 3.
& 4? whereas the following guarantee it:

1. Write PTR A to PTR B (mhi_tre);
2. Update PTR B DATA;
3. Increment PTR A;
dma_wmb()
4. Write PTR A to PTR C;

To be honest, compiler optimization is beyond my knowledge, so I don't
know if a specific compiler arch/version could be able to mess up the
sequence or not. But this pattern is really close to what is described
for dma_wmb() usage in Documentation/memory-barriers.txt. That's why I
challenged this change and would be conservative, keeping the explicit
barrier.

Hmm. Since I was reading the memory model and going through the MHI code, I
_thought_ that this dma_wmb() is redundant. But I missed the fact that the
updating to memory pointed by "wp" happens implicitly via a pointer. So that
won't qualify as a direct dependency.

Here, because of the data dependency due to "ring->wp", the CPU or compiler
won't be ordering the instructions. I think that's one of the reason we never
hit any issue due to this.

You may be right here about the implicit ordering guarantee... So if
you're sure, I think it would deserve an inline comment to explain why
we don't need a memory barrier as in the 'usual' dma descriptor update
sequences.

I think the barrier makes sense now. Sorry for the confusion and thanks for the
explanations.

Thanks,
Mani

Loic

You made a good point. After following your conversation, in case of burst mode is enabled and currently

we are in polling mode, does it make sense to move dma_wmb after updating channel WP context ?

DB ring is going to get skipped when we are in pilling mode.

instead of dma_wmb();
*ring->ctxt_wp = cpu_to_le64(db);

*ring->ctxt_wp = cpu_to_le64(db); dma_wmb();

Thanks,
Hemant

i think i spoke too fast. I think we dont need to worry about the polling mode as the context_wp update would happen at some point of time and that does not require dma_wmb after update context wp.

Thanks,
Hemant

Next message: kernel test robot: "Re: [PATCH v9] platform/chrome: Add ChromeOS ACPI device driver"
Previous message: Mike Kravetz: "Re: [PATCH 2/3] mm: rmap: Fix CONT-PTE/PMD size hugetlb issue when migration"
In reply to: Manivannan Sadhasivam: "Re: [PATCH 5/5] bus: mhi: host: Remove redundant dma_wmb() before ctx wp update"
Next in thread: Manivannan Sadhasivam: "[PATCH 4/5] bus: mhi: host: Rename parse_{rsc/xfer}_event to process{rsc/xfer}_event"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]