Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support

From: Jyothi Kumar Seerapu
Date: Thu Jun 19 2025 - 12:16:54 EST

Next message: Sean Anderson: "Re: [PATCH] driver core: Prevent deferred probe loops"
Previous message: Dev Jain: "Re: [PATCH 1/6] mm/selftests: Fix virtual_address_range test issues."
In reply to: Dmitry Baryshkov: "Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 6/18/2025 1:02 AM, Dmitry Baryshkov wrote:

On Tue, 17 Jun 2025 at 17:11, Jyothi Kumar Seerapu
<quic_jseerapu@xxxxxxxxxxx> wrote:

On 5/30/2025 10:12 PM, Dmitry Baryshkov wrote:

On Fri, May 30, 2025 at 07:36:05PM +0530, Jyothi Kumar Seerapu wrote:

On 5/21/2025 6:15 PM, Dmitry Baryshkov wrote:

On Wed, May 21, 2025 at 03:58:48PM +0530, Jyothi Kumar Seerapu wrote:

On 5/9/2025 9:31 PM, Dmitry Baryshkov wrote:

On 09/05/2025 09:18, Jyothi Kumar Seerapu wrote:

Hi Dimitry, Thanks for providing the review comments.

On 5/6/2025 5:16 PM, Dmitry Baryshkov wrote:

On Tue, May 06, 2025 at 04:48:44PM +0530, Jyothi Kumar Seerapu wrote:

The I2C driver gets an interrupt upon transfer completion.
When handling multiple messages in a single transfer, this
results in N interrupts for N messages, leading to significant
software interrupt latency.

To mitigate this latency, utilize Block Event Interrupt (BEI)
mechanism. Enabling BEI instructs the hardware to prevent interrupt
generation and BEI is disabled when an interrupt is necessary.

Large I2C transfer can be divided into chunks of 8 messages internally.
Interrupts are not expected for the first 7 message completions, only
the last message triggers an interrupt, indicating the completion of
8 messages. This BEI mechanism enhances overall transfer efficiency.

Why do you need this complexity? Is it possible to set the
DMA_PREP_INTERRUPT flag on the last message in the transfer?

If i undertsand correctly, the suggestion is to get the single
intetrrupt for last i2c message only.

But With this approach, we can't handle large number of i2c messages
in the transfer.

In GPI driver, number of max TREs support is harcoded to 64 (#define
CHAN_TRES 64) and for I2C message, we need Config TRE, GO TRE and
DMA TREs. So, the avilable TREs are not sufficient to handle all the
N messages.

It sounds like a DMA driver issue. In other words, the DMA driver can
know that it must issue an interrupt before exausting 64 TREs in order
to

Here, the plan is to queue i2c messages (QCOM_I2C_GPI_MAX_NUM_MSGS
or 'num' incase for less messsages), process and unmap/free upon the
interrupt based on QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.

Why? This is some random value which has no connection with CHAN_TREs.
Also, what if one of the platforms get a 'liter' GPI which supports less
TREs in a single run? Or a super-premium platform which can use 256
TREs? Please don't workaround issues from one driver in another one.

We are trying to utilize the existing CHAN_TRES mentioned in the GPI driver.
With the following approach, the GPI hardware can process N number of I2C
messages, thereby improving throughput and transfer efficiency.

The main design consideration for using the block event interrupt is as
follows:

Allow the hardware to process the TREs (I2C messages), while the software
concurrently prepares the next set of TREs to be submitted to the hardware.
Once the TREs are processed, they can be freed, enabling the software to
queue new TREs. This approach enhances overall optimization.

Please let me know if you have any questions, concerns, or suggestions.

The question was why do you limit that to QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
What is the reason for that limit, etc. If you think about it, The GENI
/ I2C doesn't impose any limit on the number of messages processed in
one go (if I understand it correctly). Instead the limit comes from the
GPI DMA driver. As such, please don't add extra 'handling' to the I2C
driver. Make GPI DMA driver responsible for saying 'no more for now',
then I2C driver can setup add an interrupt flag and proceed with
submitting next messages, etc.

For I2C messages, we need to prepare TREs for Config, Go and DMAs. However,
if a large number of I2C messages are submitted then may may run out of
memory for serving the TREs. The GPI channel supports a maximum of 64 TREs,
which is insufficient to serve 32 or even 16 I2C messages concurrently,
given the multiple TREs required per message.

To address this limitation, a strategy has been implemented to manage how
many messages can be queued and how memory is recycled. The constant
QCOM_I2C_GPI_MAX_NUM_MSGS is set to 16, defining the upper limit of
messages that can be queued at once. Additionally,
QCOM_I2C_GPI_NUM_MSGS_PER_IRQ is set to 8, meaning that
half of the queued messages are expected to be freed or deallocated per
interrupt.
This approach ensures that the driver can efficiently manage TRE resources
and continue queuing new I2C messages without exhausting memory.

I really don't see a reason for additional complicated handling in the
geni driver that you've implemented. Maybe I misunderstand something. In
such a case it usually means that you have to explain the design in the
commit message / in-code comments.

The I2C Geni driver is designed to prepare and submit descriptors to the GPI
driver one message at a time.
As a result, the GPI driver does not have visibility into the current
message index or the total number of I2C messages in a transfer. This lack
of context makes it challenging to determine when to set the block event
interrupt, which is typically used to signal the completion of a batch of
messages.

So, the responsibility for deciding when to set the BEI should lie with the
I2C driver.

If this approach is acceptable, I will proceed with updating the relevant
details in the commit message.

Please let me know if you have any concerns or suggestions.

Hi Dmitry, Sorry for the delayed response, and thank you for the
suggestions.

- Make gpi_prep_slave_sg() return NULL if flags don't have
DMA_PREP_INTERRUPT flag and there are no 3 empty TREs for the
interrupt-enabled transfer.

"there are no 3 empty TREs for the interrupt-enabled transfer."
Could you please help me understand this a bit better?

In the GPI driver you know how many TREs are available. In
gpi_prep_slave_sg() you can check that and return an error if there
are not enough TREs available.

- If I2C driver gets NULL from dmaengine_prep_slave_single(), retry
again, adding DMA_PREP_INTERRUPT. Make sure that the last one always
gets DMA_PREP_INTERRUPT.

Does this mean we need to proceed to the next I2C message and ensure
that the DMA_PREP_INTERRUPT flag is set for the last I2C message in each
chunk? And then, should we submit the chunk of messages to the GSI
hardware for processing?

No. You don't have to peek at the next I2C message. This all concerns
the current I2C message. The only point where you have to worry is to
explicitly set the flag for the last message.

- In geni_i2c_gpi_xfer() split the loop to submit messages until you
can, then call wait_for_completion_timeout() and then
geni_i2c_gpi_unmap() for submitted messages, then continue with a new
portion of messages.

Since the GPI channel supports a maximum of 64 TREs, should we consider
submitting a smaller number of predefined messages — perhaps fewer than
32, such as 16?

Why? Just submit messages until they fit, then flush the DMA async channel.

This is because handling 32 messages would require one TRE for config
and 64 TREs for the Go and DMA preparation steps, which exceeds the
channel's TRE capacity of 64.

We designed the approach to submit a portion of the messages — for
example, 16 at a time. Once 8 messages are processed and freed, the
hardware can continue processing the TREs, while the software
simultaneously prepares the next set of TREs. This parallelism helps in
efficiently utilizing the hardware and enhances overall system
optimization.

And this overcomplicates the driver and introduces artificial
limitations which need explanation. Please fix it in a simple way
first. Then you can e.g. implement the watermark at the half of the
GPI channel depth and request DMA_PREP_INTERRUPT to be set in the
middle of the full sequence, allowing it to be used asynchronously in
the background.

Okay, will review it. Thanks.

Next message: Sean Anderson: "Re: [PATCH] driver core: Prevent deferred probe loops"
Previous message: Dev Jain: "Re: [PATCH 1/6] mm/selftests: Fix virtual_address_range test issues."
In reply to: Dmitry Baryshkov: "Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]