Re: [PATCH v4] ucsi_ccg: Refine the UCSI Interrupt handling

From: Jon Hunter
Date: Tue Feb 21 2023 - 11:41:51 EST


Hi Greg,

On 19/01/2023 12:28, Greg Kroah-Hartman wrote:
On Wed, Jan 18, 2023 at 02:15:23PM +0800, Haotien Hsu wrote:
From: Sing-Han Chen <singhanc@xxxxxxxxxx>

For the CCGx, when the OPM field in the INTR_REG is cleared, then the
CCI data in the PPM is reset.

To align with the CCGx UCSI interface guide, this patch updates the
driver to copy CCI and MESSAGE_IN before clearing UCSI interrupt.
When a new command is sent, the driver will clear the old CCI and
MESSAGE_IN copy.

Finally, clear UCSI_READ_INT before calling complete() to ensure that
the ucsi_ccg_sync_write() would wait for the interrupt handling to
complete.
It prevents the driver from resetting CCI prematurely.

Signed-off-by: Sing-Han Chen <singhanc@xxxxxxxxxx>
Signed-off-by: Haotien Hsu <haotienh@xxxxxxxxxx>
---
V1->V2
- Fix uninitialized symbol 'cci'
v2->v3
- Remove misusing Reported-by tags
v3->v4
- Add comments for op_lock
---
drivers/usb/typec/ucsi/ucsi_ccg.c | 90 ++++++++++++++++++++++++++++---
1 file changed, 83 insertions(+), 7 deletions(-)

diff --git a/drivers/usb/typec/ucsi/ucsi_ccg.c b/drivers/usb/typec/ucsi/ucsi_ccg.c
index eab3012e1b01..532813a32cc1 100644
--- a/drivers/usb/typec/ucsi/ucsi_ccg.c
+++ b/drivers/usb/typec/ucsi/ucsi_ccg.c
@@ -192,6 +192,12 @@ struct ucsi_ccg_altmode {
bool checked;
} __packed;
+#define CCGX_MESSAGE_IN_MAX 4
+struct op_region {
+ u32 cci;

This is coming from hardware so you have to specify the endian-ness of
it, right?

The current driver reads the 'cci' state in the ccg_irq_handler and here we just pass a variable of type u32 for storing the state. We are just adding variable of the same type to save the state. This value is returned to the ucsi layer which does not specify the endian-ness either. I guess this driver like many assume little endian. What is the guidance here? Should we be adding __le32 here even if the upper layers don't?

+ u32 message_in[CCGX_MESSAGE_IN_MAX];

Same here.

+};
+
struct ucsi_ccg {
struct device *dev;
struct ucsi *ucsi;
@@ -222,6 +228,13 @@ struct ucsi_ccg {
bool has_multiple_dp;
struct ucsi_ccg_altmode orig[UCSI_MAX_ALTMODES];
struct ucsi_ccg_altmode updated[UCSI_MAX_ALTMODES];
+
+ /*
+ * This spinlock protects op_data which includes CCI and MESSAGE_IN that
+ * will be updated in ISR
+ */
+ spinlock_t op_lock;
+ struct op_region op_data;
};
static int ccg_read(struct ucsi_ccg *uc, u16 rab, u8 *data, u32 len)
@@ -305,12 +318,57 @@ static int ccg_write(struct ucsi_ccg *uc, u16 rab, const u8 *data, u32 len)
return 0;
}
+static void ccg_op_region_read(struct ucsi_ccg *uc, unsigned int offset,
+ void *val, size_t val_len)
+{
+ struct op_region *data = &uc->op_data;
+
+ spin_lock(&uc->op_lock);
+ if (offset == UCSI_CCI)
+ memcpy(val, &data->cci, val_len);
+ else if (offset == UCSI_MESSAGE_IN)
+ memcpy(val, &data->message_in, val_len);

What happens if the offset is neither of these?

Looking at where this is called, currently only these offsets are passed to this function. However, I am wondering if we really need this function and if we just don't collapse this into ucsi_ccg_read() so we have ...

if (offset == UCSI_CCI) {
spin_lock(&uc->op_lock);
memcpy(val, &uc->op_data.cci, val_len);
spin_unlock(&uc->op_lock);
} else if (offset == UCSI_MESSAGE_IN) {
spin_lock(&uc->op_lock);
memcpy(val, &uc->op_data.message_in, val_len);
spin_unlock(&uc->op_lock);
} else {
ret = ccg_read(uc, reg, val, val_len);
}

You seem to be only calling this if that value is set correctly, but
this seems very fragile. You are also only calling this in one place,
so why is this a function at all? Just do the copy under the lock as
needed in the calling location instead.

+ spin_unlock(&uc->op_lock);
+}
+
+static void ccg_op_region_update(struct ucsi_ccg *uc, u32 cci)
+{
+ u16 reg = CCGX_RAB_UCSI_DATA_BLOCK(UCSI_MESSAGE_IN);
+ struct op_region *data = &uc->op_data;
+ u32 message_in[CCGX_MESSAGE_IN_MAX];

Are you sure you can put this big of a buffer on the stack?

It is 16 bytes total and so we did not think it was too big.

+
+ if (UCSI_CCI_LENGTH(cci))
+ if (ccg_read(uc, reg, (void *)&message_in,
+ sizeof(message_in))) {

Are you allowed to copy in into stack memory? This ends up being an i2c
message, right? Can that be transferred into non-dma-able memory?

Yes the existing callers of ccg_read() are also using buffers on the stack for reading the data into.

+ dev_err(uc->dev, "failed to read MESSAGE_IN\n");

Why can you not fail this function? You are throwing away the error
here, that's not good.

Agree. We can take a look at this.

+ return;
+ }
+
+ spin_lock(&uc->op_lock);
+ memcpy(&data->cci, &cci, sizeof(cci));

Perhaps just:
data->cci = cci;
as this is only a 32bit value.

Agree.

+ if (UCSI_CCI_LENGTH(cci))
+ memcpy(&data->message_in, &message_in, sizeof(message_in));
+ spin_unlock(&uc->op_lock);
+}
+
+static void ccg_op_region_clean(struct ucsi_ccg *uc)
+{
+ struct op_region *data = &uc->op_data;
+
+ spin_lock(&uc->op_lock);
+ memset(&data->cci, 0, sizeof(data->cci));

data->cci = 0;

+ memset(&data->message_in, 0, sizeof(data->message_in));

Or better yet, do it all at once:
memset(&data, 0, sizeof(*data));

+ spin_unlock(&uc->op_lock);

But why do you need to do this at all? Why "clean" the whole buffer
out, why not just set cci to 0 and be done with it?

Or why even clean this out at all, what happens if you do not?


I have been taking a look at this. If we don't clean the variable and buffer, then the previous state could be incorrectly read again after the next command has been sent.

Without this fix we occasionally see timeout errors such as ...

ucsi_ccg 2-0008: error -ETIMEDOUT: PPM init failed (-110)


I tried not doing this at all, but then we see these timeout issues are still seen.

+}
+
static int ucsi_ccg_init(struct ucsi_ccg *uc)
{
unsigned int count = 10;
u8 data;
int status;
+ spin_lock_init(&uc->op_lock);
+
data = CCGX_RAB_UCSI_CONTROL_STOP;
status = ccg_write(uc, CCGX_RAB_UCSI_CONTROL, &data, sizeof(data));
if (status < 0)
@@ -520,9 +578,13 @@ static int ucsi_ccg_read(struct ucsi *ucsi, unsigned int offset,
u16 reg = CCGX_RAB_UCSI_DATA_BLOCK(offset);
struct ucsi_capability *cap;
struct ucsi_altmode *alt;
- int ret;
+ int ret = 0;
+
+ if ((offset == UCSI_CCI) || (offset == UCSI_MESSAGE_IN))
+ ccg_op_region_read(uc, offset, val, val_len);
+ else
+ ret = ccg_read(uc, reg, val, val_len);
- ret = ccg_read(uc, reg, val, val_len);
if (ret)
return ret;
@@ -559,9 +621,13 @@ static int ucsi_ccg_read(struct ucsi *ucsi, unsigned int offset,
static int ucsi_ccg_async_write(struct ucsi *ucsi, unsigned int offset,
const void *val, size_t val_len)
{
+ struct ucsi_ccg *uc = ucsi_get_drvdata(ucsi);
u16 reg = CCGX_RAB_UCSI_DATA_BLOCK(offset);
- return ccg_write(ucsi_get_drvdata(ucsi), reg, val, val_len);
+ if (offset == UCSI_CONTROL)
+ ccg_op_region_clean(uc);

Why is this needed? You have not documented it the need for this.

When we send a new command we need to clear out the previous state, if we don't we are seeing those timeouts. When we issue the next control command we are expecting a new state and so it does make sense to clear it here.

In general, I think we can improve this patch and add some more comments. I will work with Haotien and Sing-Han on this.

Cheers
Jon

--
nvpublic