Re: [PATCH] i2c: imx: increase retries on arbitration loss

From: Oleksij Rempel
Date: Fri Dec 30 2022 - 11:12:37 EST


On Fri, Dec 30, 2022 at 03:40:58PM +0100, Francesco Dolcini wrote:
> +Wolfram
>
> On Wed, Dec 28, 2022 at 09:01:46AM +0100, Primoz Fiser wrote:
> > On 16. 12. 22 13:51, Francesco Dolcini wrote:
> > > On Fri, Dec 16, 2022 at 01:23:29PM +0100, Primoz Fiser wrote:
> > > > The only solid point in the thread seems to be that in that case we are not
> > > > covering up the potential i2c hardware issues?
> > >
> > > I believe that in this case we should just have a warning in the kernel.
> > > The retry potentially work-around a transient issue and we do not hide any hardware
> > > issue at the same time. It seems an easy win-win solution.
> >
> > I would agree about throwing a warning message in retry case.
> >
> > Not sure how would it affect other i2c bus drivers using retries > 0.
> > Retries might be pretty rare with i2c-imx but some other drivers set this to
> > 5 for example. At least using _ratelimited printk is a must using this
> > approach.
>
> Wolfram, Uwe, Oleksij
>
> Would it be acceptable to have a warning when we have I2C retries, and
> with that in place enabling retries on the imx driver?
>
> It exists hardware that requires this to work correctly,

Well, this is a persistent confusion in this monologue. Retries will not make
it work correctly.

> and at a
> minimum setting the retry count from user space is not going to solve
> potential issues during initial driver probe.

I assume this is not clear from a programmer's point of view. Let's try it
another way:

- The I2C slave could not correctly interpret the data on SDA because the SDA
high or low-level voltages do not reach its appropriate input
thresholds.

This means:

You have this:

    /-\    /-\   ----- 2.5Vcc
___/   \__/   \___

Instead of this:

     /-\      /-\   ----- 3.3Vcc
    /   \    /   \
___/     \__/     \___

This is bad, because the master or slave will not be able to interpret the peak
level correctly. It may sometimes see 0 instead of 1. This means that whatever
we are writing to the slave or reading from the slave is potentially corrupt,
and only __sometimes__ is the master able to detect it.

- The I2C slave missed an SCL cycle because the SCL high or low-level voltages
do not reach its appropriate input thresholds.

This means the bus frequency is too high for the current configuration or for
the physical PCB design. So you will have different kinds of corruption, and
only sometimes will they be detected.

- The I2C slave accidentally interpreted a spike etc. as an SCL cycle.

This means the noise level is too high. The drive strength should be increased
or the PCB should be redesigned. Maybe there are more options. If this is not
done, data corruption can be expected.

None of these issues can be "fixed" by retries or made more "robust".
Doing more retries means: we repeat whatever we do until the system is no
longer able to detect the error.
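
To put rough numbers on that, here is a minimal sketch of the argument. It is
not kernel code. It assumes every attempt independently ends in a detected
error (probability p_det, the case that gets retried), an undetected
corruption (probability p_undet), or a correct transfer. The names xfer_stats
and model_retries, and the independence assumption itself, are mine and purely
illustrative.

```c
/*
 * Hypothetical model of "retry on detected error", not taken from any driver.
 * Assumption: each attempt is independent and ends in exactly one of
 *   - detected error      (probability p_det, triggers a retry)
 *   - undetected corrupt  (probability p_undet, looks like success)
 *   - correct transfer    (probability 1 - p_det - p_undet)
 */
struct xfer_stats {
	double reported_fail;  /* caller finally sees an error */
	double accepted_bad;   /* caller sees success, data is corrupt */
	double accepted_good;  /* caller sees success, data is correct */
};

struct xfer_stats model_retries(double p_det, double p_undet, int retries)
{
	struct xfer_stats s = { 0.0, 0.0, 0.0 };
	double reach = 1.0;  /* probability that this attempt is made at all */
	int attempt;

	for (attempt = 0; attempt <= retries; attempt++) {
		/* Outcomes that end the loop with "success" for the caller. */
		s.accepted_bad += reach * p_undet;
		s.accepted_good += reach * (1.0 - p_det - p_undet);
		/* Only a detected error leads to another attempt. */
		reach *= p_det;
	}

	/* Every attempt, including the last retry, hit a detected error. */
	s.reported_fail = reach;
	return s;
}
```

With made-up values such as p_det = 0.10 and p_undet = 0.01, going from 0 to 5
retries makes the reported failures almost disappear, while the silently
accepted corrupt transfers go up slightly. In this model the share of accepted
transfers that are corrupt stays at p_undet / (1 - p_det) for any retry count.
The errors are only hidden, not removed.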

> To me the only reasonable solution is to have the retry enabled with a
> sensible number (3? 5?), however there is a concern that this might
> hide real hardware issues.

There is a real hardware issue.

Regards,
Oleksij
--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |