Re: [PATCH v8 1/8] pwm: pca9685: Switch to atomic API
From: Clemens Gruber
Date:  Tue Apr 13 2021 - 08:17:33 EST
On Tue, Apr 13, 2021 at 02:11:38PM +0200, Clemens Gruber wrote:
> Hi Uwe,
> 
> On Mon, Apr 12, 2021 at 10:10:19PM +0200, Uwe Kleine-König wrote:
> > Hello Clemens,
> > 
> > On Mon, Apr 12, 2021 at 06:39:28PM +0200, Clemens Gruber wrote:
> > > On Mon, Apr 12, 2021 at 06:18:08PM +0200, Uwe Kleine-König wrote:
> > > > On Mon, Apr 12, 2021 at 03:27:38PM +0200, Clemens Gruber wrote:
> > > > > The switch to the atomic API goes hand in hand with a few fixes to
> > > > > previously experienced issues:
> > > > > - The duty cycle is no longer lost after disable/enable (previously the
> > > > >   OFF registers were cleared in disable and the user was required to
> > > > >   call config to restore the duty cycle settings)
> > > > > - If one sets a period resulting in the same prescale register value,
> > > > >   the sleep and write to the register is now skipped
> > > > > - Previously, only the full ON bit was toggled in GPIO mode (and full
> > > > >   OFF cleared if set to high), which could result in both full OFF and
> > > > >   full ON not being set and on=0, off=0, which is not allowed according
> > > > >   to the datasheet
> > > > > - The OFF registers were reset to 0 in probe, which could lead to the
> > > > >   forbidden on=0, off=0. Fixed by resetting to POR default (full OFF)
> > > > > 
> > > > > Signed-off-by: Clemens Gruber <clemens.gruber@xxxxxxxxxxxx>
> > > > > ---
> > > > > Changes since v7:
> > > > > - Moved check for !state->enabled before prescaler configuration
> > > > > - Removed unnecessary cast
> > > > > - Use DIV_ROUND_DOWN in .apply
> > > > > 
> > > > > Changes since v6:
> > > > > - Order of a comparison switched for improved readability
> > > > > 
> > > > > Changes since v5:
> > > > > - Function documentation for set_duty
> > > > > - Variable initializations
> > > > > - Print warning if all LEDs channel
> > > > > - Changed EOPNOTSUPP to EINVAL
> > > > > - Improved error messages
> > > > > - Register reset corrections moved to this patch
> > > > > 
> > > > > Changes since v4:
> > > > > - Patches split up
> > > > > - Use a single set_duty function
> > > > > - Improve readability / new macros
> > > > > - Added a patch to restrict prescale changes to the first user
> > > > > 
> > > > > Changes since v3:
> > > > > - Refactoring: Extracted common functions
> > > > > - Read prescale register value instead of caching it
> > > > > - Return all zeros and disabled for "all LEDs" channel state
> > > > > - Improved duty calculation / mapping to 0..4096
> > > > > 
> > > > > Changes since v2:
> > > > > - Always set default prescale value in probe
> > > > > - Simplified probe code
> > > > > - Inlined functions with one callsite
> > > > > 
> > > > > Changes since v1:
> > > > > - Fixed a logic error
> > > > > - Impoved PM runtime handling and fixed !CONFIG_PM
> > > > > - Write default prescale reg value if invalid in probe
> > > > > - Reuse full_off/_on functions throughout driver
> > > > > - Use cached prescale value whenever possible
> > > > > 
> > > > >  drivers/pwm/pwm-pca9685.c | 259 +++++++++++++-------------------------
> > > > >  1 file changed, 89 insertions(+), 170 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/pwm/pwm-pca9685.c b/drivers/pwm/pwm-pca9685.c
> > > > > index 4a55dc18656c..827b57ced3c2 100644
> > > > > --- a/drivers/pwm/pwm-pca9685.c
> > > > > +++ b/drivers/pwm/pwm-pca9685.c
> > > > > @@ -51,7 +51,6 @@
> > > > >  #define PCA9685_PRESCALE_MAX	0xFF	/* => min. frequency of 24 Hz */
> > > > >  
> > > > >  #define PCA9685_COUNTER_RANGE	4096
> > > > > -#define PCA9685_DEFAULT_PERIOD	5000000	/* Default period_ns = 1/200 Hz */
> > > > >  #define PCA9685_OSC_CLOCK_MHZ	25	/* Internal oscillator with 25 MHz */
> > > > >  
> > > > >  #define PCA9685_NUMREGS		0xFF
> > > > > @@ -71,10 +70,14 @@
> > > > >  #define LED_N_OFF_H(N)	(PCA9685_LEDX_OFF_H + (4 * (N)))
> > > > >  #define LED_N_OFF_L(N)	(PCA9685_LEDX_OFF_L + (4 * (N)))
> > > > >  
> > > > > +#define REG_ON_H(C)	((C) >= PCA9685_MAXCHAN ? PCA9685_ALL_LED_ON_H : LED_N_ON_H((C)))
> > > > > +#define REG_ON_L(C)	((C) >= PCA9685_MAXCHAN ? PCA9685_ALL_LED_ON_L : LED_N_ON_L((C)))
> > > > > +#define REG_OFF_H(C)	((C) >= PCA9685_MAXCHAN ? PCA9685_ALL_LED_OFF_H : LED_N_OFF_H((C)))
> > > > > +#define REG_OFF_L(C)	((C) >= PCA9685_MAXCHAN ? PCA9685_ALL_LED_OFF_L : LED_N_OFF_L((C)))
> > > > 
> > > > I'd like to see these named PCA9685_REG_ON_H etc.
> > > 
> > > I did not use the prefix because the existing LED_N_ON/OFF_H/L also do
> > > not have a prefix. If the prefix is mandatory, I think LED_N_.. should
> > > also be prefixed, right?
> > 
> > I'd like to seem the prefixed (and assume that Thierry doesn't agree).
> > IMHO it's good style and even though it yields longer name usually it
> > yields easier understandable code. (But this seems to be subjective.)
> 
> I am not sure I want to also rename the existing LED_N_OFF stuff in this
> patch. Maybe we can discuss unifying the macros (either with or without
> prefix) in a later patch and I keep the REG_ON_ stuff for now without to
> match the LED_N_ stuff?
> 
> > 
> > > > > +	val = 0;
> > > > > +	regmap_read(pca->regmap, LED_N_OFF_L(channel), &val);
> > > > 
> > > > I asked in the last round why you initialize val. You answered "just to
> > > > have it set to 0 in case regmap_read fails / val was not set." I wonder
> > > > if
> > > > 
> > > > 	ret = regmap_read(pca->regmap, LED_N_OFF_L(channel), &val);
> > > > 	if (!ret)
> > > > 		/* 
> > 
> > I intended to write something like
> > 
> > 		/* initialize val in case reading LED_N_OFF failed */
> > 
> > > > 		val = 0
> > > > 
> > > > would be better then and also make the intention obvious.
> 
> Maybe a little bit better.. I can change it.
> 
> > > 
> > > I am not sure if that's more clear, but if others find it more obvious
> > > like this, I can change it.
> > > 
> > > > 
> > > > > +	return ((off_h & 0xf) << 8) | (val & 0xff);
> > > > > +}
> > > > > +
> > > > >  #if IS_ENABLED(CONFIG_GPIOLIB)
> > > > >  static bool pca9685_pwm_test_and_set_inuse(struct pca9685 *pca, int pwm_idx)
> > > > >  {
> > > > > @@ -138,34 +186,23 @@ static int pca9685_pwm_gpio_request(struct gpio_chip *gpio, unsigned int offset)
> > > > >  static int pca9685_pwm_gpio_get(struct gpio_chip *gpio, unsigned int offset)
> > > > >  {
> > > > >  	struct pca9685 *pca = gpiochip_get_data(gpio);
> > > > > -	struct pwm_device *pwm = &pca->chip.pwms[offset];
> > > > > -	unsigned int value;
> > > > >  
> > > > > -	regmap_read(pca->regmap, LED_N_ON_H(pwm->hwpwm), &value);
> > > > > -
> > > > > -	return value & LED_FULL;
> > > > > +	return pca9685_pwm_get_duty(pca, offset) != 0;
> > > > >  }
> > > > >  
> > > > >  static void pca9685_pwm_gpio_set(struct gpio_chip *gpio, unsigned int offset,
> > > > >  				 int value)
> > > > >  {
> > > > >  	struct pca9685 *pca = gpiochip_get_data(gpio);
> > > > > -	struct pwm_device *pwm = &pca->chip.pwms[offset];
> > > > > -	unsigned int on = value ? LED_FULL : 0;
> > > > > -
> > > > > -	/* Clear both OFF registers */
> > > > > -	regmap_write(pca->regmap, LED_N_OFF_L(pwm->hwpwm), 0);
> > > > > -	regmap_write(pca->regmap, LED_N_OFF_H(pwm->hwpwm), 0);
> > > > >  
> > > > > -	/* Set the full ON bit */
> > > > > -	regmap_write(pca->regmap, LED_N_ON_H(pwm->hwpwm), on);
> > > > > +	pca9685_pwm_set_duty(pca, offset, value ? PCA9685_COUNTER_RANGE : 0);
> > > > >  }
> > > > >  
> > > > >  static void pca9685_pwm_gpio_free(struct gpio_chip *gpio, unsigned int offset)
> > > > >  {
> > > > >  	struct pca9685 *pca = gpiochip_get_data(gpio);
> > > > >  
> > > > > -	pca9685_pwm_gpio_set(gpio, offset, 0);
> > > > > +	pca9685_pwm_set_duty(pca, offset, 0);
> > > > >  	pm_runtime_put(pca->chip.dev);
> > > > >  	pca9685_pwm_clear_inuse(pca, offset);
> > > > >  }
> > > > > @@ -246,167 +283,52 @@ static void pca9685_set_sleep_mode(struct pca9685 *pca, bool enable)
> > > > >  	}
> > > > >  }
> > > > >  
> > > > > -static int pca9685_pwm_config(struct pwm_chip *chip, struct pwm_device *pwm,
> > > > > -			      int duty_ns, int period_ns)
> > > > > +static int pca9685_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
> > > > > +			     const struct pwm_state *state)
> > > > >  {
> > > > > [...]
> > > > > +	prescale = DIV_ROUND_CLOSEST_ULL(PCA9685_OSC_CLOCK_MHZ * state->period,
> > > > > +					 PCA9685_COUNTER_RANGE * 1000) - 1;
> > > > > +	if (prescale < PCA9685_PRESCALE_MIN || prescale > PCA9685_PRESCALE_MAX) {
> > > > > +		dev_err(chip->dev, "pwm not changed: period out of bounds!\n");
> > > > > +		return -EINVAL;
> > > > >  	}
> > > > >  
> > > > > [...]
> > > > > +	if (!state->enabled) {
> > > > > +		pca9685_pwm_set_duty(pca, pwm->hwpwm, 0);
> > > > >  		return 0;
> > > > >  	}
> > > > >  
> > > > > [...]
> > > > > +	regmap_read(pca->regmap, PCA9685_PRESCALE, &val);
> > > > > +	if (prescale != val) {
> > > > > +		/*
> > > > > +		 * Putting the chip briefly into SLEEP mode
> > > > > +		 * at this point won't interfere with the
> > > > > +		 * pm_runtime framework, because the pm_runtime
> > > > > +		 * state is guaranteed active here.
> > > > > +		 */
> > > > > +		/* Put chip into sleep mode */
> > > > > +		pca9685_set_sleep_mode(pca, true);
> > > > >  
> > > > > [...]
> > > > > +		/* Change the chip-wide output frequency */
> > > > > +		regmap_write(pca->regmap, PCA9685_PRESCALE, prescale);
> > > > >  
> > > > > -	regmap_update_bits(pca->regmap, reg, LED_FULL, 0x0);
> > > > > +		/* Wake the chip up */
> > > > > +		pca9685_set_sleep_mode(pca, false);
> > > > > +	}
> > > > >  
> > > > > +	duty = PCA9685_COUNTER_RANGE * state->duty_cycle;
> > > > > +	duty = DIV_ROUND_DOWN_ULL(duty, state->period);
> > > > 
> > > > If you round down here you should probably also round down in the
> > > > calculation of prescale. Also note that you're losing precision by using
> > > > state->period.
> > > > 
> > > > Consider the following request: state->period = 4177921 [ns] +
> > > > state->duty_cycle = 100000 [ns], then we get
> > > > PRESCALE = round(25 * state->period / 4096000) - 1 = 25 and so an actual
> > > > period of 4096000 / 25 * (25 + 1) = 4259840 [ns]. If you now calculate
> > > > the duty using 4096 * 100000 / 4177920 = 98, this corresponds to an
> > > > actual duty cycle of 98 * 4259840 / 4096 = 101920 ns while you should
> > > > actually configure 96 to get 99840 ns.
> > > > 
> > > > So in the end I'd like to have the following:
> > > > 
> > > > 	PRESCALE = round-down(25 * state->period / 4096000) - 1
> > > > 
> > > > (to get the biggest period not bigger than state->period) and
> > > > 
> > > > 	duty = round-down(state->duty_cycle * 25 / ((PRESCALE + 1) * 1000))
> > > > 
> > > > (to get the biggest duty cycle not bigger than state->duty_cycle). With
> > > > the example above this yields
> > > > 
> > > > 	PRESCALE = 24
> > > > 	duty = 100
> > > > 
> > > > which results in
> > > > 
> > > > 	.period = 4096000 / 25 * 25 = 4096000 [ns]
> > > > 	.duty_cycle = 100000 [ns]
> > > > 	
> > > > Now you have a mixture of old and new with no consistent behaviour. So
> > > > please either stick to the old behaviour or do it right immediately.
> > > 
> > > I avoided rounding down the prescale value because the datasheet has an
> > > example where a round-closest is used, see page 25.
> > 
> > The hardware guy who wrote this data sheet wasn't aware of the rounding
> > rules for Linux PWM drivers :-)
> > 
> > > With your suggested round-down, the example with frequency of 200 Hz
> > > would no longer result in 30 but 29 and that contradicts the datasheet.
> > 
> > Well, with PRESCALE = 30 we get a frequency of 196.88 Hz and with
> > PRESCALE = 29 we get a frequency of 203.45 Hz. So no matter if you pick
> > 29 or 30, you don't get 200 Hz. And which of the two possible values is
> > the better one depends on the consumer, no matter what rounding
> > algorithm the data sheet suggests. Also note that the math here contains
> > surprises you don't expect at first. For example, what PRESCALE value
> > would you pick to get 284 Hz? [If my mail was a video, I'd suggest to
> > press Space now to pause and let you think first :-)] The data sheet's
> > formula suggests:
> > 
> > 	round(25 MHz / (4096 * 284)) - 1 = 20
> > 
> > The resulting frequency when picking PRESCALE = 20 is 290.644 Hz (so an
> > error of 6.644 Hz). If instead you pick PRESCALE = 21 you get 277.433 Hz
> > (error = 6.567 Hz), so 21 is the better choice.
> > 
> > Exercise for the reader:
> >  What is the correct formula to really determine the PRESCALE value that
> >  yields the best approximation (i.e. minimizing
> >  abs(real_freq - target_freq)) for a given target_freq?
> > 
> > These things don't happen when you round down only.
> 
> Sure, but it might also be counterintuitive that the Linux driver does
> not use the same formula as the datasheet. And when using 200 Hz, 29 is
> a little closer than 30.
Sorry, I meant: 30 (196.88 Hz) is a little closer than 29 (203.45 Hz).
> I once measured the actual frequency and the internal oscillator is not
> very accurate, so even if you think you should get 196.88 Hz, the actual
> frequency measured with a decent scope is about 206 Hz and varies from
> chip to chip (~ 205-207 Hz).
> 
> > 
> > > So would you rather have me keep the old duty rounding behaviour?
> > > 
> > > Meaning: Keep rounding up the duty calculation in apply and use
> > > round-down in the new .get_state function?
> > 
> > There are two things I want:
> > 
> >  a) To improve consistency among the PWM drivers (and to keep the math
> >     simple and unsurprising), the pca9685 driver should use round-down
> >     instead of round-nearest (or whatever mix it is currently using).
> > 
> >  b) .get_state should be the inverse to .apply in the sense that
> >     applying the result of .get_state is idempotent.
> > 
> > I don't care much how you get there, so it's up to you if you do that in
> > this patch that converts to .apply, or if you keep the math as is and
> > then adapt the rounding behaviour in a separate change. But changing the
> > algorithm in this patch and not getting to the "good" one is ugly, so
> > please don't do that.
> 
> OK, then I think the best option is to keep the math as it was before
> and if we want to change the rounding behaviour we do this in a separate
> patch in the future. Then we can continue the discussion wether changing
> the prescaler formula to round-down even though the datasheet does it
> otherwise is the way to go.
> 
> Thanks,
> Clemens