Re: [PATCH v3 2/3] hwmon: da9063: HWMON driver

From: Guenter Roeck
Date: Sun Jul 11 2021 - 00:45:31 EST


On 7/10/21 7:55 PM, Vincent Pelletier wrote:
Hello,

Thanks a lot for this new review (and sorry for the previous
very-incomplete send, unfortunate keyboard shortcut and sleepy fingers).

On Sat, 10 Jul 2021 09:08:13 -0700, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
Unnecessary include.
[...]
I don't immediately see where this include is needed. Is this a
leftover ?
[...]
Same here.

Are there ways to systematically tell which includes are useless
besides commenting them out all and uncommenting until it compiles ?
(if that is even a good idea)


I am sure there are, but I don't know any pointers. Either case, commenting
out include files until it fails to compile is not a good idea.
The driver then may compile with one architecture but fail with another.

+enum da9063_adc {
+ DA9063_CHAN_VSYS = DA9063_ADC_MUX_VSYS,
+ DA9063_CHAN_ADCIN1 = DA9063_ADC_MUX_ADCIN1,
+ DA9063_CHAN_ADCIN2 = DA9063_ADC_MUX_ADCIN2,
+ DA9063_CHAN_ADCIN3 = DA9063_ADC_MUX_ADCIN3,
+ DA9063_CHAN_TJUNC = DA9063_ADC_MUX_T_SENSE,
+ DA9063_CHAN_VBBAT = DA9063_ADC_MUX_VBBAT,
+ DA9063_CHAN_LDO_G1 = DA9063_ADC_MUX_LDO_G1,
+ DA9063_CHAN_LDO_G2 = DA9063_ADC_MUX_LDO_G2,
+ DA9063_CHAN_LDO_G3 = DA9063_ADC_MUX_LDO_G3

Many of the above defines are not used. Do you plan a follow-up commit
to use them ? Otherwise please drop unused defines.

I'm not sure (would like to, but for this I think I need to add
devicetree controls, and I am not sure how this should look like), so in
doubt I will drop them from this patch set.

There are also #defines in this patchset related to ADCIN channels,
which are hence unused. Should I also drop these ? In my (short)
experience, there seem to regularly be unused #defines in headers, so I
left them be.


Please drop them. They can be added back as needed.

+struct da9063_hwmon {
+ struct da9063 *da9063;
+ struct mutex hwmon_mutex;
+ struct completion adc_ready;
+ signed char tjunc_offset;

I am curious: 'char' implies 'signed'. Any reason for using 'signed' ?

We are again getting into my "erring on the status-quo side" as this
comes from the original patchset. My reading of this is that using a
char for holding an integer is somewhat unusual (as opposed to a holding
character) and the non-essential "signed" would signal that there is
something maybe a bit unusual going on here.

But this all becomes moot with your next point:

Also, note that on most architectures the resulting code is more complex
when using 'char' instead of 'int'. This is seen easily by compiling the
driver for arm64: Replacing the above 'signed char' with 'int' reduces
code size by 32 bytes.

This is reaching outside of the parts of C that I am comfortable in:
what is the correct way to sign-extend an 8-bits value into an int ?

In regmap_read() fills "int *value" with the read bytes, not
sign-extended (which looks sane):
ret = regmap_read(da9063->regmap, DA9063_REG_T_OFFSET, &tmp);
dev_warn(&pdev->dev, "da9063_hwmon_probe offset=%d\n", tmp);
->
[Jul11 01:53] da9063-hwmon da9063-hwmon: da9063_hwmon_probe offset=247

My naïve "(int)((char)tmp)" produces 247, instead of -9.
"(int)hwmon->tjunc_offset" does sign-extend, but going through an
intermediate variable looks overcomplex to me (for a tiny definition of
"overcomplex").
I see sign_extend*() functions but seeing their bitshift arguments I
feel these may not be intended for such no-shift-needed use.

Sorry, you lost me there. Those functions use shift operations to move
the sign bit where it belongs, and the shift back retains the sign bit.
What is wrong with that ?

Also:

int main()
{
unsigned int v1 = 247;
int v2;
int v3;

v2 = (char)v1;
v3 = (int)((char)v1);

printf("%d %d %d\n", v1, v2, v3);

return 0;
}

produces 247 -9 -9, so I don't fully follow what your (int)((char)tmp)
looks like. Besides, the outer typecast is not necessary.
In general,
v2 = (char)v1;
is good enough since the char -> int conversion is automatic (and sign_extend32()
is indeed overkill for this situation).

Either case, please feel free to use 'char' if you like; I won't insist
on a change to int. However, please drop the "signed".

+static int da9063_adc_manual_read(struct da9063_hwmon *hwmon, int channel)
[...]
+ ret = wait_for_completion_timeout(&hwmon->adc_ready,
+ msecs_to_jiffies(100));
+ reinit_completion(&hwmon->adc_ready);

This is unusual. Normally I see init_completion() or reinit_completion()
ahead of calls to wait functions.

If a request timed out and an interrupt happened after the timeout,
the next request would return immediately with the previous result,
since complete() would be called on the re-initialized completion
handler. That doesn't seem to be correct to me.

To confirm my comprehension: the issue is that if somehow the irq
handler fires outside a conversion request, it will mark adf_ready as
completed, so wait_for_completion_timeout() will immediately return.
The follow-up consequences being that the ADC, having just been asked
to do a new conversion, will still be busy, leading to a spurious
ETIMEDOUT.
Is this correct ?

I don't know what exactly happens. Why don't you try by setting the
timeout to a really small value, one that _does_ result in this
situation ?

With this in mind, could the time from regmap_update_bits() to
{,re}init_completion() be longer than the time the IRQ could take to
trigger ? In which case adc_ready would be marked as completed, then it
would be cleared, and wait_for_completion_timeout() would reach its
timeout despite the conversion being already over.

... but what I do know is that I don't understand why you insist having
the reinit_completion() _after_ the wait call. The above doesn't explain
that. I see it as potentially racy, so if you want to keep the code as-is
I'll want to see a comment in the code explaining why it has to be done
this way, and how it is not racy.

Also: a return value of 0 from wait_for_completion_timeout()
already indicates a timeout. The subsequent regmap_read() to check
if the conversion is complete should not be necessary. If it does,
it really indicates a non-timeout problem. Are there situations
(other than the race condition I am concerned about) where
an interrupt can happen but DA9063_ADC_MAN is still set ?

If so, I think this needs a comment in the code, especially since there
is an extra i2c read which, after all, is costly. Also, this should
probably generate a different error code (-EIO, maybe), and
-ETIMEDOUT should be the result of wait_for_completion_timeout()
returning 0.

+static int da9063_hwmon_probe(struct platform_device *pdev)
[...]
+ ret = regmap_read(da9063->regmap, DA9063_REG_T_OFFSET, &tmp);
+ if (ret < 0) {
+ tmp = 0;
+ dev_warn(&pdev->dev,
+ "Temperature trimming value cannot be read (defaulting to 0)\n");
+ }
+ hwmon->tjunc_offset = (signed char) tmp;

Nit: Unnecessary space after typecast (checkpatch --strict would tell you).

Also, I am curious: The temperature offset is a standard hwmon attribute.
Is it an oversight to not report it, or is it on purpose ?

It was an oversight, but now that I know about it I am not sure this
should be used: the offset is in chip-internal ADC units, so userland
cannot make use of it for temperature measurement unless the raw ADC
output is also exposed.

One would not report the raw value, but convert it to m°C.

Is this attribute used to give an insight as to how the chip was
calibrated in-factory or otherwise good practice to expose ?


It can be exposed as read-only value if it is a read-only
register/value. Ultimately it is your call if it is indeed read-only.
It still provides some value in that case, but not much.

Thanks,
Guenter