Re: [PATCH v3 3/4] tpm_tis: Disable interrupts if interrupt storm detected

From: Thomas Gleixner
Date: Sun Dec 06 2020 - 14:29:18 EST


Jerry,

On Fri, Dec 04 2020 at 18:43, Jerry Snitselaar wrote:
> @@ -715,9 +717,23 @@ static irqreturn_t tis_int_handler(int dummy, void *dev_id)
> {
> struct tpm_chip *chip = dev_id;
> struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
> + static bool check_storm = true;
> + static unsigned int check_start;

So this assumes that there can't be two TPMs which is probably true, but
everything else in this driver has stuff in tpm_tis_data per device.

> u32 interrupt;
> int i, rc;
>
> + if (unlikely(check_storm)) {
> + if (!check_start) {
> + check_start = jiffies_to_msecs(jiffies);

Yuck. I had to read that twice to figure out that it's correct vs. the
truncation of the result to unsigned int. You can spare that conversion
by simply doing

unsigned long end_of_check = jiffies + HZ / 2;

and then the check becomes

time_before(jiffies, end_of_check)

> + } else if ((kstat_irqs(priv->irq) > 1000) &&
> + (jiffies_to_msecs(jiffies) - check_start < 500)) {

I assume you can't call disable_irq_nosync() here, but shouldn't this
shut up the interrupt at the TPM level right here?

> + check_storm = false;
> + schedule_work(&priv->storm_work);
> + } else if (jiffies_to_msecs(jiffies) - check_start >= 500) {
> + check_storm = false;
> + }
> + }

So back to kstat_irqs(). As this needs two extra variables anyway:

init()
priv->irq_check = 1;
priv->end_check = 0;

isr()
if (unlikely(priv->irq_check)) {
if (!priv->end_check) {
priv->end_check = jiffies + HZ / 2;
} else if (time_before(jiffies, priv->end_check)) {
if (priv->irq_check++ > 1000)
schedule_work(...);
} else {
priv->irq_check = 0;
}
}

Hmm? I still need to see an argument for an kstat_irqs() export being
superior.

Though I wonder whether such an infrastructure should be provided in the
irq core. Let me think about it.

Just as a side note. I was looking at tpm_tis_probe_irq_single() and
that function is leaking the interrupt request if any of the checks
afterwards fails, except for the final interrupt probe check which does
a cleanup. That means on fail before that the interrupt handler stays
requested up to the point where the module is removed. If that's a
shared interrupt and some other device is active on the same line, then
each interrupt from that device will call into the TPM code. Something
like the below is needed.

Also the X86 autoprobe mechanism is interesting:

if (IS_ENABLED(CONFIG_X86))
for (i = 3; i <= 15; i++)
if (!tpm_tis_probe_irq_single(chip, intmask, 0, i))
return;

The third argument is 'flags' which is handed to request_irq(). So that
won't ever be able to probe a shared interrupt. But if an interrupt
number > 0 is handed to tpm_tis_core_init() the interrupt is requested
with IRQF_SHARED. Same issue when the chip has an interrupt number in
the register. It's also requested exclusive which is pretty likely
to fail on ancient x86 machines.

The vast amount of comments didn't help to figure out what the reasoning
is.

Thanks,

tglx
---
drivers/char/tpm/tpm_tis_core.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)

--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -782,26 +782,26 @@ static int tpm_tis_probe_irq_single(stru
rc = tpm_tis_read8(priv, TPM_INT_VECTOR(priv->locality),
&original_int_vec);
if (rc < 0)
- return rc;
+ goto fail;

rc = tpm_tis_write8(priv, TPM_INT_VECTOR(priv->locality), irq);
if (rc < 0)
- return rc;
+ goto fail;

rc = tpm_tis_read32(priv, TPM_INT_STATUS(priv->locality), &int_status);
if (rc < 0)
- return rc;
+ goto fail;

/* Clear all existing */
rc = tpm_tis_write32(priv, TPM_INT_STATUS(priv->locality), int_status);
if (rc < 0)
- return rc;
+ goto fail;

/* Turn on */
rc = tpm_tis_write32(priv, TPM_INT_ENABLE(priv->locality),
intmask | TPM_GLOBAL_INT_ENABLE);
if (rc < 0)
- return rc;
+ goto fail;

priv->irq_tested = false;

@@ -825,6 +825,10 @@ static int tpm_tis_probe_irq_single(stru
}

return 0;
+
+fail:
+ disable_interrupts(chip);
+ return rc;
}

/* Try to find the IRQ the TPM is using. This is for legacy x86 systems that