Re: [PATCH v3] genirq: add support for warning on long-running IRQ handlers

From: Jiri Slaby
Date: Thu Jul 24 2025 - 01:31:11 EST


On 24. 07. 25, 7:18, Jiri Slaby wrote:
> On 23. 07. 25, 20:28, Wladislav Wiebe wrote:
>> Introduce a mechanism to detect and warn about prolonged IRQ handlers.
>> With a new command-line parameter (irqhandler.duration_warn_us=),
>> users can configure the duration threshold in microseconds when a warning
>> in such format should be emitted:
>>
>> "[CPU14] long duration of IRQ[159:bad_irq_handler [long_irq]], took: 1330 us"
>>
>> The implementation uses local_clock() to measure the execution duration of the
>> generic IRQ per-CPU event handler.
> ...
>> +static inline void irqhandler_duration_check(u64 ts_start, unsigned int irq,
>> +                         const struct irqaction *action)
>> +{
>> +    /* Approx. conversion to microseconds */
>> +    u64 delta_us = (local_clock() - ts_start) >> 10;
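
For illustration, a minimal userspace sketch of the conversion in the hunk above. The `>> 10` divides by 1024 rather than 1000, so the result comes out roughly 2.3% low; converting the threshold to nanoseconds once leaves the hot path with a plain comparison. All helper names here are hypothetical and not taken from the patch, and `NSEC_PER_USEC` is redefined locally with the kernel's value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_USEC 1000ULL	/* same value as the kernel constant */

/* Approximation used in the hunk: divides by 1024, so it under-reports. */
static inline uint64_t ns_to_us_shift(uint64_t delta_ns)
{
	return delta_ns >> 10;
}

/* Exact conversion, at the cost of a 64-bit division. */
static inline uint64_t ns_to_us_div(uint64_t delta_ns)
{
	return delta_ns / NSEC_PER_USEC;
}

/* Hypothetical helper: convert the user-supplied threshold once, at setup. */
static inline uint64_t threshold_us_to_ns(uint64_t threshold_us)
{
	assert(threshold_us <= UINT64_MAX / NSEC_PER_USEC);	/* no overflow */
	return threshold_us * NSEC_PER_USEC;
}

/* Hot path: compare in nanoseconds, with no shift and no division. */
static inline bool duration_exceeds(uint64_t delta_ns, uint64_t threshold_ns)
{
	return delta_ns > threshold_ns;
}
```

With the 1330 us sample from the changelog (1330000 ns), the shift yields 1298 while the division yields 1330. Assuming a strict greater-than comparison, a 1300 us threshold would therefore be missed by the shifted value but caught by the nanosecond comparison.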

> Is this a microoptimization -- have you measured what speedup does it bring? IOW is it worth it instead of cleaner "/ NSEC_PER_USEC"?
>
> Or instead, you could store the diff in irqhandler_duration_threshold_ns (mind that "_ns") and avoid the shift and div completely.
>
> And what about the wrap? Don't you need abs_diff()?

Not that ^^^: abs_diff() won't work here, so it would have to be something else. But if I am counting correctly, a u64 nanosecond counter wraps only after about 584 years if counted from 0. Well, for native/tsc, "Intel guarantees that the time-stamp counter will not wraparound within 10 years after being reset". I have no idea what local_clock() returns under virtualization. This is not my call to decide, though.
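
The arithmetic behind the 584-year figure, and one reading of why an absolute difference does not help across a wrap, can be sketched in userspace C. This assumes a u64 nanosecond counter starting at 0 and 365-day years; `abs_diff_u64()` is a local stand-in written for this example, not the kernel helper itself.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Whole 365-day years until a u64 nanosecond counter starting at 0 wraps. */
static inline uint64_t u64_ns_wrap_years(void)
{
	return UINT64_MAX / NSEC_PER_SEC / (365ULL * 24 * 3600);
}

/* Plain unsigned subtraction is modular, so it survives a single wrap. */
static inline uint64_t elapsed_ns(uint64_t start, uint64_t end)
{
	return end - start;
}

/* Local stand-in for an absolute-difference helper. */
static inline uint64_t abs_diff_u64(uint64_t a, uint64_t b)
{
	return a > b ? a - b : b - a;
}
```

If the counter wraps between a start of `UINT64_MAX - 99` and an end of `400`, `elapsed_ns()` returns the true 500 ns, whereas `abs_diff_u64()` returns `UINT64_MAX - 499`, which is nowhere near the real duration.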

thanks,
--
js
suse labs