Re: [watchdog] combine nmi_watchdog and softlockup

From: Cyrill Gorcunov
Date: Tue Apr 06 2010 - 11:31:32 EST


On Tue, Apr 06, 2010 at 04:13:30PM +0200, Frederic Weisbecker wrote:
[...]
> > +static int watchdog_enable(int cpu)
> > +{
> > +	struct perf_event_attr *wd_attr;
> > +	struct perf_event *event = per_cpu(watchdog_ev, cpu);
> > +	struct task_struct *p = per_cpu(softlockup_watchdog, cpu);
> > +
> > +	/* is it already setup and enabled? */
> > +	if (event && event->state > PERF_EVENT_STATE_OFF)
> > +		goto out;
> > +
> > +	/* it is setup but not enabled */
> > +	if (event != NULL)
> > +		goto out_enable;
> > +
> > +	/* Try to register using hardware perf events first */
> > +	wd_attr = &wd_hw_attr;
> > +	wd_attr->sample_period = hw_nmi_get_sample_period();
> > +	event = perf_event_create_kernel_counter(wd_attr, cpu, -1, watchdog_overflow_callback);
> > +	if (!IS_ERR(event)) {
> > +		printk(KERN_INFO "NMI watchdog enabled, takes one hw-pmu counter.\n");
> > +		goto out_save;
> > +	}
> > +
> > +	/* hardware doesn't exist or not supported, fallback to software events */
> > +	printk(KERN_INFO "NMI watchdog: hardware not available, trying software events\n");
> > +	wd_attr = &wd_sw_attr;
> > +	wd_attr->sample_period = softlockup_thresh * NSEC_PER_SEC;
> > +	event = perf_event_create_kernel_counter(wd_attr, cpu, -1, watchdog_overflow_callback);
>
> I fear the cpu clock is not going to help you detect any hard lockups.
> If you're stuck in an interrupt or an irq-disabled loop, your cpu clock is
> not going to fire.
>

I guess it's not supposed to. For such cases only NMI irqs can help, and that
is what the hardware perf events are there for (/me needs to check whether we
program the APIC timer for anything like that). But the software event should
still help with other deadlocks. Or am I missing something?
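
To make the distinction concrete, here is a toy userspace model in C. It is
only a sketch, not kernel code and not the logic of the patch: every name in
it (sim_cpu, sim_watchdog, sim_period, sim_detects) is made up for
illustration. It assumes a maskable, timer-driven event is never delivered
while irqs are off, that an NMI is always delivered, and that the watchdog
flags a lockup when the watchdog thread made no progress since the last
sample.

```c
#include <stdbool.h>

/* Hypothetical model of one CPU; none of these names are kernel APIs. */
struct sim_cpu {
	bool irqs_disabled;		/* stuck in an irq-disabled loop */
	bool task_hogging;		/* a task never yields, irqs still on */
	unsigned long thread_runs;	/* progress made by the watchdog thread */
};

/* Hypothetical watchdog state for one event source. */
struct sim_watchdog {
	bool uses_nmi;			/* true: hw perf event delivered as NMI */
	unsigned long last_thread_runs;	/* progress seen at the last sample */
	bool lockup_seen;
};

/* One sample period elapses on the simulated CPU. */
void sim_period(struct sim_cpu *cpu, struct sim_watchdog *wd)
{
	/* The watchdog thread only gets to run on a healthy CPU. */
	if (!cpu->irqs_disabled && !cpu->task_hogging)
		cpu->thread_runs++;

	/* Assumption: a maskable event cannot fire while irqs are off. */
	if (!wd->uses_nmi && cpu->irqs_disabled)
		return;

	/* The event fired: did the watchdog thread make any progress? */
	if (cpu->thread_runs == wd->last_thread_runs)
		wd->lockup_seen = true;
	wd->last_thread_runs = cpu->thread_runs;
}

/* Run a few periods in a fixed CPU state and report whether a lockup is seen. */
bool sim_detects(bool uses_nmi, bool irqs_disabled, bool task_hogging,
		 int periods)
{
	struct sim_cpu cpu = {
		.irqs_disabled = irqs_disabled,
		.task_hogging = task_hogging,
	};
	struct sim_watchdog wd = { .uses_nmi = uses_nmi };
	int i;

	for (i = 0; i < periods; i++)
		sim_period(&cpu, &wd);
	return wd.lockup_seen;
}
```

In this model sim_detects(false, true, false, 5) stays false (the software
event never fires with irqs off), sim_detects(true, true, false, 5) reports
the lockup, and sim_detects(false, false, true, 5) still catches the
irqs-enabled case.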

-- Cyrill