Re: perf/tracepoint: another fuzzer generated lockup

From: Peter Zijlstra
Date: Mon Nov 11 2013 - 10:54:26 EST


On Mon, Nov 11, 2013 at 01:44:19PM +0100, Ingo Molnar wrote:
>
> * Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
>
> > > That said, I'm not sure what kernel you're running, but there were
> > > some issues with time-keeping hereabouts, but more importantly that
> > > second timing includes the printk() call of the first -- so that's
> > > always going to be fucked.
> >
> > It's a recent tip:master. So the delta debug printout is certainly
> > buggy. Meanwhile, these lockups only happen with Vince's selftests, and
> > they trigger a lot of these NMI-too-long issues, or maybe it's the
> > other way round :)...
> >
> > I'm trying to narrow down the issue; let's hope the lockup is not
> > actually due to printk itself.
>
> I'd _very_ strongly suggest to not include the printk() overhead in the
> execution time delta! What that function wants to report is pure NMI
> execution overhead, not problem reporting overhead.
>
> That way any large number reported there is always a bug somewhere,
> somehow.

-ENOPATCH :-)

You'll find that there are two levels of measuring NMI latency, and the
outer one will invariably include the reporting of the inner one; fixing
that is going to be hideously ugly.
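
To make the nesting concrete, here is a minimal userspace sketch, not the
actual kernel code. Everything in it is hypothetical: the simulated clock
fake_now_ns, the helper names, and the 50000 ns cost charged for a report
are made up for illustration. It only shows why the outer delta picks up
the inner level's reporting even when the inner delta itself is clean.

```c
#include <stdint.h>

/* Hypothetical simulated clock in nanoseconds; not a kernel API. */
static uint64_t fake_now_ns;

/* Assumed cost of one report; stands in for a slow printk(). */
#define REPORT_COST_NS 50000ULL

/* Advance the simulated clock by ns. */
static void spend(uint64_t ns)
{
	fake_now_ns += ns;
}

/* Stand-in for the reporting call: it only burns simulated time. */
static void report(uint64_t delta_ns)
{
	(void)delta_ns;
	spend(REPORT_COST_NS);
}

/*
 * Inner level: time one handler and report if it ran too long.
 * The delta is taken before report(), so the inner number is clean.
 */
static uint64_t inner_handler(uint64_t work_ns, uint64_t threshold_ns)
{
	uint64_t start = fake_now_ns;
	uint64_t delta;

	spend(work_ns);
	delta = fake_now_ns - start;
	if (delta > threshold_ns)
		report(delta);
	return delta;
}

/*
 * Outer level: time the whole NMI. This window wraps the inner
 * handler, so it also covers the inner level's report().
 */
static uint64_t outer_nmi(uint64_t work_ns, uint64_t threshold_ns)
{
	uint64_t start = fake_now_ns;

	inner_handler(work_ns, threshold_ns);
	return fake_now_ns - start;
}
```

With a threshold of 1000 ns, outer_nmi(500, 1000) returns just the work
time, while outer_nmi(2000, 1000) returns the work time plus
REPORT_COST_NS, because the inner report lands inside the outer window.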

That said, I would very strongly suggest tearing that printk() out of the
NMI path; it's just waiting to wreck someone's machine :-)
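
One possible shape for that, as a userspace sketch under stated
assumptions rather than a patch: the NMI-side code only stores the
offending delta, and some later, ordinary context does the printing. The
names pending_delta_ns, nmi_note_slow() and flush_pending_report() are
hypothetical, and C11 atomics with printf() stand in for whatever
deferral mechanism and reporting call the kernel would really use.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical pending-report slot; 0 means nothing to report. */
static _Atomic uint64_t pending_delta_ns;

/*
 * Called from the NMI-like path: no printing here, only a store.
 * A newer slow event simply overwrites an unreported older one.
 */
static void nmi_note_slow(uint64_t delta_ns)
{
	atomic_store(&pending_delta_ns, delta_ns);
}

/*
 * Called later from ordinary context: take the pending value,
 * clear the slot, and do the slow reporting here.
 * Returns the value reported, or 0 if nothing was pending.
 */
static uint64_t flush_pending_report(void)
{
	uint64_t delta = atomic_exchange(&pending_delta_ns, 0);

	if (delta)
		printf("NMI handler took too long: %llu ns\n",
		       (unsigned long long)delta);
	return delta;
}
```

The single slot keeps the NMI side down to one store; the price is that
only the most recent slow event gets reported.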