Re: oops in tracepoint_update_probe_range() (was: Re: [oops -tip]: x86 AMD 64)

From: Frederic Weisbecker
Date: Wed Mar 18 2009 - 13:34:11 EST


On Wed, Mar 18, 2009 at 06:27:50PM +0100, Ingo Molnar wrote:
>
> * Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
>
> > On Wed, Mar 18, 2009 at 10:18:56PM +0530, Jaswinder Singh Rajput wrote:
> > > On Wed, 2009-03-18 at 17:35 +0100, Ingo Molnar wrote:
> > > > * Jaswinder Singh Rajput <jaswinder@xxxxxxxxxx> wrote:
> > > >
> > > > > Good: f4c3c4cdb1de232
> > > > > Bad : 1e08816af0bc345
> > > > >
> > > > > Config:
> > > > > http://userweb.kernel.org/~jaswinder/oops_20090318/config-hpdv5-tip-bad-20090318
> > > > >
> > > > > oops:
> > > > > http://userweb.kernel.org/~jaswinder/oops_20090318/oops_page1.jpg
> > > > > http://userweb.kernel.org/~jaswinder/oops_20090318/oops_page2.jpg
> > > > > http://userweb.kernel.org/~jaswinder/oops_20090318/oops_page3.jpg
> > > > > http://userweb.kernel.org/~jaswinder/oops_20090318/oops_page4.jpg
> > > > >
> > > > > <freeze>
> > > >
> > > > Steve, Frederic - the crashes above are in:
> > > >
> > > > tracepoint_update_probe_range()
> > > >
> > > > in a modular kernel apparently.
> > > >
> > >
> > > This fixed the oops for me, Is this looks OK to you:
> > >
> > > Subject: [PATCH] x86: tracepoint.c fix oops
> > >
> > > BUG: unable to handle kernel NULL pointer dereference at (null)
> > > IP: [<ffffffff8107d4de>] tracepoint_update_probe_range+0x1f/0x9b
> > > PGD 13d5fb067 PUD 13d688067 PMD 0
> > > Oops: 0000 [#1] SMP
> > >
> > > Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@xxxxxxxxx>
> > > ---
> > > kernel/tracepoint.c | 3 +++
> > > 1 files changed, 3 insertions(+), 0 deletions(-)
> > >
> > > diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
> > > index 7960274..80d1353 100644
> > > --- a/kernel/tracepoint.c
> > > +++ b/kernel/tracepoint.c
> > > @@ -280,6 +280,8 @@ void tracepoint_update_probe_range(struct tracepoint *begin,
> > >
> > > mutex_lock(&tracepoints_mutex);
> > > for (iter = begin; iter < end; iter++) {
> > > + if (!iter)
> > > + goto out;
> > > mark_entry = get_tracepoint(iter->name);
> > > if (mark_entry) {
> > > set_tracepoint(&mark_entry, iter,
> > > @@ -288,6 +290,7 @@ void tracepoint_update_probe_range(struct tracepoint *begin,
> > > disable_tracepoint(iter);
> > > }
> > > }
> > > +out:
> > > mutex_unlock(&tracepoints_mutex);
> > > }
> >
> >
> > Ok, it should fix the crash.
> > But I think the real problem remains: iter is not supposed to point to NULL,
> > this is a section range:
> >
> > tracepoint_update_probe_range(__start___tracepoints,
> > __stop___tracepoints);
> >
> > It seems to mean that the section is empty.
>
> OK - so checking for !iter on entry and emitting a WARN_ONCE in that
> case ought to change the crash for a warning, right?
>
> Ingo


The real bug is elsewhere: Jaswinder's config has one user of
tracepoints, which is blktrace, so this section is not supposed to be
empty.

But I guess it is possible to trigger this via a randconfig by enabling
CONFIG_TRACEPOINTS without any user of it, because when a module
is loaded, the section is checked without any NULL safety check.

Thus I think Jaswinder's patch could be picked up without a WARN or
anything, because this situation can happen in a normal randconfig case.
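
For illustration only, here is a rough user-space sketch of the kind of
guard being discussed: bail out early when the range start is NULL,
otherwise walk [begin, end). The struct layout, the `state` field and
the name update_probe_range_sketch() are assumptions made up for this
example; this is not the kernel code, and it models neither
tracepoints_mutex nor WARN_ONCE.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct tracepoint. */
struct tracepoint {
	const char *name;
	int state;
};

/*
 * Walk the range [begin, end) and mark every entry as updated.
 * Returns the number of entries visited. A NULL begin is treated as
 * an empty section and nothing is dereferenced.
 */
static int update_probe_range_sketch(struct tracepoint *begin,
				     struct tracepoint *end)
{
	struct tracepoint *iter;
	int visited = 0;

	if (!begin)	/* empty section: nothing to update */
		return 0;

	for (iter = begin; iter < end; iter++) {
		iter->state = 1;	/* stands in for set/disable_tracepoint() */
		visited++;
	}
	return visited;
}
```

Calling it with (NULL, NULL) returns 0 without touching memory, while a
two-element array passed as (tps, tps + 2) visits both entries.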

Concerning the real bug (blktrace is enabled here, so this NULL is
weird), I'm building Jaswinder's config; maybe I will find something...
