Re: hit a KASan bug related to Perf during stress test

From: Jiri Olsa
Date: Mon Oct 24 2016 - 08:04:19 EST


On Mon, Oct 24, 2016 at 01:29:45PM +0200, Peter Zijlstra wrote:
> On Mon, Oct 24, 2016 at 01:27:32PM +0200, Peter Zijlstra wrote:
> > On Mon, Oct 24, 2016 at 01:15:27PM +0200, Oleg Nesterov wrote:
> > > How about the trivial fix below?
> > >
> > > Oleg.
> > >
> > > --- x/kernel/events/core.c
> > > +++ x/kernel/events/core.c
> > > @@ -1257,7 +1257,7 @@ static u32 perf_event_pid(struct perf_ev
> > >  	if (event->parent)
> > >  		event = event->parent;
> > >
> > > -	return task_tgid_nr_ns(p, event->ns);
> > > +	return pid_alive(p) ? task_tgid_nr_ns(p, event->ns) : 0;
> > >  }
> >
> > Also, now we get a (few) sample(s) with a different pid:tid than prior
> > samples and not matching the sched_switch() events.
> >
> > I can imagine that being somewhat confusing for people/tools.
> >
> > Acme/Jolsa, any idea if that will bugger perf-report?
>
> Hurm, then again, I imagine that after unhash_process the PID/TID could
> be instantly re-used and then we're still confused.

Sounds bad. I haven't checked the related pid_alive() code,
but shouldn't we already have gotten the EXIT event in this case?
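
For reference, a rough user-space model of what the proposed check would
do. It assumes pid_alive() only reports whether the task is still attached
to its struct pid; I have not checked that against the source. Every
mock_* name below is made up for illustration and is not kernel API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins; these are NOT the kernel's types. */
struct mock_pid {
	uint32_t tgid_nr;
};

struct mock_task {
	struct mock_pid *pid;	/* NULL once the task has been unhashed */
};

/* Assumed behaviour of pid_alive(): true while a struct pid is attached. */
static bool mock_pid_alive(const struct mock_task *p)
{
	return p->pid != NULL;
}

/* Stand-in for task_tgid_nr_ns(); only valid while the pid is attached. */
static uint32_t mock_task_tgid_nr(const struct mock_task *p)
{
	assert(p->pid != NULL);
	return p->pid->tgid_nr;
}

/* Model of the patched perf_event_pid(): report 0 for an unhashed task. */
static uint32_t mock_perf_event_pid(const struct mock_task *p)
{
	return mock_pid_alive(p) ? mock_task_tgid_nr(p) : 0;
}

/* Stand-in for the detach that happens on exit (unhash_process). */
static void mock_unhash(struct mock_task *p)
{
	p->pid = NULL;
}
```

In this model a sample taken after mock_unhash() reports pid 0 instead of
dereferencing a detached pid, which is the pid change Peter points out
above. The model says nothing about the number being re-used afterwards.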

jirka