Re: [RFC] adding of TGID to ftrace output

From: Steven Rostedt
Date: Thu May 11 2017 - 11:02:50 EST


On Wed, 10 May 2017 20:10:06 -0700
Joel Fernandes <joelaf@xxxxxxxxxx> wrote:

> Hi Steven,
> Thanks for your quick reply.
>
> On Wed, May 10, 2017 at 6:28 PM, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> > On Wed, 10 May 2017 16:04:55 -0700
> > Joel Fernandes <joelaf@xxxxxxxxxx> wrote:
> >
> >> Hi Steven,
> >>
> >> Can we add TGID information along with PID to ftrace output?
> >>
> >> Something like:
> >> # _-----=> irqs-off
> >> # / _----=> need-resched
> >> # | / _---=> hardirq/softirq
> >> # || / _--=> preempt-depth
> >> # ||| / delay
> >> # TASK-PID:TGID CPU# |||| TIMESTAMP FUNCTION
> >> # | | | | |||| | |
> >> bash-1977:1977 [000] .... 17284.993652: sys_close <-
> >>
> >> Instead of:
> >> # _-----=> irqs-off
> >> # / _----=> need-resched
> >> # | / _---=> hardirq/softirq
> >> # || / _--=> preempt-depth
> >> # ||| / delay
> >> # TASK-PID CPU# |||| TIMESTAMP FUNCTION
> >> # | | | |||| | |
> >> bash-1977 [000] .... 17284.993652: sys_close <-
> >
> > Well the thing about this is that it adds more overhead to each event
> > that is mostly not needed by users.
>
> There will not be any overhead if we can retrieve the tgid at print
> time and not trace time. I was thinking something like
> trace_find_cmdline approach would work great. The only draw back of
> something like this is if a thread is added and removed to a thread
> group during a trace, then we might not have all tgid information
> available. I think this is the draw back of the trace_find_cmdline
> code as well where if a thread doesn't exist at trace output time,
> then we end up printing "<...>" as the task comm? At the end of this
> email, I make a suggestion to fix this.

Hmm, interesting idea.

>
> >> Currently in android, we inject tgid into each trace event at the end
> >> of the trace just so we can group threads under a process in our trace
> >> viewer.
> >>
> >> The other option we're thinking off is we can retrieve tgid during the
> >> trace output (reading trace file from debugfs/tracefs) from the
> >> task_struct and then have ftrace output it that way.
> >
> > Hmm, is there any events that show the TGID currently? If we have that,
> > you can use another tracing instance to read from (one that wont get
> > overridden by other events), and keep track of what TGID processes
> > have. The timestamp could be used to keep everything in sync.
>
> There is a sched_process_fork event that can show ppid but not tgid. I
> was thinking we modify that to show tgid as well, that could work.
> However the draw back doing this is we can only see when something is
> added or removed to the thread group during tracing. If something was
> already a part of the thread group _before_ tracing, then we don't
> have the information.
>
> >> Anyway, having this will really simplify things and make it more
> >> reliable (we run ps -T at the end of the trace right now, but its
> >> unreliable because threads might have exited by then). We also have to
> >> call getpid() and inject pid into each userspace trace_marker write
> >> just so we can do this grouping.
> >
> > Perhaps we need to have a way to record it via another event. Fork or
> > sched switch?
>
> Same reasoning above, that separate events are not enough to have this
> information as such events may not fire for the duration of the
> tracing for the threads we're interested in.

Well, you could always just record a ps as you start tracing, and then
you would have the tgids in your tool for those that are not created
during the trace. The ps would only need to be run once, and then the
tracing of fork will get the rest of your tgids during the trace.


>
> > Perhaps we can add a trigger to record extra data like this, similar to
> > the stack trace trigger does now.
>
> Are you talking about 1 event trigger another event to have more
> information? That would have the side-effect of having a separate
> event for each event that we need the tgid information for, thus
> causing lots of space wastage in the right buffer, right?

Yeah, that can add a lot more wasted space.

>
> To fix the suggestion I made in the very beginning of this email, I
> suggest a combination of trace_find_cmdline type of approach to save
> the tgid of all current threads available, and then using
> trace_sched_fork to find out what we missed (such as threads that were
> added and removed.) I think this approach will be robust and not
> affect any of the existing ftrace users (since we can make it a print
> option). Thoughts?
>
> Update: actually I find that someone already wrote an out-of-tree
> patch for Android long time back during 3.10 kernel days that added a
> 'print-tgid' option that saved tgid in the trace_save_cmdline function
> and then find it later during output time:
> https://android.googlesource.com/kernel/msm.git/+/0c5363f6a0a89952b0d0072ff4e4c3bd042add9a%5E%21/
> It will be nice if we can have a more upstream ready version of this
> patch or some other approach that you think works best. Thanks!

Well, make that patch upstream friendly and post it to LMKL (Ccing me
of course). And we'll talk about that.

-- Steve