Re: tracer_init_tracefs really slow

From: Lucas Stach
Date: Mon Dec 07 2020 - 11:25:58 EST


Hi Steven,

On Thursday, 03.12.2020 at 21:18 -0500, Steven Rostedt wrote:
> Sorry for the really late reply, but I received this while I was on
> vacation, and my backlog was so big when I got back that I left most of
> it unread. :-/ And to make matters worse, my out-of-office script
> wasn't working, to let people know I was on vacation.

No problem, I already figured that this might have fallen through the
cracks. It's also not really a high prio issue for us.

> On Mon, 07 Sep 2020 18:16:52 +0200
> Lucas Stach <l.stach@xxxxxxxxxxxxxx> wrote:
>
> > Hi all,
> >
> > one of my colleagues has taken a look at device boot times and stumbled
> > across a pretty big amount of kernel boot time being spent in
> > tracer_init_tracefs(). On this particular i.MX6Q based device the
> > kernel spends more than 1 second in this function, which is a
> > significant amount of the overall kernel initialization time. While
> > this machine is no rocket with its Cortex A9 @ 800MHz, the amount of
> > CPU time being used there is pretty irritating.
> >
> > Specifically the issue lies within trace_event_eval_update where ~1100
> > trace_event_calls get updated with ~500 trace_eval_maps. I haven't had
> > a chance yet to dig any deeper or try to understand more of what's
> > going on there, but I wanted to get the issue out there in case anyone
> > has some cycles to spare to help us along.
>
> OK, that makes sense. The macro TRACE_DEFINE_ENUM() will make a mapping
> of enums into their values. This is needed because if an enum is used
> in TP_printk() of a TRACE_EVENT(), the name of the enum is passed to
> user space. The enum name is useless to user space, so this function
> will scan the strings that are exported to user space and convert the
> enum names to their values.
>
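To make sure I understand the substitution described above, here is a
rough userspace sketch in plain C. It is not the kernel's code: struct
eval_map and substitute_enum() are names I made up for illustration, and
it handles only a single mapping against a single format string.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one enum-name-to-value mapping. */
struct eval_map {
	const char *name;
	unsigned long value;
};

/* True if c may appear inside a C identifier. */
static int is_ident_char(int c)
{
	return isalnum((unsigned char)c) || c == '_';
}

/*
 * Copy fmt into out, replacing every whole-word occurrence of
 * map->name with the decimal form of map->value.
 * Returns 0 on success, -1 if out is too small or the name is empty.
 */
static int substitute_enum(const char *fmt, const struct eval_map *map,
			   char *out, size_t outlen)
{
	size_t nlen = strlen(map->name);
	size_t o = 0;
	const char *p = fmt;

	if (nlen == 0 || outlen == 0)
		return -1;

	while (*p) {
		int at_word_start = (p == fmt) || !is_ident_char(p[-1]);

		if (at_word_start && strncmp(p, map->name, nlen) == 0 &&
		    !is_ident_char(p[nlen])) {
			int n = snprintf(out + o, outlen - o, "%lu", map->value);

			if (n < 0 || (size_t)n >= outlen - o)
				return -1;
			o += (size_t)n;
			p += nlen;
			continue;
		}
		if (o + 1 >= outlen)
			return -1;
		out[o++] = *p++;
	}
	out[o] = '\0';
	return 0;
}
```

With a mapping of { "MY_ENUM_A", 3 }, the string "x == MY_ENUM_A ? 1 : 0"
becomes "x == 3 ? 1 : 0", while "MY_ENUM_AB" is left alone. Doing a scan
like this for every map against every event string is where I would
expect the time to go.
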
> >
> > The obvious questions for now are:
> > 1. Why is this function so damn expensive (at least on this wimpy ARM
> > machine)? and
>
> Well, it's doing a string substitution for thousands of events.
>
>
> > 2. Could any of this be done asynchronously, to not block the kernel in
> > early init?
>
> Yes :-)
>
> We could make a thread that does this, that the init wakes up and runs,
> letting the kernel move forward. Would you like to make that patch
> or shall I?

I guess you are much more likely to come up with a correct patch, as
I'm not really clear yet on when we would need to synchronize this
thread, to make sure things are available before they get used by
something. I likely won't have time in the near future to read enough
code in this particular spot of the kernel.
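Just to illustrate the synchronization question, here is a rough
userspace sketch using plain pthreads. This is not kernel code and not a
proposed patch; all names (start_eval_update(), wait_for_eval_update(),
formats_updated) are made up. The idea is only that init kicks off the
work and returns, and whoever first needs the result waits for it.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical state standing in for the updated event strings. */
static int formats_updated;
static pthread_t update_thread;
static int update_started;

/* Stand-in for the expensive update pass. */
static void *eval_update_worker(void *arg)
{
	(void)arg;
	formats_updated = 1;
	return NULL;
}

/* Called from "init": start the work and return immediately. */
static int start_eval_update(void)
{
	int ret = pthread_create(&update_thread, NULL,
				 eval_update_worker, NULL);

	if (ret == 0)
		update_started = 1;
	return ret;
}

/*
 * Called by a consumer before it reads the strings: block until the
 * worker has finished. Assumes a single consumer at a time; several
 * concurrent callers would need a lock around this.
 * Returns 0 once the update is done, nonzero otherwise.
 */
static int wait_for_eval_update(void)
{
	if (update_started) {
		int ret = pthread_join(update_thread, NULL);

		if (ret)
			return ret;
		update_started = 0;
	}
	return formats_updated ? 0 : -1;
}
```

The part I can't judge is where the equivalent of
wait_for_eval_update() would have to be called in the kernel, i.e.
which paths can reach those strings before the thread is done.
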

I would be happy to test a patch on our wimpy machines, though. :)

Regards,
Lucas