Re: [PATCH 4/11] LTTng-core 0.5.108 : core

From: Mathieu Desnoyers
Date: Thu Sep 14 2006 - 11:47:58 EST


* Martin Waitz (tali@xxxxxxxxxxxxxx) wrote:
> hoi :)
>
> On Wed, Sep 13, 2006 at 11:43:08PM -0400, Mathieu Desnoyers wrote:
> > +int ltt_module_register(enum ltt_module_function name, void *function,
> > + struct module *owner)
> > +{
> > + int ret = 0;
> > +
> > + /* Protect these operations by disallowing them when tracing is
> > + * active */
> > + if(ltt_traces.num_active_traces) {
> > + ret = -EBUSY;
> > + goto end;
> > + }
>
> what would happen otherwise?
> can it happen that someone enables tracing between this check and
> the rest of the function?

The ltt-statedump module was still in my FIXME list. Let's address it.

In _ltt_trace_start, we have a try_module_get(ltt_statedump_owner) to make sure
the module does not vanish when we try to call its functions. Then, in
ltt_trace_start, we have a call to ltt_statedump_functor(trace), which uses
functions in ltt-statedump (the code of this module is stripped from
lttng-core). However, if ltt-statedump is not loaded when the try_module_get
is first done, and only then is the functor is called, the module could
vanish while the functor is executed.

Note that the ltt_traces.num_active_traces modifications only happen when the
ltt_traces_sem mutex is taken.

However, this scenario is possible (which is bad) :

CPU 0 CPU 1
ltt_trace_start ltt_trace_stop
_ltt_trace_start : inc num_active_traces
_ltt_trace_stop : dec
num_active_traces
call statedump functor
unload ltt-statedump

A possible solution to this problem would be to increment the ltt-statedump
reference count just around the function call :
in ltt_trace_start :

if(!try_module_get(ltt_statedump_owner)) {
error handling...
} else {
ltt_statedump_functor(trace);
module_put(ltt_statedump_owner);
}

And inside the ltt-statedump module, we need to make sure the kthread that is
created exits before the functor returns. The side effect of this approach is
that the lttctl control program will block until the end of state dump upon
trace start. As it is not a problem, I will fix the code accordingly : remove
the kthread.

Now for the ltt-filter module (which is not implemented yet) : as its function
pointer will be called in the tracing critical path, we cannot afford to
increment the module reference counter each time. This is why we have to combine
making sure that the num_active_traces count is 0 and waiting for each tracing
code to finish (all uses of ltt-filter are surrounded by preempt_disable : the
waiting is done by synchronize_sched() while the ltt_traces_sem is taken, so no
one can increment the num_active_traces).

The ltt-filter-control module (which is not implemented yet) is not a problem :
it is surrounded by try_module_get/module_put and does not depend on the traces
being active or not.

The ltt-relay module is also protected by a mechanism similar to ltt-filter. It
is necessary because the function pointer is on the critical path.

So, to answer your question : the if(ltt_traces.num_active_traces) check is
only necessary to make sure that the filter module is not unloaded when a
trace is active. However, adding a simple synchronize_sched() to
ltt_module_unregister will fix this issue and add the ability to
register/unregister a filter module when tracing is active.

I will remove this superfluous check in the next version.


> > + new_trace->transport = transport;
> > + new_trace->ops = &transport->ops;
> > +
> > + err = -new_trace->ops->create_dirs(new_trace);
> ^ typo or intentional?
>
>

ltt_trace_create has a stardard of positive error values when
ltt_relay_create_dirs has negative error values, without any need to do so. Good
catch :) It will be fixed in the next version.


Thanks for the comments!

Mathieu


OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

Attachment: signature.asc
Description: Digital signature