Re: Unified tracing buffer

From: Steven Rostedt
Date: Mon Sep 22 2008 - 21:39:55 EST




On Mon, 22 Sep 2008, Roland Dreier wrote:

> > Because all it tells you is the ordering of the atomic increment, not of
> > the caller. The atomic increment is not related to all the other ops that
> > the code that you trace actually does in any shape or form, and so the
> > ordering of the trace doesn't actually imply anything for the ordering of
> > the operations you are tracing!
>
> This reminds me of a naive question that occurred to me while we were
> discussing this at KS. Namely, what does "ordering" mean for events?
>
> An example I'm all too familiar with is the lack of ordering of MMIO on
> big SGI systems -- if you forget an mmiowb(), then two CPUs taking a
> spinlock and doing writel() inside the spinlock and then dropping the
> spinlock (which should be enough to "order" things) might see the
> writel() reach the final device "out of order" because the write has to
> travel through a routed system fabric.
>
> Just like Einstein said, it really seems to me that the order of things
> depends on your frame of reference.

In my logdev tracer (see http://rostedt.homelinux.com/logdev) I used an
atomic counter to keep "order". But what I tell people about what this
order means is that it is an order among multiple traces on multiple CPUs.
That is, if you have:

   CPU 1              CPU 2
   trace_point_a      trace_point_c
   trace_point_b      trace_point_d

If you see in the trace:

trace_point_a
trace_point_c

You really do not know which happened first, simply because trace_point_c
could have been hit first but, because of interrupts, NMIs and whatnot,
trace_point_a could easily have been recorded first. But to me,
trace_points are more like memory barriers.

If I see:

trace_point_c
trace_point_a
trace_point_b
trace_point_d

I can assume that everything before trace_point_c happened before
everything after trace_point_a, and that everything before trace_point_b
happened before everything after trace_point_d.

One cannot assume that the trace points themselves are in order. But you
can assume that the things outside the trace points are, as with memory
barriers. I have found lots of race conditions with my logdev, and it was
this "memory barrier" likeness that made it possible to see the races.
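
To make the atomic-counter ordering above concrete, here is a minimal
userspace sketch in C11. It is not logdev's actual code: the names
trace_record(), trace_merge(), struct cpu_buffer and the fixed buffer sizes
are all hypothetical and chosen only for illustration. Each "CPU" appends to
its own buffer and stamps the event with a value taken from one shared atomic
counter; merging the buffers by that stamp gives the kind of trace shown
above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS    2   /* hypothetical: two CPUs, as in the example above */
#define MAX_EVENTS 16  /* hypothetical per-CPU buffer size */

/* One recorded event, stamped with a global sequence number. */
struct trace_event {
    unsigned long seq;
    int cpu;
    const char *name;
};

/* Each CPU writes only to its own buffer. */
struct cpu_buffer {
    struct trace_event ev[MAX_EVENTS];
    size_t len;
};

/* The only state shared between CPUs is this counter. */
static atomic_ulong trace_seq;
static struct cpu_buffer buffers[NR_CPUS];

/*
 * Record an event. The atomic increment is what orders the trace: it
 * says nothing about when the trace point was "hit", only that code
 * before an earlier increment ran before code after a later one.
 */
static void trace_record(int cpu, const char *name)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    struct cpu_buffer *b = &buffers[cpu];
    assert(b->len < MAX_EVENTS);

    b->ev[b->len].seq = atomic_fetch_add(&trace_seq, 1);
    b->ev[b->len].cpu = cpu;
    b->ev[b->len].name = name;
    b->len++;
}

/* Compare two events by sequence number, for qsort(). */
static int cmp_seq(const void *a, const void *b)
{
    const struct trace_event *x = a, *y = b;
    return (x->seq > y->seq) - (x->seq < y->seq);
}

/* Copy all per-CPU events into out[] and sort them by sequence number. */
static size_t trace_merge(struct trace_event *out, size_t cap)
{
    size_t n = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        for (size_t i = 0; i < buffers[cpu].len; i++) {
            assert(n < cap);
            out[n++] = buffers[cpu].ev[i];
        }
    }
    qsort(out, n, sizeof(*out), cmp_seq);
    return n;
}
```

Calling trace_record() for trace_point_c, trace_point_a, trace_point_b and
trace_point_d in that order, and then calling trace_merge(), returns the
events in counter order. A per-CPU timestamp that is not synchronized across
CPUs cannot stand in for the shared counter here, since sorting on it would
not reflect any real cross-CPU ordering.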

Unfortunately, if you are using an out of sync TSC, you lose even the
memory barrier characteristic of the trace.

-- Steve
