Re: [PATCH 1/1] x86: fix text_poke

From: H. Peter Anvin
Date: Fri Apr 25 2008 - 18:42:58 EST


Mathieu Desnoyers wrote:

DWARF2 is capable of extracting information only when not optimized away
by the compiler. That's the whole point of markers: liveness is good in
this case because we make sure the variable is there, not that it
*might* be there. The latter case might be good enough for a debugger,
but not for a production system tracer.


That's what I address with the last paragraph of the email.


The builtin expect will take care to put the instructions out of the
hot paths and therefore leave them out of the icache with gcc
-freorder-blocks (in -O2). The only addition to the frequently used
icache is, in this case, the 5 bytes jump, 2 bytes mov, 2 bytes test and
2 (or 6) bytes conditional branch, for a total of 11 bytes for small
functions and 15 bytes for functions which require near jumps.
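
A minimal sketch of the branch layout described above, for illustration only. The names (probe_enabled, probe_slow_path, traced_add) are hypothetical and are not the kernel's marker API. It assumes gcc or clang, where __builtin_expect marks the probe branch as unlikely so the compiler can place the handler call outside the hot path. The byte counts quoted above add up as 5 + 2 + 2 + 2 = 11 with a short conditional branch and 5 + 2 + 2 + 6 = 15 with a near one.

```c
/* Hypothetical names throughout; this is not the kernel marker API. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical enable flag, flipped at runtime when tracing is wanted. */
static volatile unsigned char probe_enabled;

/* Counts how often the slow path ran, so the effect is observable. */
static unsigned long probe_hits;

/* Cold, out-of-line handler: the compiler is asked not to inline it. */
static void __attribute__((noinline, cold)) probe_slow_path(int value)
{
    (void)value;    /* a real tracer would record the value here */
    probe_hits++;
}

int traced_add(int a, int b)
{
    int sum = a + b;

    /* Hot path: load the flag, test it, conditional branch. */
    if (unlikely(probe_enabled))
        probe_slow_path(sum);

    return sum;
}
```

With the flag clear, traced_add() only pays for the load, test and branch; the call itself sits on the unlikely side of the branch.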

Now, if a breakpoint is too expensive, one can do exactly the same trick with a naked call instruction, with a higher icache impact in the unused case (five bytes instead of one or two). However, the key to low impact is to use the debugging information to recover state.

The runtime cost of a function call is bigger than the jump. I don't see
what this buys us.

You get zero instructions and five bytes of NOP in the non-taken case.

In the taken case, you move the whole thing out of line.

(Liveness at the probe point is still possible to enforce with this technique: give gcc a "g" read constraint as part of the probe instruction. That makes gcc ensure the information is *somewhere*. The debugging information will tell you where to pick it up from. Obviously, any time liveness is enforced you suffer a potential cost.)
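
A rough sketch of the "g" read constraint, under stated assumptions: PROBE_KEEP_LIVE and scale are hypothetical names, and the asm template is left empty so the example builds on any gcc or clang target. In the scheme discussed here the template would hold the probe instruction itself (for example the five-byte NOP), which is not shown.

```c
/*
 * Hypothetical macro: the "g" input constraint tells the compiler that
 * the asm statement reads `val`, so the value must exist in a register,
 * in memory, or as an immediate at this point. The empty template emits
 * no instruction; a real probe would place its NOP or call here.
 */
#define PROBE_KEEP_LIVE(val) __asm__ __volatile__("" : : "g"(val))

int scale(int x)
{
    int intermediate = x * 3;

    /* Probe point: `intermediate` is forced to be live here. */
    PROBE_KEEP_LIVE(intermediate);

    return intermediate + 1;
}
```

The constraint does not say where the value lives, only that it lives somewhere; locating it is left to the debugging information.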

It could be possible to do so. However, passing a variable argument list
to a marker is rather more flexible than those inline assembly
constraints. And you are still tied to the variable names and offer no
abstraction between the kernel implementation and the conceptual name
associated to a traced variable.

"Rather more flexible?" Surely you're joking, Mr. Feynman? There is no difference, none, nada.

Furthermore, your capture stub compiler, or trace data extractor, can do any kind of mapping it pleases; so I'm utterly confused what you're talking about "still tied to variable names."

-hpa