Re: sys_write() racy for multi-threaded append?

From: Michael K. Edwards
Date: Wed Mar 14 2007 - 16:10:13 EST


Thanks Alan, this is really helpful -- I'll see if I can figure out
kprobes. It is not immediately obvious to me how to use them to trace
function calls in userspace, but I'll read up.

On 3/13/07, Alan Cox <alan@xxxxxxxxxxxxxxxxxxx> wrote:
> But on that note -- do you have any idea how one might get ltrace to
> work on a multi-threaded program, or how one might enhance it to
> instrument function calls from one shared library to another? Or

I don't know a vast amount about ARM ELF user space so no.

This isn't ARM-specific, although it's certainly an ELF issue. ltrace
uses libelf to inspect the program binary and replace calls to shared
library functions with breakpoints. When the program hits one of
these breakpoints, ltrace inspects the function arguments, puts the
function call back in place of the breakpoint, single steps through
it, puts the breakpoint back, does some sort of sleight of hand to
ensure that it will break again on function return, and continues.
When it breaks again on function return, ltrace inspects the return
value and then continues.

I suppose I should wrap my head around the kwatch stuff and see if
there's a way of doing this without intrusive manipulation of the text
segment, which seems to be the thing preventing ltrace from being
extensible to multi-threaded programs. The ELF handling code in
ltrace is sufficiently opaque to me that I doubt I could make it work
on function calls inside shared libraries without completely rewriting
it -- at which point I'd rather go the extra mile with kprobes, I
think.

> better yet, can you advise me on how to induce gdbserver to stream
> traces of library/syscall entry/exits for all the threads in a
> process? And then how to cram it down into the kernel so I don't take

One way to do this is to use kprobes which will do exactly what you want
as it builds a kernel module. Doesn't currently run on ARM afaik.

There doesn't seem to be an arch/arm/kernel/kprobes.c, and I'm not
sure I have the skill to write one. On the other hand, ltrace manages
something similar (through libelf), and I can trace ltrace's ptrace
activity with strace, so maybe I can clone that. :-)

> the hit for an MMU context switch every time I hit a syscall or

Not easily with gdbstubs as you've got to talk to something to decide how
to log the data and proceed. If you stick it kernel side its a lot of
ugly new code and easier to port kprobes over, if you do it remotely as
gdbstubs intends it adds latencies and screws all your timings.

Sure does. I do have gdbserver working remotely but it's not as
useful as I hoped because real-world multi-threaded code is so full of
latent races that it dies instantly under gdb. Unless, of course,
that multi-threaded code was systematically developed using gdb and
qemu and other techniques of flushing out racy designs before their
implementation is entrenched. Not typical of the embedded world, I'm
sorry to say.

gdbstubs is also not terribly SMP aware and for low level work its
sometimes easier to have on gdb per processor if you can get your brain
around it.

That's a trick I don't know. What do you do, fire up the target
process, put the whole thing to sleep with a SIGSTOP, and then attach
a separate gdb to each thread after they've been migrated and locked
down to the destination CPU?

Cheers,
- Michael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/