[Ltrace-devel] Getting prototypes from debug information

Petr Machata pmachata at redhat.com
Fri May 2 12:05:10 UTC 2014


Dima Kogan <lists at dima.secretsauce.net> writes:

> First off, I'm on Debian running geeqie 1:1.1-8+b1. Here geeqie is
> multi-threaded, and I guess without -f geeqie only looks at the first
> thread. All the libjpeg calls apparently come from the second thread, so
> by default ltrace sees nothing. Running with -f makes it work better.
> This is a bit puzzling to the end-user, but I guess this is fine, since
> this is what strace does. (Does strace work this way for threads, by the
> way, or just forks?)

Basically we'd like to always follow pthread_create, and follow fork
only if -f is given.  Right now ltrace follows nothing unless -f is
given, and then it follows everything.  (Really it always follows
everything, and detaches from newly-created children unless -f is given.)

The thing is, in Linux, we can choose to follow only {v,}forks by
setting PTRACE_O_TRACE{V,}FORK.  We can also choose to follow all clones
by setting PTRACE_O_TRACECLONE, but that captures pthread_create and
fork alike (all of these are built on top of the underlying clone system
call), as well as direct clone calls.
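
A minimal sketch of how the options named above could be requested
through ptrace(PTRACE_SETOPTIONS).  The helper names
build_trace_options and apply_trace_options are made up for
illustration and are not ltrace's code; the tracee is assumed to be
already attached and stopped.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Hypothetical helper: build the PTRACE_SETOPTIONS mask. */
long build_trace_options(bool follow_forks, bool follow_clones)
{
    long opts = 0;

    if (follow_forks)
        opts |= PTRACE_O_TRACEFORK | PTRACE_O_TRACEVFORK;
    if (follow_clones)
        opts |= PTRACE_O_TRACECLONE;
    return opts;
}

/* Hypothetical helper: apply the mask to a stopped tracee.
 * Returns the ptrace() result (-1 on error, errno set). */
long apply_trace_options(pid_t pid, long opts)
{
    return ptrace(PTRACE_SETOPTIONS, pid, NULL, (void *)opts);
}
```

A tracer would typically call apply_trace_options once, right after
the first stop of the tracee.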

So what would make sense would be to tweak the current logic to only
detach if what happened was an actual fork, which we can tell from the
parameters of the system call.  That might provide a more useful user
experience.  Tracing only a single thread is problematic anyway, because
_all_ the threads will hit the breakpoints that ltrace sets, so
pre-emptively tracing all threads is what you generally need.
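
As a rough sketch of "tell from the parameters of the system call":
the flags argument of clone could be inspected along these lines.
classify_clone and should_detach are hypothetical names, not existing
ltrace functions, and the rules used here (CLONE_THREAD means a new
thread, CLONE_VFORK means vfork, an exit signal of SIGCHLD with
neither flag means fork) are an assumption about how one might draw
the line.

```c
#include <linux/sched.h>   /* CLONE_THREAD, CLONE_VFORK, CSIGNAL */
#include <signal.h>        /* SIGCHLD */

enum child_kind { CHILD_THREAD, CHILD_VFORK, CHILD_FORK, CHILD_OTHER };

/* Hypothetical: classify a new child from the clone flags argument. */
enum child_kind classify_clone(unsigned long flags)
{
    if (flags & CLONE_THREAD)
        return CHILD_THREAD;          /* e.g. pthread_create */
    if (flags & CLONE_VFORK)
        return CHILD_VFORK;
    if ((flags & CSIGNAL) == SIGCHLD) /* low byte is the exit signal */
        return CHILD_FORK;
    return CHILD_OTHER;               /* some other direct clone call */
}

/* Hypothetical policy: detach only from real forks, and only when
 * -f (follow_forks) was not given.  Threads are always kept. */
int should_detach(unsigned long flags, int follow_forks)
{
    enum child_kind kind = classify_clone(flags);

    return (kind == CHILD_FORK || kind == CHILD_VFORK) && !follow_forks;
}
```

With this policy a thread is never detached, whether or not -f was
given.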

I'll add that to the TODO.  I'll work more systematically on ltrace
later this year.  I'd like to add support for systemtap probes, and have
a couple more minor things to do as well, so I'll roll this into that
block of work.

> The command at this point is
>
>  ./ltrace -l 'libjpeg.so*' /usr/bin/geeqie /tmp/small.jpg
>
>  
> Second, running this way reports libjpeg->libjpeg calls correctly, but
> not geeqie->ltrace [PM: presumably libjpeg?] ones. So some libjpeg
> functions still get the wrong prototypes. For instance I see this:
>  geeqie->jpeg_read_header(0x7f23f62506c0, 1, 0, 1496 <unfinished ...>
>
> I looked into this a bit. import_DWARF_prototypes() was only called on
> libjpeg.so.8. jpeg_read_header() was indeed parsed correctly, the parsed
> data just wasn't used. Were we looking in the 'geeqie' plib instead of

Hmm, that's strange, it seems to work for me in a simplified test case:

-- hle1.c --
int jedna (void);
int dva (void);
int main (int argc, char *argv[]) { jedna (); return dva (); }

-- hle2.c --
int dva (void) { return 2; }
int jedna (void) { return dva (); }

$ gcc hle2.c -g -fpic -shared -o libhle2.so
$ gcc hle1.c -g -L. -Wl,-rpath,. -lhle2
$ ~/tmp/ltrace/build/ltrace -llibhle* ./a.out -e''
a.out->jedna( <unfinished ...>
libhle2.so->dva()                                   = 2
<... jedna resumed> )                               = 2
a.out->dva()                                        = 2
+++ exited (status 2) +++

It does however break in this case:

$ ~/tmp/ltrace/build/ltrace -e @MAIN ./a.out -e''
a.out->__libc_start_main([ "./a.out", "-e" ] <unfinished ...>
a.out->jedna(2, 0x7fffb731ae28, 0x7fffb731ae40, 0x400650)    = 2
a.out->dva(2, 0x7fffb731ae28, 0x7fffb731ae40, 0x400650)      = 2
+++ exited (status 2) +++

The problem here is probably that you reject loading of libhle2.so's
debug info because it doesn't match the tracing pattern.  But the
tracing pattern applies to PLT symbols, which will be resolved by
libhle2.so, and for that reason we need to use libhle2.so's debug info.

But that still doesn't explain your problem.  Having it reduced to a
simple case like the above (instead of geeqie and libjpeg) would
possibly allow us to see the problem clearly.

> the 'libjpeg.so.8' plib? If I run with
>
>  ./ltrace -x '@libjpeg.so*' /usr/bin/geeqie /tmp/small.jpg
>
> then way more than libjpeg.so is instrumented, and the application slows
> to a crawl. Is this right? Should the options preclude this? If I do

Yeah, because -x annotates all symbols defined by libjpeg.so directly,
so you also see intra-library calls, possibly static symbols (if you
have symbol tables available), etc.

> ./ltrace -x '@libjpeg.so*' -e '@libjpeg.so*' /usr/bin/geeqie /tmp/small.jpg
>
> then jpeg_read_header works correctly. Is this a bug in the new code?

In a way.  I think the problem is that you pre-filter the debug info
loading.  You should always load the debug info, because you can never
rule out that the loaded library resolves one of the -e symbols.
Only if options.plt_filter is NULL (i.e. for -L) can we assume that the
library will never be necessary.  My mistake for not realizing this
sooner, I know you asked about this explicitly.
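
Roughly, the decision could look like the sketch below.  Only the
plt_filter field name comes from the discussion above; the struct
layouts, should_load_debug_info, and the lib_matches_pattern
parameter (standing in for "this library itself matches a -l/-x
pattern") are assumptions for illustration, not the actual ltrace
code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for ltrace's real types; layouts are hypothetical. */
struct sketch_filter { const char *pattern; };
struct sketch_options { const struct sketch_filter *plt_filter; };

/* Decide whether to load a library's debug info.
 * With any PLT filter active, the library may end up resolving a
 * traced PLT symbol, so its prototypes may be needed even when the
 * library's own name does not match the tracing pattern. */
bool should_load_debug_info(const struct sketch_options *opts,
                            bool lib_matches_pattern)
{
    if (opts->plt_filter != NULL)
        return true;
    /* No PLT tracing at all: only load if the library itself is
     * being traced. */
    return lib_matches_pattern;
}
```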

> Also, regarding -l, -e and -x, I feel this is more confusing than it
> needs to be. As a user, I generally want either 'show me all calls into
> this library' or 'show me all calls into this function'. I guess this is
> -l and -x, except -x ends up instrumenting way more (see above). And
> then what's -e for?

-x is "show me what calls these symbols (including local calls)"
-e is "show me what calls these symbols (inter-library calls only)"
-l is "show me what calls into this library"

-e is useful if you care about everything that a certain library calls.
What ltrace does by default is -e *@MAIN--i.e. show all library calls
that the main binary makes.  -e* is "show me all inter-library calls in
the whole program".

-l works in the opposite direction--you care about calls into some
library.  It's as if you gave -e for each symbol that the given library
implements.

Thanks,
PM


