[Ltrace-devel] Getting prototypes from debug information

Sun May 11 12:21:06 UTC 2014

The following review is about cb3f6d450f0fcc2c2d.

You addressed the *-at-variable in function prototypes, but this should
apply everywhere.  I see a number of cases like this:
+	struct arg_type_info* argument_type = NULL;

This should look like:
+	struct arg_type_info *argument_type = NULL;

+static bool get_die_numeric(uint64_t *result,
+			    Dwarf_Die *die, unsigned int attr_name)
+{
+	Dwarf_Attribute attr ;
[... much later ...]
+	if ( array_type == NULL ) {
[... later still ...]
+	if ( function_name_dup == NULL ) {

Please drop the extra spaces.

+static bool get_integer_base_type(enum arg_type *type, int byte_size,
+				  bool is_signed)
+{
+	switch (byte_size) {
+	case sizeof(char):
+		*type = ARGTYPE_CHAR;
+		return true;
+
+	case sizeof(short):
+		*type = is_signed ? ARGTYPE_SHORT : ARGTYPE_USHORT;
+		return true;
+
+	case sizeof(int):
+		*type = is_signed ? ARGTYPE_INT : ARGTYPE_UINT;
+		return true;
+
+	case sizeof(long):
+		*type = is_signed ? ARGTYPE_LONG : ARGTYPE_ULONG;
+		return true;
+
+	default:
+		return false;
+	}
+}

This won't work on 32-bit arches, where int and long are both 4 bytes
wide.  Additionally, and I didn't realize this before, this test also
assumes that the traced binary comes from the same architecture as
ltrace, which generally doesn't hold (you can trace an i386 binary with
a x86_64 ltrace).  Unfortunately, ltrace's type system doesn't make it
very easy to do this the right way.

What we would very much like to have is ARGTYPE_I32, ARGTYPE_U64, and
the like (or likely just ARGTYPE_INTEGRAL with an attribute indicating a
width, encoding, and the like).  TODO has this to say about the issue:

** Support fixed-width types
   Really we should keep everything as {u,}int{8,16,32,64} internally,
   and have long, short and others be translated to one of those
   according to architecture rules.  Maybe this could be achieved by a
   per-arch config file with typedefs such as:

   | typedef ulong = uint8_t; |

Alas, that's not how this is done currently.  So in the interim, I think
you can assume that ARGTYPE_{U,}SHORT is 2 bytes and ARGTYPE_{U,}INT is
4 bytes--that's so on all arches that ltrace supports.  The only iffy
type is the long.  Essentially for now, you can assume that 8-bytes are
ARGTYPE_LONG.  That will be wrong for long long on 32-bit arches, but
what can we do.

+	while ((die = dwfl_nextcu(dwfl, die, &bias)) != NULL) {

This doesn't seem right.  dwfl_nextcu should go through all CU's of the
whole Dwfl--i.e. if you add two Dwarf-based libraries to a process, it
would iterate the former one twice.  Or maybe I don't quite grasp how
your code works.  Anyway, it looks as if you should keep around
(probably in struct library) the Dwfl_Module that dwfl_report_elf
returns.  Then you can use either dwfl_module_nextcu to iterate just the
bits that you care about.

Alternatively, instead of dwfl_module_nextcu, you could use
dwfl_module_getdwarf and then dwarf_nextcu.  The advantage is that the
latter call tells you address_size that you can use to distinguish
between 32-bit and 64-bit processes, and determine ARGTYPE_LONG's size.
But that would only serve to determine whether we can support 8-byte
integral types.  But that's assuming ARGTYPE_LONG really is 8-byte on an
arch that has 64-bit addresses.  I'm not sure, but x32 I think is a
4-byte address architecture with 8-byte longs, so it's not a silver
bullet anyway.  This really needs the fixes cited above in TODO to be
fully correct.

+	int64_t encoding;
+	if (!get_die_numeric((uint64_t*)&encoding, die, DW_AT_encoding))

Is there a reason not to just use uint64_t for encoding?

+	if (encoding == DW_ATE_float) {
+		switch (byte_size) {
+		case sizeof(float):
+			return ARGTYPE_FLOAT;
+
+		case sizeof(double):
+			return ARGTYPE_DOUBLE;

Hm, this is similar to the above.  Again, we'll have to assume that
ARGTYPE_FLOAT is 4 bytes and ARGTYPE_DOUBLE is 8 bytes.  Luckily I think
this is the case on all arches that ltrace supports.

+		complain(&die, "member: 0x%02x", dwarf_tag(&die));

Just a tip: %#02x includes the initial 0x.

+		type_struct_add(result, member_type,
+				newly_allocated_member_type);

This can fail.

+static struct arg_type_info *get_type(int *newly_allocated_result,
+				      Dwarf_Die *type_die,
+				      struct protolib *plib,
+				      struct dict *type_dieoffset_hash)
[...]
+		dict_insert(type_dieoffset_hash, &die_offset, &result);

This as well.  I understand there's no point checking this in
CLEANUP_AND_RETURN_ERROR and when just storing an ARGTYPE_VOID, but
otherwise please do.

The biggest issue right now is the switch over sizeofs, as I very much
suspect that would break compilation on 32-bit machines.  Otherwise it's
solid and with the few nits addressed, good to go.

Thanks,
PM