[PATCH 1/2] perf tools: fix handling of zero-length symbols.

From: Chris Phlipot
Date: Sat May 07 2016 - 05:17:46 EST


This change fixes symbols__find() so that it can find symbols of length
zero (where start == end).

The current code has two problems:
- symbols__find() cannot find any symbol of length zero. When
start == end, the "ip >= s->end" test is true for ip == s->start, so the
search moves to the right child instead of returning the symbol.
- The db-export framework explicitly creates zero-length symbols at
locations where no symbol currently exists.
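
The match condition can be illustrated in isolation. The sketch below is
not perf code: struct sym_range and both helper names are hypothetical
stand-ins, and only the start/end comparison mirrors what symbols__find()
does at each node. It assumes the end address is exclusive.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for perf's struct symbol. */
struct sym_range {
	uint64_t start;
	uint64_t end;	/* exclusive; start == end means zero length */
};

/*
 * Match condition implied by the unpatched comparisons:
 * start <= ip < end. No ip satisfies this when start == end.
 */
static bool range_contains(const struct sym_range *s, uint64_t ip)
{
	return ip >= s->start && ip < s->end;
}

/* Same condition with an exact-address match for zero-length ranges. */
static bool range_contains_fixed(const struct sym_range *s, uint64_t ip)
{
	if (s->start == s->end)
		return ip == s->start;
	return range_contains(s, ip);
}
```

For a range with start == end == 0x2000, range_contains() rejects 0x2000
while range_contains_fixed() accepts it; ranges of non-zero length behave
the same under both helpers.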

Together, these two behaviors lead to a sequence like the following:

1. An addr_location is created for a sample, but its symbol cannot be
resolved.
2. db-export creates an "unknown" symbol of length zero at that address
and inserts it into the dso.
3. A new sample comes in at the same address, but symbols__find() cannot
find the zero-length symbol, so the sample is still unresolved.
4. db-export sees that the symbol is unresolved and allocates a duplicate
symbol, even though it already did this in step 2.

This repeats for every sample at an address without symbol information,
so a very large number of duplicate "unknown" symbols is allocated.
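
The duplication can be reproduced with a toy model. This is only a sketch
under stated assumptions: struct model_dso, model_find() and
model_export() are hypothetical names, a flat array with a linear scan
replaces the rb-tree, and the "patched" flag switches between the old and
new match conditions. It is not the db-export implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_SYMS 64	/* arbitrary cap for the toy model */

/* Hypothetical stand-ins for struct symbol and a dso's symbol table. */
struct model_sym {
	uint64_t start;
	uint64_t end;	/* exclusive; start == end means zero length */
};

struct model_dso {
	struct model_sym syms[MODEL_MAX_SYMS];
	size_t nr;
};

/* Linear lookup using the same match conditions as the tree search. */
static const struct model_sym *model_find(const struct model_dso *d,
					  uint64_t ip, int patched)
{
	for (size_t i = 0; i < d->nr; i++) {
		const struct model_sym *s = &d->syms[i];

		if (patched && ip == s->start && s->start == s->end)
			return s;
		if (ip >= s->start && ip < s->end)
			return s;
	}
	return NULL;
}

/*
 * Feed nr_samples samples at one unresolved address and return how many
 * zero-length placeholder symbols were created along the way.
 */
static size_t model_export(struct model_dso *d, uint64_t ip,
			   size_t nr_samples, int patched)
{
	size_t created = 0;

	for (size_t i = 0; i < nr_samples && d->nr < MODEL_MAX_SYMS; i++) {
		if (model_find(d, ip, patched))
			continue;	/* resolved: reuse the existing symbol */
		d->syms[d->nr++] = (struct model_sym){ ip, ip };
		created++;
	}
	return created;
}
```

With an empty model_dso and ten samples at the same address,
model_export() creates ten placeholders when patched is 0 and one when
patched is 1.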

The effect of this fix can be observed by comparing the contents of an
exported database (generated with scripts/python/export-to-postgresql.py)
before and after the change.

Ex.
BEFORE THE CHANGE:
example_db=# select count(*) from symbols;
count
--------
900213
(1 row)

example_db=# select count(*) from symbols where symbols.name='unknown';
count
--------
897355
(1 row)

example_db=# select count(*) from symbols where symbols.name!='unknown';
count
-------
2858
(1 row)

AFTER THE CHANGE:

example_db=# select count(*) from symbols;
count
-------
25217
(1 row)

example_db=# select count(*) from symbols where name='unknown';
count
-------
22359
(1 row)

example_db=# select count(*) from symbols where name!='unknown';
count
-------
2858
(1 row)

The number of named symbols is unchanged at 2858. Only the number of
"unknown" symbols drops, from 897355 to 22359.

Signed-off-by: Chris Phlipot <cphlipot0@xxxxxxxxx>
---
tools/perf/util/symbol.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 415c4f6..e42bf9a 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -299,7 +299,9 @@ static struct symbol *symbols__find(struct rb_root *symbols, u64 ip)
 	while (n) {
 		struct symbol *s = rb_entry(n, struct symbol, rb_node);
 
-		if (ip < s->start)
+		if (ip == s->start && s->start == s->end)
+			return s;
+		else if (ip < s->start)
 			n = n->rb_left;
 		else if (ip >= s->end)
 			n = n->rb_right;
--
2.7.4