Re: [PATCH 1/1] perf: Set build-id using build-id header on new mmap records

From: James Clark
Date: Wed Mar 02 2022 - 11:20:08 EST




On 27/02/2022 22:50, Jiri Olsa wrote:
> On Thu, Feb 24, 2022 at 05:19:55PM +0000, James Clark wrote:
>> MMAP records that occur after the build-id header is parsed do not have
>> their build-id set even if the filename matches an entry from the
>> header. Set the build-id on these dsos as long as the MMAP record
>> doesn't have its own build-id set.
>>
>> This fixes an issue with off target analysis where the local version of
>> a dso is loaded rather than one from ~/.debug via a build-id.
>
> nice catch :)
>
>>
>> Reported-by: Denis Nikitin <denik@xxxxxxxxxxxx>
>> Signed-off-by: James Clark <james.clark@xxxxxxx>
>> ---
>> tools/perf/util/dso.h | 1 +
>> tools/perf/util/header.c | 1 +
>> tools/perf/util/map.c | 16 ++++++++++++++--
>> 3 files changed, 16 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
>> index 011da3924fc1..3a9fd4d389b5 100644
>> --- a/tools/perf/util/dso.h
>> +++ b/tools/perf/util/dso.h
>> @@ -167,6 +167,7 @@ struct dso {
>> enum dso_load_errno load_errno;
>> u8 adjust_symbols:1;
>> u8 has_build_id:1;
>> + u8 header_build_id:1;
>> u8 has_srcline:1;
>> u8 hit:1;
>> u8 annotate_warned:1;
>> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
>> index 6da12e522edc..571d73d4f976 100644
>> --- a/tools/perf/util/header.c
>> +++ b/tools/perf/util/header.c
>> @@ -2200,6 +2200,7 @@ static int __event_process_build_id(struct perf_record_header_build_id *bev,
>>
>> build_id__init(&bid, bev->data, size);
>> dso__set_build_id(dso, &bid);
>> + dso->header_build_id = 1;
>>
>> if (dso_space != DSO_SPACE__USER) {
>> struct kmod_path m = { .name = NULL, };
>> diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
>> index 1803d3887afe..4ae91e491e23 100644
>> --- a/tools/perf/util/map.c
>> +++ b/tools/perf/util/map.c
>> @@ -127,7 +127,7 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
>>
>> if (map != NULL) {
>> char newfilename[PATH_MAX];
>> - struct dso *dso;
>> + struct dso *dso, *header_bid_dso;
>> int anon, no_dso, vdso, android;
>>
>> android = is_android_lib(filename);
>> @@ -185,7 +185,19 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
>>
>> if (build_id__is_defined(bid))
>> dso__set_build_id(dso, bid);
>> -
>> + else {
>
> nit please add { } to the if clause as well
>
>> + /*
>> + * If the mmap event had no build ID, search for an existing dso from the
>> + * build ID header by name. Otherwise only the dso loaded at the time of
>> + * reading the header will have the build ID set and all future mmaps will
>> + * have it missing.
>> + */
>> + header_bid_dso = __dsos__find(&machine->dsos, filename, false);
>
> is this 'perf top' safe? I think dso should be added in the
> same thread, but please check and add comment why we don't
> need locking in here

It seems there are multiple synthesize_threads_workers using the same machine->dsos object, so
I think locking is needed.

At first I thought of doing this:

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 4ae91e491e23..b87b81e3d41c 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -192,7 +192,9 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
* reading the header will have the build ID set and all future mmaps will
* have it missing.
*/
+ down_read(&machine->dsos.lock);
header_bid_dso = __dsos__find(&machine->dsos, filename, false);
+ up_read(&machine->dsos.lock);
if (header_bid_dso && header_bid_dso->header_build_id) {
dso__set_build_id(dso, &header_bid_dso->bid);
dso->header_build_id = 1;

But then I wondered why a write lock isn't needed all the way from machine__findnew_dso_id() to
dso__put(). At the moment there are writes to the dso such as dso__set_loaded(), dso->nsinfo = nsi and
dso__set_build_id(), so another thread could find the dso in a partially constructed state.

I'm not sure whether this is already an issue without my patch, but with it, other threads would have to
find the dso with header_build_id already set to 1. Otherwise the lookup above would skip it and the
build-id would not be copied.

Extending the write lock beyond machine__findnew_dso_id() is difficult because it already
releases the lock before it returns. Should machine__findnew_dso_id() be changed so that it takes all the
arguments needed to construct the dso inside the lock?
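
As a rough illustration of that last option, here is a standalone toy sketch, not perf code.
Every name in it (toy_dso, toy_dsos, toy_dsos__findnew() and so on) is hypothetical, and it
uses a plain pthread mutex as a simplification of the dsos lock. The only point it tries to
show is that the new entry is fully filled in before it is linked into the list, so a
concurrent lookup can never observe it half constructed.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define TOY_BID_MAX 20

/* Hypothetical stand-ins for illustration; these are not the perf structures. */
struct toy_dso {
	struct toy_dso *next;
	char *name;
	unsigned char bid[TOY_BID_MAX];
	size_t bid_size;
	unsigned int header_build_id:1;
};

struct toy_dsos {
	pthread_mutex_t lock;	/* assumption: a mutex instead of a read/write lock */
	struct toy_dso *head;
};

static void toy_dsos__init(struct toy_dsos *dsos)
{
	pthread_mutex_init(&dsos->lock, NULL);
	dsos->head = NULL;
}

/* Lookup by name. The caller must already hold dsos->lock. */
static struct toy_dso *toy_dsos__find_locked(struct toy_dsos *dsos, const char *name)
{
	struct toy_dso *pos;

	for (pos = dsos->head; pos; pos = pos->next) {
		if (!strcmp(pos->name, name))
			return pos;
	}
	return NULL;
}

/*
 * Find an entry by name, or create one. Everything needed to construct the
 * entry is passed in, so it is complete before it becomes visible to others.
 */
static struct toy_dso *toy_dsos__findnew(struct toy_dsos *dsos, const char *name,
					 const unsigned char *bid, size_t bid_size,
					 int from_header)
{
	struct toy_dso *dso;
	size_t len = strlen(name) + 1;

	if (bid_size > TOY_BID_MAX)
		return NULL;

	pthread_mutex_lock(&dsos->lock);
	dso = toy_dsos__find_locked(dsos, name);
	if (dso)
		goto out;

	dso = calloc(1, sizeof(*dso));
	if (!dso)
		goto out;
	dso->name = malloc(len);
	if (!dso->name) {
		free(dso);
		dso = NULL;
		goto out;
	}
	memcpy(dso->name, name, len);

	if (bid_size) {
		memcpy(dso->bid, bid, bid_size);
		dso->bid_size = bid_size;
		dso->header_build_id = from_header ? 1 : 0;
	}

	/* Publish only after every field has been written. */
	dso->next = dsos->head;
	dsos->head = dso;
out:
	pthread_mutex_unlock(&dsos->lock);
	return dso;
}
```

The toy ignores reference counting and entry removal, both of which the real code would have
to handle before handing the pointer out after the lock is dropped.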

James

>
> thanks,
> jirka
>
>> + if (header_bid_dso && header_bid_dso->header_build_id) {
>> + dso__set_build_id(dso, &header_bid_dso->bid);
>> + dso->header_build_id = 1;
>> + }
>> + }
>> dso__put(dso);
>> }
>> return map;
>> --
>> 2.28.0
>>