Re: [PATCH v2] perf tools annotate: fix a crash when annotate the same symbol with 's' and 'T'

From: Li, Tianyou

Date: Sun Oct 19 2025 - 21:19:37 EST


Hi Namhyung,


On 10/19/2025 11:31 AM, Namhyung Kim wrote:
Hello,

On Fri, Oct 17, 2025 at 12:04:42AM +0800, Li, Tianyou wrote:
On 10/16/2025 11:18 PM, James Clark wrote:

On 16/10/2025 4:04 pm, Li, Tianyou wrote:
On 10/16/2025 9:06 PM, James Clark wrote:

On 16/10/2025 4:36 am, Li, Tianyou wrote:
Hi James,

Thanks for your time to review. Please see my comments inlined.

Regards,

Tianyou

On 10/16/2025 1:30 AM, James Clark wrote:

On 15/10/2025 6:20 pm, Tianyou Li wrote:
When perf report with annotation for a symbol, press
's' and 'T', then exit
the annotate browser. Once annotate the same symbol,
the annotate browser
will crash.

The browser.arch was required to be correctly updated when data type
feature was enabled by 'T'. Usually it was
initialized by symbol__annotate2
function. If a symbol has already been correctly
annotated at the first
time, it should not call the symbol__annotate2
function again, thus the
browser.arch will not get initialized. Then at the
second time to show the
annotate browser, the data type needs to be
displayed but the browser.arch
is empty.

Stack trace as below:

Perf: Segmentation fault
-------- backtrace --------
     #0 0x55d365 in ui__signal_backtrace setup.c:0
     #1 0x7f5ff1a3e930 in __restore_rt libc.so.6[3e930]
     #2 0x570f08 in arch__is perf[570f08]
     #3 0x562186 in annotate_get_insn_location perf[562186]
     #4 0x562626 in __hist_entry__get_data_type annotate.c:0
     #5 0x56476d in annotation_line__write perf[56476d]
     #6 0x54e2db in annotate_browser__write annotate.c:0
     #7 0x54d061 in ui_browser__list_head_refresh perf[54d061]
     #8 0x54dc9e in annotate_browser__refresh annotate.c:0
     #9 0x54c03d in __ui_browser__refresh browser.c:0
     #10 0x54ccf8 in ui_browser__run perf[54ccf8]
     #11 0x54eb92 in __hist_entry__tui_annotate perf[54eb92]
     #12 0x552293 in do_annotate hists.c:0
     #13 0x55941c in evsel__hists_browse hists.c:0
     #14 0x55b00f in evlist__tui_browse_hists perf[55b00f]
     #15 0x42ff02 in cmd_report perf[42ff02]
     #16 0x494008 in run_builtin perf.c:0
     #17 0x494305 in handle_internal_command perf.c:0
     #18 0x410547 in main perf[410547]
     #19 0x7f5ff1a295d0 in __libc_start_call_main libc.so.6[295d0]
     #20 0x7f5ff1a29680 in
__libc_start_main@@GLIBC_2.34 libc.so.6[29680]
     #21 0x410b75 in _start perf[410b75]

Fixes: 1d4374afd000 ("perf annotate: Add 'T' hot key
to toggle data type display")
Reviewed-by: James Clark <james.clark@xxxxxxxxxx>
Signed-off-by: Tianyou Li <tianyou.li@xxxxxxxxx>
---
  tools/perf/ui/browsers/annotate.c | 3 +++
  tools/perf/util/annotate.c        | 2 +-
  tools/perf/util/annotate.h        | 2 ++
  3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/tools/perf/ui/browsers/annotate.c
b/tools/perf/ui/ browsers/annotate.c
index 8fe699f98542..3b27ef1e8490 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -1161,6 +1161,9 @@ int
__hist_entry__tui_annotate(struct hist_entry *he,
struct map_symbol *ms,
              if (!annotation__has_source(notes))
                  ui__warning("Annotation has no source code.");
          }
+    } else if (evsel__get_arch(evsel, &browser.arch)) {
+        ui__error("Couldn't get architecture for
event '%s'", evsel- >name);
+        return -1;
      }
symbol_annotate() only fails for negative return values
of evsel__get_arch(), but evsel__get_arch() has at least
two positive error return values.

If symbol_annotate() is wrong and it should be != 0 like
you have, then maybe symbol_annotate() should be fixed
in another commit in the same patchset as this one.
Otherwise you have two calls to the same thing right
next to each other that handle errors differently.

Thanks James. I will give a try on handling the error
message with symbol__strerror_disassemble. I am conservative
to change the code in symbol_annotate, agreed it should be
considered in another patch. Would like to focus this
particular issue and get it fixed properly. Thanks.


Looks like there was a misunderstanding. I'm not saying that the
error is _reported_ differently, it's that the condition that
triggers the error is different.

symbol__annotate():

  err = evsel__get_arch(evsel, &arch);
  if (err < 0)
      return err;

You added:

  if (evsel__get_arch(evsel, &browser.arch))
     ...

evsel__get_arch() returns positive error values (and maybe also
negative?), so "< 0" behaves differently to "!= 0".

You either have to assume that "< 0" is correct and not change
it, but then you have to also check the return value in the same
way. Or if by doing "!= 0" you're implying that
symbol__annotate() is wrong to do "< 0", then you should fix it
now to not leave __hist_entry__tui_annotate() doing the same
thing two different ways at different times.

Thanks James. I looked at the code of symbol__annotate, and noticed
the if (err<0) statement. I did not mean to change the code in
symbol__annotate because I did not understand why it handled the
error code that way. The positive return value of evsel__get_arch
indicates some error happens, eg in arm__annotate_init, so I use the
symbol__strerror_disassemble function to handle both positive and
negative error code.

I do agree we should check the error code of evsel__get_arch, but I
am hesitate to touch the code which I am not sure the consequences.
I agree it may deserve another patch but not in this patchset if we
have clear answers on why "<0" is not correct, or we have a case to
break the current code as a evidence. Thanks.


Regards,

Tianyou

It may take a little bit of effort to follow the code and look at the
git blame to see what happened, but it's really not going to be that
hard.
Truly appreciated for your instant response, and the suggestions about
'Fixes' tag, return value handling etc. I do check the git history about the
code "<0", I still did not quite understand the reason of handling it in
that way.
Looks like I just overlooked the error handling when I factored out the
function. Please feel free to update symbol__annotate() to check != 0.


Thanks for the confirmation. Will send the patch v3 soon.



You're basically suggesting to add code that (when expanded) does this:

  if (first_run) {
     if (do_important_thing() < 0)
        return err;
  } else { // second run
     if (do_important_thing() != 0)
        return err;
  }

It's not going to help anyone who looks at it in the future. It's going
to make future refactors of evsel__get_arch() more difficult, and
without knowing why it's like that, it's possibly introducing another
bug.

I am suggesting to focus on the 'else' part. If that part of code is
correct, then we might need to consider another patch for the "<0" code. I
am eager for the answer as well.


It surely has to be consistent otherwise it doesn't make sense. And if
you sent a patch that did "< 0" I would still say "but it can return
positive errors, so the new code isn't right".
I did suggest in the beginning to not check the error at all and add a
comment saying it must succeed at that point because it's already done
once before, but that's not very defensive and it doesn't fix the other
possible bug.

Yes. I am not so sure 'must succeed' could be a right assumption, or for
safety it's better to check the error code.
Agreed. Anyway I can confirm that this patch fixed the crash.

Tested-by: Namhyung Kim <namhyung@xxxxxxxxxx>


Thanks. Will include and send the patch v3 soon.


Thanks,
Namhyung