Re: [PATCH v8] scripts: ftrace - move the sort-processing in ftrace_init

From: Sven Schnelle
Date: Fri Jan 21 2022 - 06:14:40 EST


Heiko Carstens <hca@xxxxxxxxxxxxx> writes:

> On Fri, Jan 21, 2022 at 10:46:36AM +0100, Sven Schnelle wrote:
>> Hi Yinan,
>>
>> Yinan Liu <yinan@xxxxxxxxxxxxxxxxx> writes:
>>
>> > When the kernel starts, the initialization of ftrace takes
>> > up a portion of the time (approximately 6~8ms) to sort mcount
>> > addresses. We can save this time by moving mcount-sorting to
>> > compile time.
>> >
>> > Signed-off-by: Yinan Liu <yinan@xxxxxxxxxxxxxxxxx>
>> > Reported-by: kernel test robot <lkp@xxxxxxxxx>
>> > Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
>> > ---
>> > kernel/trace/ftrace.c | 11 +++-
>> > scripts/Makefile | 6 +-
>> > scripts/link-vmlinux.sh | 6 +-
>> > scripts/sorttable.c | 2 +
>> > scripts/sorttable.h | 120 +++++++++++++++++++++++++++++++++++++++-
>> > 5 files changed, 137 insertions(+), 8 deletions(-)
>>
>> while i like the idea, this unfortunately breaks ftrace on s390. The
>> reason for that is that the compiler generates relocation entries for
>> all the addresses in __mcount_loc. During boot, the s390 decompressor
>> iterates through all the relocations and overwrites the nicely
>> sorted list between __start_mcount_loc and __stop_mcount_loc with
>> the unsorted list because the relocation entries are not adjusted.
>>
>> Of course we could just disable that option, but that would make us
>> different compared to x86 which i don't like. Adding code to sort the
>> relocation would of course also fix that, but i don't think it is a good
>> idea to rely on the order of relocations.
>>
>> Any thoughts on what a fix could look like, and whether this could also
>> be a problem on other architectures?
>
> Sven, thanks for figuring this out. Can you confirm that reverting
> commit 72b3942a173c ("scripts: ftrace - move the sort-processing in
> ftrace_init") fixes the problem?

Yes, reverting this commit fixes it.
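
For anyone following along, the relocation problem quoted above can be
modelled in a few lines of C. This is only a toy sketch, not the actual
decompressor code: struct toy_rela, the function names and the addresses
are all made up for illustration, and it assumes that every table slot
has one relocation carrying the link-time value as its addend.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-in for a RELA entry (hypothetical, not the real Elf64_Rela). */
struct toy_rela {
	size_t slot;      /* index of the table slot to patch */
	uint64_t addend;  /* link-time value stored into that slot */
};

static int cmp_u64(const void *a, const void *b)
{
	uint64_t x = *(const uint64_t *)a;
	uint64_t y = *(const uint64_t *)b;

	return (x > y) - (x < y);
}

/* Build-time step: sorts the table contents only, relocations untouched. */
void sort_table(uint64_t *tbl, size_t n)
{
	qsort(tbl, n, sizeof(*tbl), cmp_u64);
}

/* Boot-time step: rewrite every slot from its relocation entry. */
void apply_relocs(uint64_t *tbl, const struct toy_rela *rela, size_t n,
		  uint64_t offset)
{
	for (size_t i = 0; i < n; i++)
		tbl[rela[i].slot] = rela[i].addend + offset;
}

int is_sorted(const uint64_t *tbl, size_t n)
{
	for (size_t i = 1; i < n; i++)
		if (tbl[i - 1] > tbl[i])
			return 0;
	return 1;
}

/* Returns 1 if applying the relocations destroyed the build-time order. */
int relocs_clobber_sort(void)
{
	uint64_t tbl[] = { 0x3000, 0x1000, 0x4000, 0x2000 };
	struct toy_rela rela[4];
	size_t n = sizeof(tbl) / sizeof(tbl[0]);

	/* Relocations are emitted in the original, unsorted order. */
	for (size_t i = 0; i < n; i++) {
		rela[i].slot = i;
		rela[i].addend = tbl[i];
	}

	sort_table(tbl, n);                    /* what the build-time sort does */
	assert(is_sorted(tbl, n));

	apply_relocs(tbl, rela, n, 0x100000);  /* what happens during boot */
	return !is_sorted(tbl, n);
}
```

In this model relocs_clobber_sort() returns 1: the sorted values are
overwritten slot by slot with the original ones, so the table is
unsorted again by the time ftrace_init would look at it.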

> This really should be addressed before rc1 is out, otherwise s390 is
> broken if somebody enables ftrace.
> Where "broken" translates to random crashes as soon as ftrace is
> enabled, which again is nowadays quite common.

I wasn't able to reproduce these crashes on my systems so far. For the
readers here, we're seeing about 10-15 systems crashing every night,
usually in the 00basic/ ftrace testcases.

In most cases it looks like register corruption, where some random
register is or'd with, or partly overwritten by, 0x0004000000000000,
sometimes 0x00f4000000000000. I haven't yet found a commit that
might cause this.

/Sven