Re: BUG: arm64: missing build-id from vmlinux

From: Justin Forbes
Date: Wed Jan 11 2023 - 12:07:05 EST


On Sat, Dec 24, 2022 at 8:17 PM Masahiro Yamada <masahiroy@xxxxxxxxxx> wrote:
>
> On Thu, Dec 22, 2022 at 8:53 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> >
> > On Wed, 21 Dec 2022 at 17:29, Thorsten Leemhuis
> > <regressions@xxxxxxxxxxxxx> wrote:
> > >
> > > On 21.12.22 16:39, Masahiro Yamada wrote:
> > > > On Wed, Dec 21, 2022 at 5:23 PM Thorsten Leemhuis
> > > > <regressions@xxxxxxxxxxxxx> wrote:
> > > >>
> > > >> Hi, this is your Linux kernel regression tracker. CCing the regression
> > > >> mailing list, as it should be in the loop for all regressions:
> > > >> https://docs.kernel.org/admin-guide/reporting-regressions.html
> > > >>
> > > >> On 18.12.22 21:51, Dennis Gilmore wrote:
> > > >>> The changes in https://lore.kernel.org/linux-arm-kernel/166783716442.32724.935158280857906499.b4-ty@xxxxxxxxxx/T/
> > > >>> result in vmlinux no longer having a build-id.
> > > >>
> > > >> FWIW, that's 994b7ac1697b ("arm64: remove special treatment for the link
> > > >> order of head.o") from Masahiro merged through Will this cycle.
> > > >>
> > > >>> At the least, this
> > > >>> causes rpm builds to fail. Reverting the patch does bring back a
> > > >>> build-id, but there may be a different way to fix the regression
> > > >>
> > > >> Makes me wonder if other distros or CIs relying on the build-id are
> > > >> broken, too.
> > > >>
> > > >> Anyway, the holiday season is upon us, hence I also wonder if it would
> > > >> be best to revert above change quickly and leave further debugging for 2023.
> > > >>
> > > >> Masahiro, Will, what's your opinion on this?
> > >
> > > Masahiro, many thx for looking into this.
> > >
> > > > I do not understand why you rush into the revert so quickly.
> > > > We are before -rc1.
> > > > We have 7 weeks before the 6.2 release
> > > > (assuming we will have up to -rc7).
> > > >
> > > > If we get -rc6 or -rc7 and we still do not
> > > > solve the issue, we should consider reverting it.
> > >
> > > Because it looked like a regression that makes it harder for people and
> > > CI systems to build and test mainline. To quote
> > > Documentation/process/handling-regressions.rst (
> > > https://docs.kernel.org/process/handling-regressions.html ):
> > >
> > > """
> > > * Fix regressions within two or three days, if they are critical for
> > > some reason – for example, if the issue is likely to affect many users
> > > of the kernel series in question on all or certain architectures. Note,
> > > this includes mainline, as issues like compile errors otherwise might
> > > prevent many testers or continuous integration systems from testing the
> > > series.
> > > """
> > >
> > > I suspect that other distros rely on the build-id as well. Maybe I'm
> > > wrong with that, but even if only Fedora and derivatives are affected it
> > > will annoy some people. Sure, each can apply the revert, but before that
> > > everyone affected will spend time debugging the issue first. A quick
> > > revert in mainline (with a reapply later together with a fix) thus IMHO
> > > is the most efficient approach afaics.
> > >
> >
> > Agree with Masahiro here.
> >
> > The issue seems to be caused by the fact that whichever object gets
> > linked first gets to decide the type of a section, and so the .notes
> > section will be of type NOTE if head.o gets linked first, or PROGBITS
> > otherwise. The latter PROGBITS type seems to be the result of the
> > compiler emitting .note.GNU-stack as PROGBITS rather than NOTE.
> >
> > The hunk below fixes it for me, by avoiding notes emitted as PROGBITS.
> > I'll leave it to Masahiro to decide whether this should be fixed for
> > arm64 only or for all architectures, but I suspect the latter would be
> > most appropriate.
> >
> > Note that the kernel's rpm-pkg and binrpm-pkg targets seem to be
> > unaffected by this.
>
>
> Thanks for root-causing this.
>
>
> I would like to fix this for all architectures because riscv is also broken.
>
> https://lore.kernel.org/lkml/20221224192751.810363-1-masahiroy@xxxxxxxxxx/

I appreciate the patch. It does indeed fix the aarch64 issue as well,
and it has allowed me to drop the original revert from Fedora.
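
For anyone who wants to check a vmlinux for this failure mode, here is a
minimal sketch of the idea Ard described: a build-id note is only found by
a consumer that looks at sections typed NOTE, so a .notes section typed
PROGBITS hides it. The helper names (sections, build_id) are made up for
illustration, and the parser assumes a little-endian ELF64 image with 4-byte
note alignment and no extended section numbering. It is not what rpm or
binutils actually run.

```python
import struct

SHT_PROGBITS = 1
SHT_NOTE = 7
NT_GNU_BUILD_ID = 3

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, 64 bytes
SHDR = struct.Struct("<IIQQQQIIQQ")        # Elf64_Shdr, 64 bytes


def sections(elf):
    """Yield (name, sh_type, data) for each section of a little-endian ELF64 image."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    fields = EHDR.unpack_from(elf, 0)
    shoff, shentsize, shnum, shstrndx = fields[6], fields[11], fields[12], fields[13]
    headers = [SHDR.unpack_from(elf, shoff + i * shentsize) for i in range(shnum)]
    str_off = headers[shstrndx][4]  # file offset of the section name string table
    for name_off, sh_type, _flags, _addr, offset, size, *_rest in headers:
        start = str_off + name_off
        name = elf[start:elf.index(b"\0", start)].decode()
        yield name, sh_type, elf[offset:offset + size]


def build_id(elf):
    """Return the GNU build-id as hex, searching only sections typed SHT_NOTE."""
    for _name, sh_type, data in sections(elf):
        if sh_type != SHT_NOTE:
            continue  # a PROGBITS-typed .notes section is skipped here
        pos = 0
        while pos + 12 <= len(data):
            namesz, descsz, ntype = struct.unpack_from("<III", data, pos)
            name_start = pos + 12
            desc_start = name_start + ((namesz + 3) & ~3)  # name padded to 4 bytes
            name = data[name_start:name_start + namesz]
            desc = data[desc_start:desc_start + descsz]
            if name == b"GNU\0" and ntype == NT_GNU_BUILD_ID:
                return desc.hex()
            pos = desc_start + ((descsz + 3) & ~3)  # desc padded to 4 bytes
    return None
```

Calling build_id(open("vmlinux", "rb").read()) should return a hex string on
a good build and None on an affected one. The same information is available
from "readelf -S vmlinux" (section types) and "readelf -n vmlinux" (notes).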

Justin

>
> > diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> > index 376a980f2bad08bb..10a172601fe7f53f 100644
> > --- a/arch/arm64/include/asm/assembler.h
> > +++ b/arch/arm64/include/asm/assembler.h
> > @@ -818,7 +818,7 @@ alternative_endif
> >
> > #ifdef GNU_PROPERTY_AARCH64_FEATURE_1_DEFAULT
> > .macro emit_aarch64_feature_1_and, feat=GNU_PROPERTY_AARCH64_FEATURE_1_DEFAULT
> > - .pushsection .note.gnu.property, "a"
> > + .pushsection .note.gnu.property, "a", %note
> > .align 3
> > .long 2f - 1f
> > .long 6f - 3f
>
>
> I did not fold this hunk in my patch.
>
> I compiled with CONFIG_ARM64_BTI_KERNEL=y.
>
> .note.gnu.property section in VDSO was already NOTE
> without this hunk.
>
>
>
>
>
>
>
> > diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> > index 4c13dafc98b8400f..8a8044dea71b0609 100644
> > --- a/arch/arm64/kernel/vmlinux.lds.S
> > +++ b/arch/arm64/kernel/vmlinux.lds.S
> > @@ -160,6 +160,7 @@ SECTIONS
> > /DISCARD/ : {
> > *(.interp .dynamic)
> > *(.dynsym .dynstr .hash .gnu.hash)
> > + *(.note.GNU-stack) # emitted as PROGBITS
> > }
> >
> > . = KIMAGE_VADDR;
>
>
>
> --
> Best Regards
> Masahiro Yamada