Re: [PATCH v3 7/7] x86/boot: Check that there are no runtime relocations

From: Fangrui Song
Date: Tue Jun 30 2020 - 18:06:46 EST


* Ard Biesheuvel
On Tue, 30 Jun 2020 at 01:34, Fangrui Song <maskray@xxxxxxxxxx> wrote:
>
> On 2020-06-29, Ard Biesheuvel wrote:
> >On Mon, 29 Jun 2020 at 19:37, Fangrui Song <maskray@xxxxxxxxxx> wrote:
> >>
> >> On 2020-06-29, Arvind Sankar wrote:
> >> >On Mon, Jun 29, 2020 at 09:20:31AM -0700, Kees Cook wrote:
> >> >> On Mon, Jun 29, 2020 at 06:11:59PM +0200, Ard Biesheuvel wrote:
> >> >> > On Mon, 29 Jun 2020 at 18:09, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
> >> >> > >
> >> >> > > On Mon, Jun 29, 2020 at 10:09:28AM -0400, Arvind Sankar wrote:
> >> >> > > > Add a linker script check that there are no runtime relocations, and
> >> >> > > > remove the old one that tries to check via looking for specially-named
> >> >> > > > sections in the object files.
> >> >> > > >
> >> >> > > > Drop the tests for -fPIE compiler option and -pie linker option, as they
> >> >> > > > are available in all supported gcc and binutils versions (as well as
> >> >> > > > clang and lld).
> >> >> > > >
> >> >> > > > Signed-off-by: Arvind Sankar <nivedita@xxxxxxxxxxxx>
> >> >> > > > Reviewed-by: Ard Biesheuvel <ardb@xxxxxxxxxx>
> >> >> > > > Reviewed-by: Fangrui Song <maskray@xxxxxxxxxx>
> >> >> > > > ---
> >> >> > > > arch/x86/boot/compressed/Makefile | 28 +++-----------------------
> >> >> > > > arch/x86/boot/compressed/vmlinux.lds.S | 8 ++++++++
> >> >> > > > 2 files changed, 11 insertions(+), 25 deletions(-)
> >> >> > >
> >> >> > > Reviewed-by: Kees Cook <keescook@xxxxxxxxxxxx>
> >> >> > >
> >> >> > > question below ...
> >> >> > >
> >> >> > > > diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
> >> >> > > > index a4a4a59a2628..a78510046eec 100644
> >> >> > > > --- a/arch/x86/boot/compressed/vmlinux.lds.S
> >> >> > > > +++ b/arch/x86/boot/compressed/vmlinux.lds.S
> >> >> > > > @@ -42,6 +42,12 @@ SECTIONS
> >> >> > > > *(.rodata.*)
> >> >> > > > _erodata = . ;
> >> >> > > > }
> >> >> > > > + .rel.dyn : {
> >> >> > > > + *(.rel.*)
> >> >> > > > + }
> >> >> > > > + .rela.dyn : {
> >> >> > > > + *(.rela.*)
> >> >> > > > + }
> >> >> > > > .got : {
> >> >> > > > *(.got)
> >> >> > > > }
> >> >> > >
> >> >> > > Should these be marked (INFO) as well?
> >> >> > >
> >> >> >
> >> >> > Given that sections marked as (INFO) will still be emitted into the
> >> >> > ELF image, it does not really make a difference to do this for zero
> >> >> > sized sections.
> >> >>
> >> >> Oh, I misunderstood -- I though they were _not_ emitted; I see now what
> >> >> you said was not allocated. So, disk space used for the .got.plt case,
> >> >> but not memory space used. Sorry for the confusion!
> >> >>
> >> >> -Kees
> >>
> >> About output section type (INFO):
> >> https://sourceware.org/binutils/docs/ld/Output-Section-Type.html#Output-Section-Type
> >> says "These type names are supported for backward compatibility, and are
> >> rarely used."
> >>
> >> If all input section don't have the SHF_ALLOC flag, the output section
> >> will not have this flag as well. This type is not useful...
> >>
> >> If .got and .got.plt were used, they should be considered dynamic
> >> relocations which should be part of the loadable image. So they should
> >> have the SHF_ALLOC flag. (INFO) will not be applicable anyway.
> >>
> >
> >I don't care deeply either way, but Kees indicated that he would like
> >to get rid of the 24 bytes of .got.plt magic entries that we have no
> >need for.
> >
> >In fact, a lot of this mangling is caused by the fact that the linker
> >is creating a relocatable binary, and assumes that it is a hosted
> >binary that is loaded by a dynamic loader. It would actually be much
> >better if the compiler and linker would take -ffreestanding into
> >account, and suppress GOT entries, PLTs, dynamic program headers for
> >shared libraries altogether.
>
> Linkers (GNU ld and LLD) don't create .got or .got.plt just because the linker
> command line has -pie or -shared. They create .got or .got.plt if there are
> specific needs.
>
> For .got.plt, if there is (1) any .plt/.iplt entry, (2) any .got.plt based
> relocation (e.g. R_X86_64_GOTPC32 on x86-64), or (3) if _GLOBAL_OFFSET_TABLE_ is
> referenced, .got.plt will be created (both GNU ld and LLD) with usually 3
> entries (for ld.so purposes).
>

This is not the case for AArch64. There, __GLOBAL_OFFSET_TABLE__ is
always emitted, along with the magic .got.plt entries, regardless of
the input.

As for the input objects - why is '#pragma GCC visibility(hidden)' not
the default for -ffreestanding builds? This suppresses any GOT entries
emitted by the compiler, but the only way to get this behavior is
through the #pragma, which is how we ended up with '-include hidden.h'
in a couple of places.

A -ffreestanding build may provide a shared object used by other
applications. For example, musl (libc)'s CFLAGS has -ffreestanding.

IOW, if the toolchain behavior was not 100% geared towards shared
executables as it is today, we would not need the hacks that we need
to apply to get a relocatable bare metal binary like we need for the
KASLR kernel.

My mere concern regarding "geared towards shared objects" is that ELF
assumes symbols of default visibility are preemptible by default.

So unfortunately it is difficult to make -fno-semantic-interposition the
default.

If (1) is not satisfied, the created .got.plt is just served as an anchor for
things that want to reference (the distance from GOT base to some point). The
linker will still reserve 3 words but the words are likely not needed.

I don't think there is a specific need for another option to teach the linker
(GNU ld or LLD) that this is a kernel link. For -ffreestanding builds, cc
-static (ld -no-pie))/-static-pie (-pie) already work quite well.

On 2020-06-30, Arvind Sankar wrote:
On Tue, Jun 30, 2020 at 06:26:43PM +0200, Ard Biesheuvel wrote:
On Tue, 30 Jun 2020 at 01:34, Fangrui Song <maskray@xxxxxxxxxx> wrote:

> If (1) is not satisfied, the created .got.plt is just served as an anchor for
> things that want to reference (the distance from GOT base to some point). The
> linker will still reserve 3 words but the words are likely not needed.
>
> I don't think there is a specific need for another option to teach the linker
> (GNU ld or LLD) that this is a kernel link. For -ffreestanding builds, cc
> -static (ld -no-pie))/-static-pie (-pie) already work quite well.

You mean 'ld -static -pie' right? That seems to work. Is that a recent
invention?

gcc -static-pie is fairly recent [0], but it just influences how the
linker is invoked AFAIK (at least for gcc) -- in addition to passing
some linker flags, it will change what startup files get linked in (for
non-freestanding). It does not even imply -fPIE to the compiler, which
is confusing as hell. It _would_ be nice if this also told the compiler
that all symbols (perhaps unless explicitly marked) will be resolved at
static link time, so there is no need to use the GOT or PLT for globals.

As it stands, the executable can still have relocations, GOT and PLT, it
just needs to have startup code to handle them (provided by libc
typically) instead of relying on an external dynamic linker.

If the executable is purely static, it does not need to have PLT. All
calls to a PLT can be redirected to the function itself. Some range
extension thunks (other terms: stub groups, veneers, etc) may still be
needed if the distance is too large.

There are cases where a GOT cannot be avoided, e.g.

extern char foo[] __attribute__((weak, visibility("hidden")));
char *fun() { return foo; }

If foo is a SHN_ABS, `movq foo@GOTPCREL(%rip), %rax` can't be optimized
by GOTPCRELX (https://sourceware.org/bugzilla/show_bug.cgi?id=25749 binutils>=2.35 will be good)
Many other architectures don't even have a GOT optimization.

I don't think it's really relevant for the kernel build -- all we get is
ld -static --no-dynamic-linker, all -static does is prevent searching
shared libraries, and we already pass --no-dynamic-linker if it's
supported.

[0] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81498