Re: [PATCH RFC] s390: Fix nospec table alignments

From: Heiko Carstens
Date: Sun Aug 28 2022 - 12:51:21 EST


On Sat, Aug 27, 2022 at 03:59:37PM -0700, Josh Poimboeuf wrote:
> > > While working on another s390 issue, I was getting intermittent boot
> > > failures in __nospec_revert() when it tried to access 'instr[0]'. I
> > > noticed the __nospec_call_start address ended in 'ff'. This patch
> > > seemed to fix it. I have no idea why it was (only sometimes) failing in
> > > the first place.
...
> > > + . = ALIGN(4);
> > > .nospec_call_table : {
> > > __nospec_call_start = . ;
> > > *(.s390_indirect*)
> >
...
> > Unfortunately I was unable to let any compiler generate code, that
> > would use the larl instruction. Instead the address of
> > nospec_call_table was loaded indirectly via the GOT, which again works
> > always, regardless if the table starts at an even or uneven address.
> >
> > This needs to be fixed anyway, and your patch certainly is correct.
> >
> > Could you maybe share your kernel config + compiler version, if you
> > are still able to reproduce this?
>
> I think the trick is to disable CONFIG_RELOCATABLE. When I compile with
> CONFIG_RELOCATABLE=n and "gcc version 11.3.1 20220421 (Red Hat 11.3.1-2)
> (GCC)", I get the following in nospec_init_branches():
>
> 2a8: c0 20 00 00 00 00 larl %r2,2a8 <nospec_init_branches+0x30>
>      2aa: R_390_PC32DBL __nospec_call_start+0x2
>
> That said, I still haven't been able to figure out how to recreate the
> program check in __nospec_revert(), even when the nospec_call_table
> starts at an odd offset.

Right, CONFIG_RELOCATABLE=n will do the trick.

I don't know why you cannot recreate it. On my system, however, it
crashes instantly as soon as I make sure that __nospec_call_start
starts at an odd address.

Apparently 'instr = (u8 *) epo + *epo;' in __nospec_revert() may
result in a very large address: without KASLR the kernel is located at
a low address, so it only takes one entry in the incorrectly accessed
nospec_call_table that yields a large negative value for '*epo' for the
addition to wrap around and produce a very large address for 'instr'.
This then results in the program check / addressing exception you've
seen when the kernel tried to access 'instr[0]'.

I'll apply your patch. Thanks a lot for debugging and reporting!