Re: Fixing MIPS delay slot emulation weakness?

From: Rich Felker
Date: Sat Dec 15 2018 - 21:41:14 EST


On Sun, Dec 16, 2018 at 02:15:38AM +0000, Maciej W. Rozycki wrote:
> On Sat, 15 Dec 2018, Rich Felker wrote:
>
> > > A possibly nicer way to accomplish more or less the same thing would
> > > be to allocate the area with _install_special_mapping() and arrange to
> > > keep a reference to the struct page around.
> > >
> > > The really nice but less compatible fix would be to let processes or
> > > even the whole system opt out by promising not to put anything in FPU
> > > branch delay slots, of course.
> >
> > As I noted on Twitter when Mudge brought this topic back up, there's a
> > much more compatible, elegant, and safe fix possible that does not
> > involve any W+X memory. Emulate the delay slot in kernel-space. This
> > is trivial to do safely for pretty much everything but loads/stores.
>
> I think "trivial" is an understatement, you at least need to decode the
> delay-slot instruction enough to tell privileged and user instructions
> apart and send SIGILL where appropriate. Some user instructions send
> exceptions too and you need to handle them accordingly.

I meant simply that making them safe is trivial if they're not
accessing memory, only modifying the register file in the signal
context. Not that emulating them is trivial.
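
To make the "register file only" case concrete, here is a minimal
sketch in C. It is not kernel code: struct emu_regs and
emu_delay_slot() are hypothetical names invented for illustration, and
only three classic MIPS32 encodings (ADDIU, ORI, ADDU) are handled.
Anything else, including every load/store, is rejected without
touching the saved registers.

```c
#include <stdint.h>

/* Hypothetical stand-in for the saved user register file
 * (illustration only, not the kernel's struct pt_regs). */
struct emu_regs {
    uint32_t gpr[32];
};

/* Emulate one delay-slot instruction purely on the saved registers.
 * Returns 0 on success, -1 if the instruction is not handled here
 * (memory access, privileged, unknown); regs are untouched on -1. */
static int emu_delay_slot(struct emu_regs *r, uint32_t insn)
{
    uint32_t op    = insn >> 26;
    uint32_t rs    = (insn >> 21) & 31;
    uint32_t rt    = (insn >> 16) & 31;
    uint32_t rd    = (insn >> 11) & 31;
    uint32_t shamt = (insn >> 6) & 31;
    uint32_t funct = insn & 63;
    uint32_t uimm  = insn & 0xffff;
    /* Sign-extend the 16-bit immediate using unsigned arithmetic. */
    uint32_t simm  = (uimm ^ 0x8000u) - 0x8000u;
    uint32_t dst, val;

    if (op == 0x09) {                       /* ADDIU rt, rs, imm */
        dst = rt;
        val = r->gpr[rs] + simm;
    } else if (op == 0x0d) {                /* ORI rt, rs, imm */
        dst = rt;
        val = r->gpr[rs] | uimm;
    } else if (op == 0 && shamt == 0 && funct == 0x21) {
        dst = rd;                           /* ADDU rd, rs, rt */
        val = r->gpr[rs] + r->gpr[rt];
    } else {
        return -1;                          /* not handled here */
    }

    if (dst != 0)                           /* $zero is hardwired to 0 */
        r->gpr[dst] = val;
    return 0;
}
```

The safety argument is visible in the shape of the function: it only
reads and writes the saved register array, so nothing it does can
touch memory with kernel privilege.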

On the other hand, it might be cleaner, safer, and easier to simply
write a full MIPS ISA emulator, put it in the vdso, and have the
kernel return to userspace immediately on hitting floating point
instructions. The emulator code there would take care of it all and
then return to the normal flow of execution.

> OTOH, for things like ADDIUPC you need to interpret the instruction
> anyway, as the value of the PC used for calculation will be wrong except
> in the original location.

Indeed. Assuming arbitrary ISA extensions including stuff that does
PC-relative arithmetic, there's no way to execute it out-of-place
without knowing how to interpret it.
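
A small sketch of why this is so for ADDIUPC. The decoding below
assumes the MIPS32 Release 6 layout (opcode 0x3b, rs in bits 25..21,
bits 20..19 zero, a 19-bit signed word offset in bits 18..0); other
ISA variants such as microMIPS differ. The function names are made up
for illustration.

```c
#include <stdint.h>

/* Assumed MIPS32 R6 ADDIUPC layout: opcode 0x3b, bits 20..19 == 0. */
static int is_addiupc(uint32_t insn)
{
    return (insn >> 26) == 0x3b && ((insn >> 19) & 3) == 0;
}

/* Value ADDIUPC writes to rs when executed at address pc:
 * pc + (sign_extend(imm19) << 2), computed in unsigned arithmetic. */
static uint32_t addiupc_result(uint32_t insn, uint32_t pc)
{
    uint32_t imm19 = insn & 0x7ffff;
    uint32_t off   = (imm19 ^ 0x40000u) - 0x40000u;  /* sign-extend */
    return pc + (off << 2);
}
```

The result depends on pc, so the same instruction word copied to a
trampoline at a different address produces a different value. An
emulator has to recognize the instruction and pass in the original PC
itself.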

> > For loads/stores, where you want them to execute with user privilege
> > level, what you do is compute the effective address in kernel-space,
> > then return to a fixed instruction in the vdso page that performs a
> > generic load/store using the register the kernel put the effective
> > address result in, then restores registers off the stack and jumps to
> > the branch destination.
>
> What about all the odd and especially vendor-specific load/store
> instructions like ASET, SAA or SWAPW? Would we need to have all the
> possible encodings provided in the VDSO?

Can all kinds of weird stuff like this go in delay slots? I'm more
familiar with SH delay slots, where lots of instructions are
slot-illegal. If so, perhaps the full-emulator-in-userspace approach
is better.
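
For the plain load/store case quoted above, the kernel-side half
(computing the effective address) could look roughly like the sketch
below. It is only an illustration: struct ls_plan and
plan_load_store() are hypothetical names, only LW and SW are decoded,
and the vdso stub that would perform the access at user privilege is
not shown.

```c
#include <stdint.h>

/* Hypothetical description of the access the user-mode stub would
 * then perform; none of these names exist in the kernel. */
struct ls_plan {
    uint32_t ea;        /* effective address: base + signed offset */
    unsigned width;     /* access size in bytes */
    int      is_store;  /* 1 for a store, 0 for a load */
    unsigned rt;        /* data register number */
};

/* Decode LW/SW and compute the effective address from saved GPRs.
 * Returns 0 on success, -1 for anything else. No memory is touched. */
static int plan_load_store(const uint32_t gpr[32], uint32_t insn,
                           struct ls_plan *out)
{
    uint32_t op   = insn >> 26;
    uint32_t base = (insn >> 21) & 31;
    uint32_t uimm = insn & 0xffff;
    uint32_t off  = (uimm ^ 0x8000u) - 0x8000u;  /* sign-extend */

    if (op == 0x23)            /* LW rt, off(base) */
        out->is_store = 0;
    else if (op == 0x2b)       /* SW rt, off(base) */
        out->is_store = 1;
    else
        return -1;

    out->width = 4;
    out->rt    = (insn >> 16) & 31;
    out->ea    = gpr[base] + off;
    return 0;
}
```

This only covers the generic base-plus-offset form. The
vendor-specific encodings raised above each have their own
addressing and semantics and would need separate handling.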

Rich