Re: [GIT PULL] x86 setup: correct booting on 486DX4

From: Linus Torvalds
Date: Sun Nov 04 2007 - 20:11:20 EST




On Sun, 4 Nov 2007, Eric W. Biederman wrote:
>
> I do seem to recall etherboot having a far jump in that spot and it
> working on everything from a 386 on up. So I'm not certain if the
> kind of jump matters. Still the kernel has a lot more exposure.

I actually suspect you could have just about anything in there, including
just a couple of nops, or just avoiding certain instructions for a few
cycles.

The i386/i486 pipeline isn't actually all that long (I can't find it here,
but I want to say it was just five stages), and the whole/only issue with
writing to cr0 on those CPUs is literally that there isn't any forwarding
of the cr0 state, so any instruction that actually depends on the cr0
value needs to have that value stable in the register by the time it
executes.

So I literally suspect that just a couple of no-ops in between the move to
cr0 and any instruction that depends on the state of the PE bit would be
ok. And there aren't that many instructions that do, it's generally just
the ones that load a segment that can care.

But I'd actually be worried about an ljmp directly after the "move to
cr0", exactly because an ljmp actually does have semantic dependencies on
the PE bit. But it's quite likely that ljmp is microcoded (it takes 12+
cycles even in real mode), and since microcode was nonpipelined, that
would hide it.

But "move to segment" is definitely *not* microcoded in real mode (it's
documented as just two cycles for reg->seg), so I'm not at all surprised
that "mov->cr0" followed immediately by "mov->seg" will not work.

In short:

- far jumps are in the "dangerous instruction" category after a change to
PE. I would suggest not using them, although I also suspect that they
probably work, if only because ljmp is probably microcoded on at least an
i386.

- instead of a short taken jump, you can almost certainly use anything
that is microcoded or just otherwise takes enough cycles (where
"enough" is likely in the 5-10 range) to make sure the writeback to CR0
is stable by the time any instruction uses it.

- almost anything that doesn't actually involve a segment descriptor
lookup is probably not going to care at all about the value of PE. The
PE bit really doesn't affect all that much of the x86 instruction set,
and if an instruction doesn't care, it doesn't matter whether it's
executed with the old or the new value.
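
The three points above can be sketched as a toy timing model. This is only
an illustration under loud assumptions, not a description of real i386/i486
internals: the 5-cycle stabilization delay is the low end of the guessed
"5-10 range", the rule for when an instruction samples PE is invented for
the sketch, and the names CR0_STABLE_AFTER, struct insn and sees_new_pe()
are hypothetical. Only the 2-cycle and 12-cycle figures come from the text.

```c
#include <assert.h>

/* Toy model only: every constant and rule here is an assumption made
 * for illustration, not a documented property of the i386/i486. */
#define CR0_STABLE_AFTER 5   /* assumed cycles until the CR0 write is visible */

struct insn {
    const char *name;
    int cycles;      /* cycles the instruction takes in real mode */
    int microcoded;  /* 1 if assumed to run non-pipelined from microcode */
};

/* Cycle counts as quoted in the mail (2 for reg->seg, 12+ for ljmp). */
const struct insn MOV_TO_SEG = { "mov reg->seg", 2, 0 };
const struct insn LJMP       = { "ljmp", 12, 1 };

/*
 * Returns 1 if `target` observes the new PE value when it follows the
 * "move to cr0" after `filler_cycles` cycles of PE-independent padding
 * (nops, a short jump, anything that doesn't load a segment).
 *
 * Assumption: a pipelined instruction samples PE on the cycle it starts,
 * while a microcoded one samples it on its last cycle.
 */
int sees_new_pe(struct insn target, int filler_cycles)
{
    assert(filler_cycles >= 0);

    int sample_cycle = filler_cycles;
    if (target.microcoded)
        sample_cycle += target.cycles - 1;

    return sample_cycle >= CR0_STABLE_AFTER;
}
```

In this model sees_new_pe(MOV_TO_SEG, 0) is 0 (the segment load runs with
the stale PE value), sees_new_pe(LJMP, 0) is 1 (the long microcoded
sequence hides the delay), and sees_new_pe(MOV_TO_SEG, 5) is 1 (enough
padding makes the segment load safe).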


Linus