Re: Sanitize CPU-state when switching from virtual-8086 mode to other task

From: halfdog
Date: Mon Dec 30 2013 - 10:55:51 EST


H. Peter Anvin wrote:
> On 12/29/2013 12:44 PM, halfdog wrote:
>> H. Peter Anvin wrote:
>>> On 12/28/2013 02:02 PM, halfdog wrote:
>>>> It seems that missing CPU-state sanitation during task
>>>> switching triggers kernel-panic. This might be related to
>>>> unhandled FPU-errors. See [1] for POC and serial console log
>>>> of OOPs. Due to missing real 32-bit x86-hardware it is not
>>>> clear, if this issue might be related to subtle differences in
>>>> virtual-8086 mode handling when inside a virtualbox guest.
>>>>
>>
>>> This oops happens inside the guest? Either way, I would be
>>> *very* skeptical of Virtualbox in this case.
>>
>>> You can run a 32-bit kernel on 64-bit hardware, you know...
>>
>> I know, but hardware was occupied with long-running simulation.
>>
>> With the initial POC, there might be a timing issue involved, with
>> different process layout, exception does not occur in switch_to but
>> sometimes on other locations.
>>
>> I created a new random-code testcase [1] , which works around that
>> problem. When booted a Debian initrd and tried id, OOPSes are
>> fired like wild but at least system does not lock up immediately.
>>
>
> Still in VirtualBox?

Yes, again. After comparing the results from the initrd on real hardware
with those from VirtualBox, I am beginning to understand the timing
problem involved and why timing in VirtualBox is different: the test
program usually OOPSes when it touches the FPU multiple times. If it is
terminated before the second FPU interaction, the OOPS occurs on the
next invocation instead, which stumbles over the invalid CPU state left
by the prior invocation. With the improved code, I can rather reliably
bring the CPU into that state, so that the next process that is invoked
and touches FPU/MMX state OOPSes. I am currently searching for SUID
binaries and daemons running with UID=0 that might react in an
interesting way to that event, but so far only at DoS level: e.g. after
running the V2 test program once and then connecting via SSH, the ssh
daemon is killed.

It seems that the machine lockup occurs when, e.g., a switch to the idle
task happens at exactly the right moment. I currently cannot trigger
that on real hardware, but I am still working on it.

--
http://www.halfdog.net/
PGP: 156A AE98 B91F 0114 FE88 2BD8 C459 9386 feed a bee