Re: stop breaking dosemu (Re: x86/kconfig/32: Rename CONFIG_VM86 and default it to 'n')

From: Stas Sergeev
Date: Fri Sep 04 2015 - 17:11:34 EST


04.09.2015 22:51, Austin S Hemmelgarn wrote:
On 2015-09-04 09:06, Stas Sergeev wrote:
04.09.2015 15:34, Austin S Hemmelgarn wrote:
On 2015-09-04 06:46, Stas Sergeev wrote:
04.09.2015 13:09, Chuck Ebbert wrote:
On Fri, 4 Sep 2015 00:28:04 +0300
Stas Sergeev <stsp@xxxxxxx> wrote:

03.09.2015 21:51, Austin S Hemmelgarn wrote:
There are servers out there that have this enabled and _never_ use it
at all,
Unless I am mistaken, servers usually use a special flavour of the
distro (different from the desktop install), where of course this will
be disabled at _compile time_.
Many (most?) distros use just one kernel for everything, because it's
just too much work to have a separate flavor for servers.
But, for example, menuconfig promotes CONFIG_PREEMPT_NONE for server
and CONFIG_PREEMPT for desktop. Also, a server would perhaps need an
LTS version rather than the latest.
I wonder if RHEL Server offers the generic desktop-suited kernel
with vm86() enabled?

In any case, if there is some generic mechanism to selectively
disable syscalls at run-time for server, then vm86() is of course
a good candidate. I wonder how many other syscalls are currently
run-time controlled? (those that are not marked as an "attack surface"
and defaulted to Y; I suppose the "attack surface" is currently only vm86())

OK, I think I need to clarify something here.

The attack surface of a given system refers to the number of different ways that someone could potentially attack that system. An individual syscall is not in itself an attack surface, but is part of
the attack surface for the whole system. One of the core concepts of proactive security is to minimize the attack surface, because the fewer ways someone could possibly attack you, the less likely it
is that they will succeed.

I, however, referred to vm86 as a potential attack vector, which refers to one way in which someone could attempt to attack the system (be it through arbitrary code execution, privilege escalation, or
some other type of exploit). Note that something does not need to have a known exploit to be classified as a potential attack vector (most black hats out there will keep quiet about discovered
exploits until they can actually make use of them themselves). By their very definition, every single site where userspace can call into the kernel is a _potential_ attack vector, including vm86().
But they are not marked as such, while vm86() is.
And they do not have a run-time disabling knob.
So why such a big difference?
Take for example read(), this is not a very likely attack vector because:
1. It does exactly _one_ thing.
2. It only copies data to the calling process.
3. It has no odd interactions with mm.
4. The only modification it does to how the processor is executing is for the context switch to kernel mode and back to user mode.
5. It is _very_ well audited.
Overall, this means that read() is a relatively low risk.
fork() is slightly more attractive as an exploit target, because it doesn't fit points 2 and 4 above.
vm86() is much more attractive because it doesn't fit any of the 5 points above.
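
For illustration only, the comparison above can be written down as a small tally. This is a sketch, not anything from the kernel tree: the names CRITERIA, MEETS, unmet and most_attractive_first are hypothetical, and the data is limited to what is stated in this thread for read(), fork() and vm86().

```python
# Criteria numbered as in the list above (wording paraphrased).
CRITERIA = {
    1: "does exactly one thing",
    2: "only copies data to the calling process",
    3: "has no odd interactions with mm",
    4: "changes processor state only for the kernel/user mode switch",
    5: "is very well audited",
}

# Criteria each syscall meets, taken only from the statements in this thread.
MEETS = {
    "read": {1, 2, 3, 4, 5},
    "fork": {1, 3, 5},  # "doesn't fit points 2 and 4"
    "vm86": set(),      # "doesn't fit any of the 5 points"
}


def unmet(syscall):
    """Return the sorted criterion numbers the syscall does not meet."""
    return sorted(set(CRITERIA) - MEETS[syscall])


def most_attractive_first():
    """Order syscalls from most to fewest unmet criteria."""
    return sorted(MEETS, key=lambda name: len(unmet(name)), reverse=True)


if __name__ == "__main__":
    for name in most_attractive_first():
        failed = unmet(name)
        print(f"{name}(): fails criteria {failed if failed else 'none'}")
```

The tally only restates the argument; it says nothing about whether any of these syscalls is actually exploitable.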
I agree. vm86() is a mess.
My point is that its risky parts and useless functionality
are _already_ known (even I can point to the particular code
parts that can simply be removed). As such, it simply had
to be revisited and cleaned up to match at least 1 and 3
(and then maybe 5). This wasn't done, and the knob was
introduced _instead_ of doing this. I am not saying the knob
should not exist. Actually, mmap_min_addr is exactly the kind of
knob I am asking for: well justified, well known.
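
As a concrete reference for the mmap_min_addr knob: it is exposed as /proc/sys/vm/mmap_min_addr, and a minimal sketch for inspecting it could look like the following. The helper names are hypothetical, and the interpretation (0 means an unprivileged process may map page zero, which is the case relevant to vm86-mode emulation) is my reading of the knob rather than something spelled out in this thread.

```python
from pathlib import Path

# Standard procfs location of the knob on Linux.
MMAP_MIN_ADDR = Path("/proc/sys/vm/mmap_min_addr")


def parse_mmap_min_addr(text):
    """Parse the decimal value as it appears in the procfs file."""
    return int(text.strip())


def allows_page_zero_mapping(min_addr):
    """True when the limit is 0, i.e. low addresses are not protected."""
    return min_addr == 0


def read_mmap_min_addr(path=MMAP_MIN_ADDR):
    """Return the current value, or None if it cannot be read or parsed."""
    try:
        return parse_mmap_min_addr(path.read_text())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    value = read_mmap_min_addr()
    if value is None:
        print("mmap_min_addr is not readable on this system")
    else:
        print(f"mmap_min_addr = {value}, "
              f"page zero mappable: {allows_page_zero_mapping(value)}")
```

Reading the value needs no privileges; changing it does, which is part of why it works as a well-understood administrator knob.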

Other system calls that I know of that fit fewer than 3 of the 5 points above are: modify_ldt(), perf_event_open(), ptrace(), and bpf(). I regard all of these as potentially more attractive than vm86
Agree!
Are they marked as "attack surface" in the Kconfig or
elsewhere? Or maybe their risk is properly documented?
Or not documented at all? IMHO when security is concerned,
documenting things properly is very important.

vm86() is one of the more attractive syscalls to attempt to use as an attack vector on 32-bit x86 systems because it's relatively unaudited,
This can be changed if it is at least stripped of the known
bloat, for example. This could have been done _before_ taking any
other actions on it, because the actions would then be entirely
different. Maybe, if it is properly cleaned up, the action will
change from disabling it or introducing a knob to auditing it?
If you clean it up, I'd be happy to throw everything I can think of at it.
I'll look into doing that perhaps.
At least I can try to test it after changes.

Even if I don't manage to discover any exploits in that case, I would still advocate against having it available by default because it's functionality that is used by a consistently decreasing percentage of users (yes, I know lots of people use dosemu, the number of people who use Linux is however going up faster than the number of people who use dosemu (no, I don't have numbers to back this up, but it is statistically very likely to be the case),
Yes, I won't challenge that expectation.
dosemu supports legacy systems, so its user base is
doomed to decrease. I was only against the assumption
that it is below 1%. In fact, if we count the users who
run dosemu just once, type "dir" at the DOS prompt, get
some nostalgia and close it forever, I guess we'll get
surprisingly large numbers. :)

and I know a number of people who used to use it (myself included) who are moving to dosbox because the performance difference is getting less significant as computers get faster).
dosemu is not only about performance, but also about
better support for HW for which only DOS drivers exist.
Also, I wonder how well dosbox scales and performs in a
multicore setup. dosemu2 had a lot of manpower invested
into multicore scalability. As a result, you can play Need For
Speed on an old dual-core notebook.

significantly modifies the execution state of the
processor, and is available on a majority of 32-bit x86 systems in the wild. This does not mean that it is exploitable directly, just that it's a possible target for an exploit.
So you say it is more dangerous than other syscalls, and I can
believe you, but this needs a proper justification. Someone has
to write down why exactly it is more dangerous, whether it can be
fixed or not, etc. Like it was done for mark_screen_rdonly - I am not
asking you how it can be exploited because I take your word that this
code is a potential risk. But it can be removed. If there are other risky
parts, they also have to be identified. I simply don't think
sufficient justification was spelled out to consider it more dangerous
than all other syscalls (modulo mmap_min_addr - that one was identified).
I've already stated _why_ it's more dangerous:
1. It interacts in odd ways with memory management.
If you mean the mark_screen_rdonly hack, then it can just
be removed.

2. It directly modifies the execution state of the processor.
Yes.

It is no more potentially dangerous than any other system call that fits either description, I'm not trying to single out vm86,
Not you, but in Kconfig it has the "attack surface" tag,
which singles it out quite a lot. This alone is a big part
of the problem. Well, for you it is just a minor detail, but
for me it is a direct threat to my users.

that just happens to be the syscall we are discussing right now. Another syscall that is a perfect example of both 1 and 2 would be modify_ldt, which _does_ have known exploits that required a rewrite,
So why not rewrite (or actually just clean up) the vm86()
syscall _before_ any exploit is found? Part of that was already
done AFAIK, but there are still things to strip.

and now has a knob to disable it because most people don't use it.
Well, a very recent knob, I didn't know about it yet.
But please, see the difference:
1. As you said, the code was first rewritten to stay in shape.
2. There were no known security threats at the point of
adding the knob, so there was a much softer wording
in Kconfig (I admit it could have been even softer though).

What we have in the vm86() case:
1. It is full of risky code which no one rewrites (a lie, Brian Gerst does,
but it's still there).
2. The knob is added with the intention that it always
stays disabled; the heavy wording in Kconfig just confirms
this and scares the users.

I am not against the knob as a whole, just against how
it was done in this particular case. It was done much
better in the case of modify_ldt().
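
For anyone who wants to see how a given kernel was built with respect to these two knobs, here is a rough sketch that looks an option up in .config-style text. The function name and the SAMPLE text are hypothetical. CONFIG_VM86 is the option from the subject line; CONFIG_MODIFY_LDT_SYSCALL is the name I believe the modify_ldt() option carries, and it is not quoted anywhere in this thread. Where the config text lives (for example a file under /boot or /proc/config.gz) depends on the distro, so the sketch takes the text as an argument.

```python
def kconfig_state(config_text, option):
    """Look up one option in .config-style text.

    Returns the value after '=' (e.g. 'y' or 'm'), 'n' when the option is
    explicitly marked as not set, or None when it does not appear at all.
    """
    for raw in config_text.splitlines():
        line = raw.strip()
        if line == f"# {option} is not set":
            return "n"
        if line.startswith(f"{option}="):
            return line.split("=", 1)[1]
    return None


# Made-up sample for illustration, not taken from any real kernel build.
SAMPLE = """\
CONFIG_X86_32=y
# CONFIG_VM86 is not set
CONFIG_MODIFY_LDT_SYSCALL=y
"""

if __name__ == "__main__":
    for opt in ("CONFIG_VM86", "CONFIG_MODIFY_LDT_SYSCALL"):
        print(opt, "->", kconfig_state(SAMPLE, opt))
```

This only reports the compile-time setting; it does not tell you anything about run-time restrictions on the syscall.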

Reiterating what I've said before, albeit paraphrased:
1. If you can call code, there is a possibility that you can exploit it.
2. Just because there are no publicly documented exploits for something does not mean that it is secure.
3. Having functionality enabled by default that you don't need is a Very Bad Thing, this is why Windows has historically had so many security issues.
4. Reactive security is utterly useless for any system that has already been exploited. If you have been hacked by someone who actually knows what they are doing, then even your hardware is suspect at that point, and patching the initial entry point will not provide any reasonable degree of safety.

Also, this will be the last reply I make on this sub-thread, if this does not convince you of any of the points I've made, then nothing I can say is likely to
It's not that I disagree with this e-mail of yours.
I actually agree with most of it. But we see the problem
from different sides. I care about my users, who will be
presented with a strange knob, enabling which will allow
them to run dosemu, but will "turn their OS into an attack
surface". What will they choose? I care. But you care about
your users, so you do not share my concerns. When the
user is asked to accept some risk, IMHO he needs some
description of this risk. And he needs to know that this
risk was at least minimized as far as possible by the developers.
If he knows otherwise, he'll never accept the risk and
never enable vm86(). This is what is important to me, but
you have different objectives, so for you this is just a minor
detail.