Re: [c.bailiff@E-SECURE.COM.AU: Re: IIS %c1%1c remote command execution]

From: Andries Brouwer (aeb@veritas.com)
Date: Thu Oct 19 2000 - 16:24:29 EST


On Thu, Oct 19, 2000 at 01:15:06PM -0400, Michael H. Warfield wrote:

> This is being forwarded from BugTraq where there is an ongoing
> discussion over a security hole in IIS based on it's unicode decoder.
> This particular individual is stating that several unicode decoders,
> including the one in the Linux unicode_console driver, have failed to
> adhere to certain security warnings in some RFCs. This IMPLIES that
> there is a potential security problem in all of them.
>
> Anyone familiar with this who can comment on the problem in the
> Linux unicode_console driver?

There is no problem.

More in detail:

UTF8 is a multi-byte variable length encoding.
Unicode is a 16-bit (and ISO-10646 a 31-bit) encoding of the same symbols.
It is not unusual to want to convert UTF8 to Unicode or vice versa.
There is a very simple straightforward algorithm, and it has the
property that several UTF8 strings will map to the same Unicode symbol.
It is like using 8-bit ASCII-possibly-with-parity code and converting it
to ASCII by removing the high order bit.

Such a many-to-one map can be a security risk.
If you want to test a string to make sure that it does not
contain a newline symbol or semicolon or so, and do that
by testing the string before it is converted, then you
may overlook codes that turn into the looked-for thing
after conversion.

Of course this is a triviality, and nobody will make this mistake
when the conversion does significant things to the input.
But when the input leaves ASCII unchanged (as is true both
in my masking example and in UTF8->Unicode), then people might
forget that non-ASCII may turn into ASCII.

What does this mean for the kernel?
If your mother builds a device and inserts it into the keyboard cable,
where the device must make sure that you do not type swearwords,
but she forgot about Unicode, then you can put the console keyboard
into Unicode mode and type bad text that passes through your mothers
device unnoticed.

Not many Linux users have such devices in their keyboard cable,
and utilities reading from the keyboard must therefore be prepared
for arbitrary input. It is not really meaningful to make the
kernel UTF8 to Unicode parser beep if you do not type the shortest
representation of a symbol. (On the other hand, one might argue
that in order to protect lazy programmers who copy the kernel
conversion code, the kernel conversion code had better be strict.)

Andries
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon Oct 23 2000 - 21:00:16 EST