[PATCH] nls: remove "support" for 5- and 6-byte UTF-8 sequences

From: Alexey Dobriyan
Date: Mon Jul 28 2014 - 17:26:56 EST


5- and 6-byte UTF-8 sequences existed to accommodate code points starting
from U+200000.

Those code points couldn't fit into the old 2-byte wchar_t type.

Commit 74675a58507e769beee7d949dbed788af3c4139d added 16+ bit Unicode support
and (correctly) started checking for the maximum valid code point,
which is U+10FFFF.

U+10FFFF encodes into just a 4-byte sequence.

5- and 6-byte UTF-8 sequences were made illegal long ago in
the Unicode standard and never had a chance to be used in Linux,
so put them out of their misery.
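
To illustrate the claim about U+10FFFF, here is a minimal standalone
sketch in userspace C. It is not the code in fs/nls/nls_base.c: the
names utf8_len() and utf8_encode() are hypothetical, and rejecting
surrogates is an assumption of this sketch. It only shows that every
code point up to the maximum fits in at most 4 bytes.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not kernel code): number of bytes needed to
 * encode code point cp as UTF-8, or 0 if cp is not encodable.
 * Assumption of this sketch: surrogates U+D800..U+DFFF are rejected.
 */
static int utf8_len(uint32_t cp)
{
	if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
		return 0;
	if (cp < 0x80)
		return 1;
	if (cp < 0x800)
		return 2;
	if (cp < 0x10000)
		return 3;
	return 4;	/* 0x10000..0x10FFFF, nothing needs more */
}

/*
 * Hypothetical helper: encode cp into out (at least 4 bytes) and
 * return the number of bytes written, or 0 if cp is not encodable.
 */
static int utf8_encode(uint32_t cp, unsigned char *out)
{
	int len = utf8_len(cp);

	switch (len) {
	case 1:
		out[0] = (unsigned char)cp;
		break;
	case 2:
		out[0] = (unsigned char)(0xC0 | (cp >> 6));
		out[1] = (unsigned char)(0x80 | (cp & 0x3F));
		break;
	case 3:
		out[0] = (unsigned char)(0xE0 | (cp >> 12));
		out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[2] = (unsigned char)(0x80 | (cp & 0x3F));
		break;
	case 4:
		out[0] = (unsigned char)(0xF0 | (cp >> 18));
		out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
		out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[3] = (unsigned char)(0x80 | (cp & 0x3F));
		break;
	default:
		break;	/* len == 0: nothing written */
	}
	return len;
}
```

With these helpers, utf8_encode(0x10FFFF, buf) returns 4 and fills buf
with F4 8F BF BF, while utf8_len(0x110000) returns 0. A 5- or 6-byte
lead byte (0xF8..0xFD) is never produced.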

Signed-off-by: Alexey Dobriyan <adobriyan@xxxxxxxxx>
---

fs/nls/nls_base.c | 2 --
1 file changed, 2 deletions(-)

--- a/fs/nls/nls_base.c
+++ b/fs/nls/nls_base.c
@@ -39,8 +39,6 @@ static const struct utf8_table utf8_table[] =
{0xE0, 0xC0, 1*6, 0x7FF, 0x80, /* 2 byte sequence */},
{0xF0, 0xE0, 2*6, 0xFFFF, 0x800, /* 3 byte sequence */},
{0xF8, 0xF0, 3*6, 0x1FFFFF, 0x10000, /* 4 byte sequence */},
- {0xFC, 0xF8, 4*6, 0x3FFFFFF, 0x200000, /* 5 byte sequence */},
- {0xFE, 0xFC, 5*6, 0x7FFFFFFF, 0x4000000, /* 6 byte sequence */},
{0, /* end of table */}
};
