Re: [PATCH] console UTF-8 fixes

From: Pavel Machek
Date: Wed Apr 11 2007 - 15:48:55 EST


Hi!

> I hope you like it. :)

Well, more or less... but you need a Signed-off-by: line, and

> @@ -70,6 +70,16 @@
> * malformed UTF sequences represented as sequences of replacement glyphs,
> * original codes or '?' as a last resort if replacement glyph is undefined
> * by Adam Tla/lka <atlka@xxxxxxxxx>, Aug 2006
> + *
> + * More robust UTF-8 decoder. Make it work on malformed sequences as Markus Kuhn's
> + * UTF-8 decoder stress test suggests. Emit a U+FFFD on illegal sequences as well
> + * as for invalid Unicode code points.
> + * If U+FFFD is not available in the font, print an inverse question mark instead.
> + * Display an inverted dot for valid characters that are not available in the font.
> + * Do not print zero-width characters, pad double-width characters with an extra
> + * space so that the cursor moves by zero/two positions in these cases.
> + * 6 April 2007, Egmont Koblinger <egmont@xxxxxxxxxxx>,
> + * using Markus Kuhn's wcwidth() implementation.
> */

We no longer put changelogs in code.
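
To make the behaviour described in that comment concrete, here is a
minimal userspace sketch of a decoder that returns U+FFFD for malformed
sequences and for invalid code points. It is not the code from the patch:
the name utf8_decode(), its signature, and the choice to consume only the
bytes read so far on a truncated sequence are all assumptions made for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT 0xFFFD	/* U+FFFD REPLACEMENT CHARACTER */

/*
 * Decode one code point from s[0..len). The number of bytes consumed is
 * stored in *used. Returns the code point, or U+FFFD for a malformed
 * sequence, an overlong encoding, a surrogate, or a value > U+10FFFF.
 * (Hypothetical helper, not taken from the patch.)
 */
static uint32_t utf8_decode(const unsigned char *s, size_t len, size_t *used)
{
	uint32_t cp, min;
	size_t need, i;

	assert(s != NULL && used != NULL);

	*used = 0;
	if (len == 0)
		return REPLACEMENT;

	*used = 1;
	if (s[0] < 0x80)
		return s[0];

	if ((s[0] & 0xE0) == 0xC0) {
		need = 1;
		cp = s[0] & 0x1F;
		min = 0x80;
	} else if ((s[0] & 0xF0) == 0xE0) {
		need = 2;
		cp = s[0] & 0x0F;
		min = 0x800;
	} else if ((s[0] & 0xF8) == 0xF0) {
		need = 3;
		cp = s[0] & 0x07;
		min = 0x10000;
	} else {
		/* stray continuation byte or 0xF8..0xFF */
		return REPLACEMENT;
	}

	for (i = 1; i <= need; i++) {
		if (i >= len || (s[i] & 0xC0) != 0x80) {
			*used = i;	/* assumption: resync at the bad byte */
			return REPLACEMENT;
		}
		cp = (cp << 6) | (s[i] & 0x3F);
	}

	*used = need + 1;
	/* reject overlong forms, surrogates and out-of-range values */
	if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
		return REPLACEMENT;

	return cp;
}
```

A caller would loop over the buffer, advancing by *used each time, so a
bad byte costs one replacement glyph and decoding resumes right after it.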

> +/* wcwidth() based on the implementation by
> + * Markus Kuhn -- 2003-05-20 (Unicode 4.0)
> + * Latest version: http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
> + */
> +struct interval {
> + int first;
> + int last;
> +};
> +
> +static int bisearch(long ucs, const struct interval *table, int max) {
> + int min = 0;
> + int mid;
> +
> + if (ucs < table[0].first || ucs > table[max].last)
> + return 0;

...and you really need to read Documentation/CodingStyle.

> + while (max >= min) {
> + mid = (min + max) / 2;
> + if (ucs > table[mid].last)
> + min = mid + 1;
> + else if (ucs < table[mid].first)
> + max = mid - 1;
> + else
> + return 1;
> + }
> +
> + return 0;
> +}

(Don't we already have rbtrees handling this just fine?)
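
For illustration, the same interval lookup laid out in kernel style
(function brace on its own line, tab indentation) could look roughly like
the userspace sketch below. Passing an element count instead of the last
index, the size_t types, and the sample table are assumptions of this
sketch, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Closed range of code points [first, last]. */
struct interval {
	int first;
	int last;
};

/*
 * Return 1 if ucs falls inside one of the n sorted, non-overlapping
 * intervals in table, 0 otherwise.
 */
static int bisearch(long ucs, const struct interval *table, size_t n)
{
	size_t lo = 0;
	size_t hi = n;	/* search the half-open range [lo, hi) */

	assert(table != NULL || n == 0);

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ucs > table[mid].last)
			lo = mid + 1;
		else if (ucs < table[mid].first)
			hi = mid;
		else
			return 1;
	}

	return 0;
}

/* Hypothetical sample table, for illustration only. */
static const struct interval sample[] = {
	{ 0x0300, 0x036F },
	{ 0x200B, 0x200F },
	{ 0xFE00, 0xFE0F },
};
```

With this table, bisearch(0x0301, sample, 3) finds a match and
bisearch(0x0041, sample, 3) does not. The sketch only restyles the array
search; whether an rbtree would serve better is a separate question.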

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html