Re: UTF-8 and case-insensitivity

From: tridge
Date: Wed Feb 18 2004 - 06:46:11 EST


Robin,

> I've read it also:
> http://www.microsoft.com/globaldev/getwr/steps/wrg_unicode.mspx
> "The fundamental representation of text in Windows NT-based
> operating systems is UTF-16"

yep, in this thread I've been mistakenly using the term UCS-16 when I
should have said UTF-16 (ie. the variable length, 2 byte encoding).

Samba currently treats the bytes on the wire from windows as UCS-2 (a
2 byte fixed width encoding), whereas perhaps it should be treating
them as UTF-16. I should write a smbtorture test to detect the
difference and see what different versions of windows actually use.

luckily the new charset handling stuff in samba3 and samba4 will make
this easy to fix :-)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/