[review] fat/nls: Fix handling of utf8 invalid char

From: OGAWA Hirofumi
Date: Sat Aug 01 2009 - 15:19:50 EST


Hi,

I'm going to send this patch at next merge window. This is including the
change of fs/nls_base.c, it's trivial though. And it seems you changed
the fs/nls_base.c last time.

So, to make sure, could you review this if you have time?

Thanks.
--
OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx>

commit 67638e4043083cdc6f10386a75fef87ba46eecb3
Author: OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx>
Date: Sat Aug 1 21:30:31 2009 +0900

fat/nls: Fix handling of utf8 invalid char

With utf8 option, vfat allowed the duplicated filenames.

Normal nls returns -EINVAL for invalid char. But utf8s_to_utf16s()
skipped the invalid char historically.

So, this changes the utf8s_to_utf16s() directly to return -EINVAL for
invalid char, because vfat is only user of it.

mkdir /mnt/fatfs
FILENAME=`echo -ne "invalidutf8char_\\0341_endofchar"`
echo "Using filename: $FILENAME"
dd if=/dev/zero of=fatfs bs=512 count=128
mkdosfs -F 32 fatfs
mount -o loop,utf8 fatfs /mnt/fatfs
touch "/mnt/fatfs/$FILENAME"
umount /mnt/fatfs
mount -o loop,utf8 fatfs /mnt/fatfs
touch "/mnt/fatfs/$FILENAME"
ls -l /mnt/fatfs
umount /mnt/fatfs

---- And the output is:

Using filename: invalidutf8char_\0341_endofchar
128+0 records in
128+0 records out
65536 bytes (66 kB) copied, 0.000388118 s, 169 MB/s
mkdosfs 2.11 (12 Mar 2005)
total 0
-rwxr-xr-x 1 root root 0 Jun 28 19:46 invalidutf8char__endofchar
-rwxr-xr-x 1 root root 0 Jun 28 19:46 invalidutf8char__endofchar

Tested-by: Marton Balint <cus@xxxxxxxxxx>
Signed-off-by: OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx>

diff --git a/fs/fat/namei_vfat.c b/fs/fat/namei_vfat.c
index cb6e835..f565f24 100644
--- a/fs/fat/namei_vfat.c
+++ b/fs/fat/namei_vfat.c
@@ -499,17 +499,10 @@ xlate_to_uni(const unsigned char *name, int len, unsigned char *outname,
int charlen;

if (utf8) {
- int name_len = strlen(name);
-
- *outlen = utf8s_to_utf16s(name, PATH_MAX, (wchar_t *) outname);
-
- /*
- * We stripped '.'s before and set len appropriately,
- * but utf8s_to_utf16s doesn't care about len
- */
- *outlen -= (name_len - len);
-
- if (*outlen > 255)
+ *outlen = utf8s_to_utf16s(name, len, (wchar_t *)outname);
+ if (*outlen < 0)
+ return *outlen;
+ else if (*outlen > 255)
return -ENAMETOOLONG;

op = &outname[*outlen * sizeof(wchar_t)];
diff --git a/fs/nls/nls_base.c b/fs/nls/nls_base.c
index 477d37d..b25c218 100644
--- a/fs/nls/nls_base.c
+++ b/fs/nls/nls_base.c
@@ -124,10 +124,10 @@ int utf8s_to_utf16s(const u8 *s, int len, wchar_t *pwcs)
while (*s && len > 0) {
if (*s & 0x80) {
size = utf8_to_utf32(s, len, &u);
- if (size < 0) {
- /* Ignore character and move on */
- size = 1;
- } else if (u >= PLANE_SIZE) {
+ if (size < 0)
+ return -EINVAL;
+
+ if (u >= PLANE_SIZE) {
u -= PLANE_SIZE;
*op++ = (wchar_t) (SURROGATE_PAIR |
((u >> 10) & SURROGATE_BITS));
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/