Re: A Great Idea (tm) about reimplementing NLS.

From: Lennart Sorensen
Date: Thu Jun 16 2005 - 08:40:45 EST

On Thu, Jun 16, 2005 at 01:34:13AM +0200, Lukasz Stelmach wrote:
> That's why UTF-8 is suggested. UTF-8 was developed to "fool" software
> that need not be aware of the Unicode-ness of the text it manages, so
> that it handles the text without any hiccups *and* the text still
> carries the information about multibyte characters. What characters
> exactly do you mean? NUL? There is no NUL byte in any UTF-8 string
> except the one which terminates it.

That is true. UTF-8 wouldn't cause any more problems than ASCII already
does, such as some filesystems not allowing : and * (among other
characters) in filenames.
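
To make the point concrete, here is a minimal Python sketch (the sample
name and both helper functions are made up for illustration). It checks
that a UTF-8 encoded name contains a NUL byte only if the text itself
contains U+0000, and that the only bytes below 0x80 are the real ASCII
characters, so : and * are exactly as troublesome as they are in plain
ASCII.

```python
def utf8_is_c_string_safe(name: str) -> bool:
    """Return True if the UTF-8 encoding of name contains no NUL byte."""
    return 0 not in name.encode("utf-8")


def ascii_bytes_in(name: str) -> bytes:
    """Return only the bytes below 0x80 from the UTF-8 encoding of name."""
    return bytes(b for b in name.encode("utf-8") if b < 0x80)


# Hypothetical filename mixing Polish letters with ':' and '*'.
sample = "zażółć:gęślą*jaźń"

# Multibyte sequences never produce a zero byte ...
print(utf8_is_c_string_safe(sample))
# ... unless the text really contains U+0000.
print(utf8_is_c_string_safe("a\x00b"))
# Bytes below 0x80 come only from genuine ASCII characters.
print(ascii_bytes_in(sample))
```

Every byte of a multibyte UTF-8 sequence has its high bit set, which is
why byte-oriented code that only looks for '/', ':' or the terminating
NUL keeps working unchanged.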

> Yes, it uses Unicode for long names, and DOS codepages for short ones.
> To prove this, take a vfat floppy and mount it. touch(1) a file on it
> whose name has some non-latin1 characters. Unmount the floppy, then do
> dd if=/dev/fd0 of=/tmp/floppy bs=1024 count=512. Once it's done, take
> some hex editor/viewer and look for the latin1-compliant part of the
> filename in the floppy file (search for the uppercase string). Right
> above the short filename you'll find the multibyte long one.

Well, at least it seems they did something right when they extended FAT
with VFAT. That doesn't make FAT a good filesystem, but it does make the
long filename extension pretty nice, as much of an ugly hack as it is.
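
For anyone who would rather script the hex-editor step, here is a rough
Python sketch of it. It assumes the usual VFAT long-name layout: 32-byte
directory slots marked with attribute 0x0F, each holding 13 UTF-16LE code
units at byte offsets 1-10, 14-25 and 28-31, stored in reverse order
directly before the short entry. The file name, the 8.3 alias and the
helper names are invented for the example, and the checksum byte is left
at zero, so this is an illustration rather than a valid directory.

```python
LFN_ATTR = 0x0F   # attribute byte that marks a long-name slot
ENTRY_SIZE = 32   # every FAT directory entry is 32 bytes


def lfn_slot(seq: int, chunk: str, last: bool) -> bytes:
    """Build one long-name slot for up to 13 characters (checksum left 0)."""
    units = chunk.encode("utf-16-le")
    if len(units) < 26:
        units += b"\x00\x00"              # terminator after the final char
        units = units.ljust(26, b"\xff")  # unused code units are 0xFFFF
    first = seq | (0x40 if last else 0)   # 0x40 flags the last logical slot
    return (bytes([first]) + units[0:10] + bytes([LFN_ATTR, 0, 0])
            + units[10:22] + b"\x00\x00" + units[22:26])


def long_name_before(image: bytes, short_off: int) -> str:
    """Walk backwards from the short entry and join the long-name slots."""
    parts = []
    off = short_off - ENTRY_SIZE
    while off >= 0 and image[off + 11] == LFN_ATTR:
        slot = image[off:off + ENTRY_SIZE]
        parts.append(slot[1:11] + slot[14:26] + slot[28:32])
        if slot[0] & 0x40:                # reached the last logical slot
            break
        off -= ENTRY_SIZE
    text = b"".join(parts).decode("utf-16-le")
    return text.split("\x00", 1)[0]       # drop terminator and padding


# Build a tiny synthetic "image" instead of reading /tmp/floppy.
name = "zażółć gęślą jaźń.txt"            # hypothetical long name
short = b"ZAOC~1  TXT"                    # hypothetical 8.3 alias
chunks = [name[i:i + 13] for i in range(0, len(name), 13)]
slots = [lfn_slot(n + 1, c, n == len(chunks) - 1)
         for n, c in enumerate(chunks)]
directory = b"".join(reversed(slots)) + short + bytes([0x20]) + bytes(20)
image = bytes(64) + directory

# Same idea as searching for the uppercase string in a hex viewer.
recovered = long_name_before(image, image.find(short))
print(recovered)
```

On a real image you would read /tmp/floppy into `image` and search for
the alias the driver actually generated. A plain byte search can also hit
file data, so the match should be checked to sit on a 32-byte entry
boundary inside a directory.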

> I've tried CD packet writing with UDF and it gives an insane overhead
> of about 20%. What metadata would you like to store, for example, on
> your flash drive or a floppy disk?

The constant rewriting of the same sectors that store the FAT is really
bad for some types of flash and other removable media (like DVD-RAM).

I hadn't noticed any big overhead in UDF, although packet writing may
add some overhead itself (I never used packet writing).

Len Sorensen