Re: NTFS-like streams?

From: Rogier Wolff (R.E.Wolff@BitWizard.nl)
Date: Mon Aug 14 2000 - 17:37:19 EST


Michael Rothwell wrote:
> Rogier Wolff wrote:
> >
> > Michael Rothwell wrote:
> > > Rogier Wolff wrote:
> > >
> > > > However, when
> > > >
> > > > myfile
> > > >
> > > > starts to refer both to a file and a directory, that's when I get
> > > > upset.
> > >
> > >
> > > It's not a file and a directory. It's a file with named streams.
> >
> > Fine. Then it's a file with named streams.
> >
> > If both
> >
> > myfile
> >
> > and
> >
> > myfile/resource
> >
> > become valid strings to pass to the "open" syscall then I get upset.
>
> Why? Programs that don't use streams won't ever try to do it, and
> won't even query to see if there _are_ streams on a file.

It's important that non-aware tools continue to be non-aware, but do
the "right thing" as much as possible.

With a friend I just discussed that

        word myfile.doc

is supposed to work. Even if the stupid word-for-mac put a resource
file on the document. But we'd also like

        cp myfile.doc mybackupfile.doc

to copy both the file and its resource fork. This is definitively not
possible. We're agreed that the first is a bit more important.

> > Fine now I'm using your terminology. Does that help you understand
> > what I'm trying to say? I think you're just playing dumb. That doesn't
> > help a discussion.

> I'm not playing dumb. It seems like people making the "implement as
> directories" arguments actually don't want any streams support at
> all. I'm arguing honestly: I want streams support for filesystems
> that use them. I don't want to break existing programs to get it. I
> think a namespace extension is the proper way to do it. I _think_ I
> understand what you're trying to say, and it's this:
 
> "streams are stupid and I don't want linux to support them"

No. I'm not trying to say that.

What I'm saying is that we've made dos' 11 character filenames
representable in Unix. In an intuitive way (in fact very similar to
what DOS does itself). "AUTOEXECBAT" becomes "autoexec.bat" (12
chars. Notice?). We've even turned the slash around, because that maps
easier on the way Unix guys would think of directories. There is
nothing wrong with C:\WINDOWS\WIN.INI, we just say
/w95/windows/win.ini .
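
As an illustration of that translation, here is a minimal sketch that turns the 11 bytes stored on disk into the name the user sees. The function name dos_name_to_unix is made up for this example and is not the kernel's msdos code. It assumes the usual layout of 8 name bytes followed by 3 extension bytes, both padded with spaces, and it ignores the other special cases a real FAT driver has to handle.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper (not the kernel's msdos code): convert the raw
 * 11-byte, space-padded 8+3 name of a DOS directory entry into the
 * "name.ext" form shown to the user. `out` must hold at least 13 bytes
 * (8 + '.' + 3 + NUL).
 */
void dos_name_to_unix(const unsigned char raw[11], char out[13])
{
    size_t n = 0;
    int base_len = 8;
    int ext_len = 3;
    int i;

    /* Trailing spaces are padding, not part of the name. */
    while (base_len > 0 && raw[base_len - 1] == ' ')
        base_len--;
    while (ext_len > 0 && raw[8 + ext_len - 1] == ' ')
        ext_len--;

    for (i = 0; i < base_len; i++)
        out[n++] = (char)tolower(raw[i]);

    if (ext_len > 0) {
        out[n++] = '.';         /* the dot is implied on disk */
        for (i = 0; i < ext_len; i++)
            out[n++] = (char)tolower(raw[8 + i]);
    }
    out[n] = '\0';
}
```

Fed the 11 bytes "AUTOEXECBAT", this produces the 12-character "autoexec.bat". An entry with an all-blank extension comes out without a dot.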

Fact is, that we've thought about it a bit, and now normal Unix
utilities like "tar" and "cp" work on an MSDOS filesystem.

There is a translation involved from what's on the disk, and what we
present to the user. In the case at hand I want to suggest that we
keep the "what we present to the user" reasonably unix-like.

So:
        
drwxrwxr-x 2 wolff users 1024 Aug 14 23:55 myfile/
-rw-rw-r-- 1 wolff users 8 Aug 14 23:56 myfile/default
-rw-rw-r-- 1 wolff users 16335 Aug 14 23:56 myfile/logo.gif

is a good way to show a file with named streams to unix-users. Tar
understands what to do with this. If you shove this onto an ext2fs,
you won't notice the difference. Good.

If you add the convenience that

        open ("myfile", O_RDONLY);

doesn't return EISDIR, but just opens myfile/default instead, then the
"you get the datafork if you forget about the named streams" case also
works.
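
The fallback can be sketched in userspace, although the proposal itself is about what the filesystem does inside the kernel. The helper name open_default_stream is hypothetical, and "default" is simply the entry name used in the listing above. On Linux a read-only open() of a directory succeeds and it is the later read() that fails with EISDIR, so this sketch checks the file type with fstat() rather than looking at errno.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical userspace helper: open `path` read-only; if it turns out
 * to be a directory, open the "default" entry inside it instead.
 * Returns a file descriptor, or -1 with errno set.
 */
int open_default_stream(const char *path)
{
    struct stat st;
    int fd, stream_fd, saved_errno;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    if (fstat(fd, &st) < 0) {
        saved_errno = errno;
        close(fd);
        errno = saved_errno;
        return -1;
    }

    if (!S_ISDIR(st.st_mode))
        return fd;              /* plain file: nothing special to do */

    /* Resolve "default" relative to the directory we already hold. */
    stream_fd = openat(fd, "default", O_RDONLY);
    saved_errno = errno;
    close(fd);
    errno = saved_errno;
    return stream_fd;
}
```

A stream-unaware program calling this on "myfile" would get the data fork whether myfile is a plain file or a directory of streams.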

You can also do:

-rw-rw-r-- 1 wolff users 8 Aug 14 23:56 myfile
-rw-rw-r-- 1 wolff users 16335 Aug 14 23:56 myfile:logo.gif

Tar still works. You're in a bit of trouble when the filesystem holds
a filename that contains a ":". These will be scarce in both Apple and NT
filesystems if I'm not mistaken. Anyway, if you DO encounter them, I'd
suggest that the file that apple would call

        myfile:backup
        6d 79 66 69 6c 65 3a 62 61 63 6b 75 70

we'd see as:

        myfile%3abackup
        6d 79 66 69 6c 65 25 33 61 62 61 63 6b 75 70

and the apple file

        thesis 100% done

would become

        thesis 100%25 done
        74 68 65 73 69 73 20 31 30 30 25 32 35 20 64 6f 6e 65

Slightly ugly, but fortunately, cp, tar, the shell, every Unix utility
knows how to handle filenames that have odd characters in them.
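
A minimal sketch of such a mangling step, assuming ":" is the stream separator and "%" is the escape character. The function name mangle_name and its buffer interface are invented for this example. Each escaped byte becomes "%" followed by its two-digit hex value, so ":" (0x3a) becomes "%3a" and "%" itself (0x25) becomes "%25", which keeps the mapping reversible.

```c
#include <stddef.h>

/*
 * Hypothetical helper: copy `in` to `out`, replacing ':' and '%' with
 * '%' followed by the byte's two-digit lowercase hex value.
 * Returns 0 on success, -1 if `out` (of size `outsz`) is too small.
 */
int mangle_name(const char *in, char *out, size_t outsz)
{
    static const char hex[] = "0123456789abcdef";
    size_t n = 0;

    for (; *in != '\0'; in++) {
        unsigned char c = (unsigned char)*in;

        if (c == ':' || c == '%') {
            /* Need room for three bytes plus the final NUL. */
            if (n + 3 >= outsz)
                return -1;
            out[n++] = '%';
            out[n++] = hex[c >> 4];
            out[n++] = hex[c & 0x0f];
        } else {
            /* Need room for one byte plus the final NUL. */
            if (n + 1 >= outsz)
                return -1;
            out[n++] = (char)c;
        }
    }

    if (n >= outsz)
        return -1;
    out[n] = '\0';
    return 0;
}
```

With this, "myfile:backup" comes out as "myfile%3abackup". The reverse mapping only has to look for "%" followed by two hex digits.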

I'm also cool with:

lrwxrwxrwx 1 wolff users 13 Aug 15 00:07 my_file -> .my_file_data
-rw-rw-r-- 1 wolff users 8 Aug 14 23:56 .my_file_data
-rw-rw-r-- 1 wolff users 16335 Aug 14 23:56 .my_file_logo.gif

Now you may have to "mangle" filenames that start with a ".". If you
think that's more or less likely than "%" or ":" in a filename, that
can be a valid reason to choose one or the other method.

There are tons of ways of representing the different forks in a way
that normal unix-utilities don't notice anything "odd".

But

-rw-rw-r-- 1 wolff users 8 Aug 14 23:56 myfile
drwxrwxr-x 2 wolff users 1024 Aug 14 23:55 myfile/
-rw-rw-r-- 1 wolff users 16335 Aug 14 23:56 myfile/logo.gif

is a good way to break way too many unix-semantics-expecting programs.

Linus, you can change the world. You have. But whenever possible, try
to map the "named streams" onto something that looks sensible to
classical Unix utilities. We can change "tar". You're losing the
"logo.gif" file if you untar this onto a non-named-streams supporting
filesystem.

Why is the Unix paradigm "everything is a stream of bytes" so good?
cp doesn't know whether it's copying ascii or binary data. cp doesn't
know whether it's copying from a file or a device. So with a few tools
that know how to handle "byte streams" you can make very powerful
scripts.

Similarly we've taught "tar" and "cp -r" how files and directories
work. I think we shouldn't yank the assumptions that they are making
from under them. Especially when there are workable alternatives.
So, I'm saying that "everything is a bytestream" should be expanded
to "everything is a normal directory hierarchy".

                                Roger.

-- 
** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
*       Common sense is the collection of                                *
******  prejudices acquired by age eighteen.   -- Albert Einstein ********
