Re: Extensions to HFS filesystem

Paul H. Hargrove (hargrove@sccm.stanford.edu)
Mon, 29 Apr 1996 18:06:52 -0700 (PDT)


Raul Miller <rdm@tad.micro.umn.edu> wrote:
> Hi, I've tried to reverse-engineer UMSDOS on several occasions (and
> have always given up for lack of time). Here's a few leading
> questions from my point of view:
>
> How does HFS currently deal with multiple files with the same CNID?
The CNID is unique. HFS stores the location of the first
three contiguous pieces of the file in the directory entry and the
remainder of the allocation information in a B-tree sorted on a key
that had the CNID as the most significant part. Multiple directory
entries witht he same CNID would thus share part of the file contents
automatically but great expense would be required to keep the
allocation info in the directory entries consistent (not to mention
the fact that a genuine Mac wouldn't keep the info up-to-date and
something like Disk First Aid would complain.) The short answer is
that multiple entries w/ the same CNID are illegal.

> How does HFS deal with a file which is in use by a program, but has
> been removed from the file system?
I handle this pretty much the same way as a native Linux
filesystem. When the file is unlink()ed the directory entry is
removed from disk and the i_nlink field of the in-core inode is set to
zero (it must have been 1 before, and yes I do check that it is).
When the file is no longer in use the put_inode() callback for HFS
truncates the file to zero length, releasing the disk blocks that were
in use and removing the allocation information from the B-tree. There
is no messy hidden file or anything like that.

> If both of these are illegal under HFS, consider a special folder for
> files that don't have the proper number of references. For the case
> of multiple (hard) links, represent them using some reference into
> that directory.
I don't see that this is any better than the UMSDOS trick of
having hardlinks be symlinks to a hidden file. The potential problems
with a directory moving are reduced by keeping the files all in one
location, but I still don't like the idea of the hidden file.

> If searching on a folder is linear time (is it?), consider
> implementing something like a b-tree (or even just a hash table) using
> nested folders, as this special folder might become quite large under
> some circumstances.
Since HFS directory entries are stored in a B-tree the time to
find a file given the CNID of the directory containing it and its name
is O(log(N)) where N is the total number of files and directories on
the disk. Finding the Mth entry in a directory, however, is O(M).

> --
> Raul
----
Paul H. Hargrove All material not otherwise attributed
hargrove@sccm.stanford.edu is the opinion of the author or a typo.