Re: ext2 question

Theodore Y. Ts'o (tytso@MIT.EDU)
Tue, 18 May 1999 12:44:50 -0400 (EDT)


Date: Tue, 18 May 1999 16:21:39 +0200
From: Manfred Spraul <manfreds@colorfullife.com>

But nevertheless, I have some more ideas:
> If you do locking right with a B-tree, you don't need to serialize
> access. I've done a lot of thinking about how to handle things so that
> you at most need to have three locks at any particular time in the
> tree.

Unfortunately, the '95 specs clearly state that you must _not_ wait
during mmap() sector reads. You're only allowed to wait for the sector
access, and that call must be flagged as 'high priority', 'bypass queue'.

Huh? Can you give me a pointer to the Unix 95 specs which you're
referring to? Even today there will be corner cases where mmap may have
to block while a truncate is happening on a file, for example. In
general API specs don't impose that kind of implementation constraint, so
I'm rather surprised that Unix 95 would say anything about that.

In real life, I doubt anyone would really notice, since it's not like
any of these locks would need to be held for a long time, and it's
pretty much indistinguishable from needing to block when you need to
read in the triply indirect block, followed by blocking as you read in
the doubly-indirect block, followed by blocking as you read in the
indirect block, etc. (And as I mentioned, there will be cases where
these blocks might be locked as they're being written out by the RAID
driver, for example.)
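
To make the chain of blocking reads concrete, here is a minimal sketch
in C of how many indirect blocks sit between the inode and a given data
block. It assumes the classic ext2 layout of 12 direct pointers followed
by one indirect, one doubly-indirect and one triply-indirect pointer,
with 4-byte block pointers. The function name and the `ppb` parameter
are made up for illustration and are not taken from the kernel source.

```c
#include <stdint.h>

#define NDIR_BLOCKS 12  /* direct block pointers in an ext2 inode */

/*
 * Return how many indirect blocks must be read before the data block
 * at logical index `blk` can be located: 0 (direct) up to 3 (triply
 * indirect), or -1 if the index is beyond what the scheme can map.
 * `ppb` is the number of block pointers per block (block size / 4).
 */
int indirection_depth(uint64_t blk, uint64_t ppb)
{
    if (blk < NDIR_BLOCKS)
        return 0;
    blk -= NDIR_BLOCKS;

    if (blk < ppb)
        return 1;
    blk -= ppb;

    if (blk < ppb * ppb)
        return 2;
    blk -= ppb * ppb;

    if (blk < ppb * ppb * ppb)
        return 3;

    return -1;
}
```

Each level in the result is a metadata read that may block if the block
is not already cached, which is the same kind of wait a tree lock would
introduce.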

Is it really worth supporting extents for such files?
e.g. I thought about:
- most files are not sparse files.
- as long as you write sequentially, extents are used; as soon as you
start to jump, the number of sectors in the extents is stored in
the inode, and the remaining sectors are stored with the old system.
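
One possible reading of this hybrid scheme, sketched in C: the inode
records a single leading contiguous run, and any logical block past that
run falls through to the old indirect-block lookup. Every name here
(`struct hybrid_inode`, `hybrid_map`, the `legacy` callback) is
hypothetical; none of it is an existing ext2 structure or function.

```c
#include <stdint.h>

/* Hypothetical inode fields for the hybrid scheme. */
struct hybrid_inode {
    uint32_t extent_start;  /* first physical block of the contiguous run */
    uint32_t extent_len;    /* number of logical blocks the run covers */
};

/* Stand-in for the old indirect-block lookup; returns 0 for a hole. */
typedef uint32_t (*legacy_lookup_fn)(const struct hybrid_inode *ino,
                                     uint32_t logical);

/* Placeholder legacy lookup that treats every block as unmapped. */
uint32_t no_legacy_blocks(const struct hybrid_inode *ino, uint32_t logical)
{
    (void)ino;
    (void)logical;
    return 0;
}

/* Map a logical block: extent first, old system for everything else. */
uint32_t hybrid_map(const struct hybrid_inode *ino, uint32_t logical,
                    legacy_lookup_fn legacy)
{
    if (logical < ino->extent_len)
        return ino->extent_start + logical;
    return legacy(ino, logical);
}
```

The lookup itself is trivial; the hard part is what has to happen to the
on-disk structures when a write lands outside the run.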

You could do that, but then you have to deal with how you transition
between a non-sparse file and a sparse file. For example, what if an
ill-behaved program, at least at first, writes 15 megabytes
contiguously, and then starts writing random-access blocks past the 15
meg boundary? Now you need to somehow convert from an extent system to
an indirect block system, which means blocking as you frantically try to
create all of the necessary indirect and doubly indirect and triply
indirect blocks. It can be done, but it's not pretty.....
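
For a rough sense of scale, the sketch below counts the indirect blocks
that would have to be created to describe an already-written contiguous
run in the old scheme. It assumes the classic ext2 layout (12 direct
pointers, 4-byte block pointers) and a file that fits within the triply
indirect range; the function name is made up for illustration.

```c
#include <stdint.h>

#define NDIR_BLOCKS 12  /* direct block pointers in an ext2 inode */

static uint64_t div_round_up(uint64_t a, uint64_t b)
{
    return (a + b - 1) / b;
}

/*
 * Count the indirect, doubly indirect and triply indirect blocks needed
 * to map logical blocks 0 .. nblocks-1. `ppb` is pointers per block.
 */
uint64_t indirect_blocks_needed(uint64_t nblocks, uint64_t ppb)
{
    uint64_t meta = 0, left, n, leaves;

    if (nblocks <= NDIR_BLOCKS)
        return 0;
    left = nblocks - NDIR_BLOCKS;

    meta += 1;                              /* the single indirect block */
    if (left <= ppb)
        return meta;
    left -= ppb;

    n = left < ppb * ppb ? left : ppb * ppb;
    meta += 1 + div_round_up(n, ppb);       /* double indirect + its leaves */
    left -= n;
    if (left == 0)
        return meta;

    leaves = div_round_up(left, ppb);       /* lowest-level indirect blocks */
    meta += 1 + div_round_up(leaves, ppb) + leaves;
    return meta;
}
```

Taking 15 megabytes as 15 * 1024 * 1024 bytes, that is 15360 blocks at a
1 KiB block size, and `indirect_blocks_needed(15360, 256)` comes to 61
metadata blocks; at 4 KiB blocks, `indirect_blocks_needed(3840, 1024)`
comes to 5. All of them would have to be allocated and filled in while
the random write that triggered the conversion waits.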

Sure, these sorts of corner cases won't *normally* come up in real
life. But you can be sure that if we don't do something rational when
they come up, someone malicious --- say, a benchmarking company hired by
Microsoft --- will do it and embarrass us.

- Ted
