Re: Deletion of big files & internal data structure musings

David Kastrup (dak@neuroinformatik.ruhr-uni-bochum.de)
Tue, 18 May 1999 12:51:26 +0200 (MET DST)


I think this is just another example of a problem that is pervasive
in a lot of Linux internals: keeping data structures in good order.

In general, I would think it a good idea if the kernel were to use
some general data structures for searching purposes (maps from inode
to file, maps from memory address to segment, whatever). These search
structures usually become suboptimal when insertions and deletions
occur, and some cleanup becomes necessary. But the cleanup need not
occur synchronously, if things are done correctly. Usually some
suboptimality can be allowed to accumulate and be cleaned up silently
when a CPU would otherwise become idle. Only under continuous CPU
load might a situation appear where the amount of disorder makes a
forced synchronous cleanup necessary.
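
To make the idea concrete, here is a minimal sketch in Python (an
illustration only, not kernel code). The class LazySortedMap and its
max_disorder knob are names made up for this example. Deletions only
leave a tombstone; the compaction is a separate step that can run when
idle, and it is forced synchronously only when the share of tombstones
exceeds the threshold.

```python
import bisect


class LazySortedMap:
    """Sorted-key map whose deletions leave tombstones for later cleanup."""

    def __init__(self, max_disorder=0.5):
        self._keys = []      # sorted keys, including tombstoned ones
        self._values = {}    # live key -> value
        self._dead = set()   # keys deleted but still present in _keys
        self.max_disorder = max_disorder  # hypothetical tuning knob

    def insert(self, key, value):
        if key in self._dead:
            self._dead.discard(key)          # revive the slot in place
        elif key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def delete(self, key):
        # Immediate part: the key stops being visible right away.
        del self._values[key]
        self._dead.add(key)
        # Forced synchronous cleanup only once the disorder is too high.
        if self.disorder() > self.max_disorder:
            self.cleanup()

    def disorder(self):
        # Fraction of the key array occupied by tombstones.
        return len(self._dead) / len(self._keys) if self._keys else 0.0

    def cleanup(self):
        # Deferred part: compact the key array. Meant to be called when idle.
        self._keys = [k for k in self._keys if k not in self._dead]
        self._dead.clear()

    def range(self, lo, hi):
        # Searches keep working while tombstones are present.
        i = bisect.bisect_left(self._keys, lo)
        j = bisect.bisect_right(self._keys, hi)
        return [(k, self._values[k])
                for k in self._keys[i:j] if k not in self._dead]
```

Lookups pay a small price for skipping tombstones, which is the
"suboptimality" that is allowed to accumulate between cleanups.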

If done correctly, a single cleanup queue can cater for all
asynchronous data structure tuning. Any CPU, before becoming idle,
would first try to help out with cleanup tasks.
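
A single shared queue of this kind could look roughly like the
following Python sketch. CleanupQueue, defer and run_before_idle are
hypothetical names, and the "scheduler" that would call
run_before_idle is assumed, not shown. Each job is just a callable, so
any data structure can queue its own tuning step.

```python
from collections import deque


class CleanupQueue:
    """One shared queue of deferred tuning jobs for all data structures."""

    def __init__(self):
        self._jobs = deque()

    def defer(self, job):
        # job is any zero-argument callable, e.g. a rebalance or
        # compaction step of some search structure.
        self._jobs.append(job)

    def run_before_idle(self, budget):
        # To be called by a (hypothetical) scheduler just before a CPU
        # would go idle. Runs at most `budget` jobs so that newly
        # arriving work is not held up for long.
        done = 0
        while done < budget:
            try:
                job = self._jobs.popleft()
            except IndexError:
                break            # queue drained
            job()
            done += 1
        return done

    def pending(self):
        return len(self._jobs)
```

The budget keeps each cleanup pass short; whatever is left over simply
waits for the next idle moment.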

A similar approach could be followed with block
allocation/deallocation strategies on file systems, although the
details here will differ.

Interesting and appropriate data structures that facilitate parallel
multi-processor searches and asynchronous cleanups are described, for
example, in the papers

http://www-lsi.upc.es/dept/techreps/ps/R98-12.ps.gz

and

http://www-lsi.upc.es/dept/techreps/ps/R98-13.ps.gz

Of course, these ideas extend to a bit more than just deleting files.
The gist is clear: try to separate operations into things that need
to be done immediately, and things that can be done asynchronously
without incurring too large an overall penalty.

The deletion of files is only one operation where this is an obvious
advantage.

There is always the possibility of locking into synchronous mode for a
request when one fears that otherwise the disorder will become too
large, but it should probably not be the default for quite a few
operations.
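
For file deletion, the split between immediate and deferred work, plus
the synchronous fallback, can be sketched with a toy model in Python.
ToyFS and all its method names are invented for this illustration and
do not correspond to any real file system code. Removing the name is
the immediate part; returning the blocks to the free set is deferred,
unless the caller asks for sync or an allocation runs out of space.

```python
from collections import deque


class ToyFS:
    """Toy model: names map to block lists, freed blocks go to a free set."""

    def __init__(self, nblocks):
        self.free = set(range(nblocks))
        self.files = {}           # name -> list of block numbers
        self.pending = deque()    # blocks of deleted files awaiting release

    def create(self, name, nblocks):
        if nblocks > len(self.free):
            self.reclaim()        # forced synchronous cleanup under pressure
        if nblocks > len(self.free):
            raise OSError("no space left")
        blocks = [self.free.pop() for _ in range(nblocks)]
        self.files[name] = blocks
        return blocks

    def unlink(self, name, sync=False):
        # Immediate: the name disappears, so the file is gone for callers.
        blocks = self.files.pop(name)
        self.pending.extend(blocks)
        if sync:
            self.reclaim()        # the caller explicitly chose the slow path

    def reclaim(self, max_blocks=None):
        # Deferred: release queued blocks, optionally in small steps.
        n = len(self.pending)
        if max_blocks is not None:
            n = min(n, max_blocks)
        for _ in range(n):
            self.free.add(self.pending.popleft())
        return n
```

With this split, unlink on a big file returns after touching only the
directory entry, and reclaim(max_blocks) can be called in small steps
whenever there is nothing better to do.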

David Kastrup Phone: +49-234-700-5570
Email: dak@neuroinformatik.ruhr-uni-bochum.de Fax: +49-234-709-4209
Institut für Neuroinformatik, Universitätsstr. 150, 44780 Bochum, Germany
