If your environment is already glibc 2 based, there
the USER SPACE is already Large File compatible with
64-bit __off_t type, and wrapper functions over real
system calls using 32-bit "off_t".
Note that if you keep using some low-level syscalls
(open() comes to mind), then your code must be tweaked
slightly adding O_LARGEFILE flag.
Once the kernel is Large File compatible, it is mostly
a matter of updating the glibc backend interfaces to take
the advantage, while keeping the user-side view of things
intact.
--> should not be a USER problem if you already are
glibc 2.1 based (e.g. RedHat 6.0 as an example.)
( glibc version might be 2.2 at that time, though.. )
Now all that preceeding talk does not say anything about
current Linux filesystems - block indirection based schemes
alike all traditional UNIX fses have inherent metadata
access induced slowdowns at very large files. Also they
are limiting maximum filesizes at the filesystem - here
is a copy of a comment at one of my LFS test programs:
( Oh yes, the way Linus wanted to have the block index scaling
done, does mean that maximum supported filesize at 32-bit
system is "mere" 2G*512 = 1024 GB; an alternate is to scale
with system page size, which supports eight times higher
maximum offsets. The indices are signed entities to allow
an idea of cacheing i-node related metadata into negative
block indices. )
/*
* Execution of this program creates a large (looking) file at
* local system. The execution of this program stops at offset
* which is just beyond what the local filesystem can support.
*
* In case the local filesystem is EXT2 or UFS, the triply-
* indirection scheme can support up to following sizes per
* filesystem basic block size:
*
* 512 2 GB + epsilon
* 1k 16 GB + epsilon
* 2k 128 GB + epsilon
* 4k 1024 GB + epsilon
* 8k 8192 GB + epsilon ( not without PAGE_SIZE >= 8 kB )
*
* However, the basic block device layer can support only up to
* 4G blocks of 512 bytes; being at safe side, lets say 2G blocks
* of 512 bytes: 1024 GB without epsilon.
*
* The basic filesystem block size supportable at given system is
* currently dependent on what the system page cache page size is.
* If the filesystem basic block exceeds of the system page size,
* the system does not work currently. (2.2.* series kernels.)
*
*
* bb = sizeof(basic block)
* B = bb / 4
* MaxSize = bb * (B^3 + B^2 + B + 10)
* = bb^4 / 4^3 + bb^3 / 4^2 + bb^2 / 4^1 + 10 * bb
* = bb^4 / 64 + bb^3 / 16 + bb^2 / 4 + 10 * bb
*
* The primary term (bb^4/64) yields that 16 GB for bb=1k
* The 'epsilon' is all the rest, and is less than 1/250 of
* the primary term for 1kB, even less for larger bb values.
*
*
* The EXT2 (and UFS) can address up to 4G of "basic blocks", however
* that limit will likely to be beyond of other limitations from other
* parts of the system.
*
*/
> --Clem
/Matti Aarnio
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/