Re: [PATCH v2] fat: editions to support fat_fallocate()

From: Namjae Jeon
Date: Mon Oct 22 2012 - 11:09:59 EST


2012/10/22, OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx>:
> Namjae Jeon <linkinjeon@xxxxxxxxx> writes:
>
>>> The expectation of fallocate() is just for space reservation? If it was
>>> just for space reservation, I'm not sure, why TV applications can't
>>> reserve in userland without any kernel help (I wonder who interrupts TV
>>> application). I feel a bit, it may be more lightweight than fallocate(),
>>> and more reliable than out of spec fallocate().
>>>
>>> I'm still not sure why apps really want fallocate() on FAT.
>> Yes, it is for user space reservation.
>>
>> From the application perspective it is needed to realize in advance
>> how much space is needed for that file write - so the requirement is
>> precisely that the space reserved is entirely for me and no other I/O
>> operation in that time should consume the space.
>>
>> Of course, as you said, space can be pre-allocated from user space by
>> doing expanding truncate.
>> Main drawbacks for reserving through USER space:
>> 1) If we need to allocate 1GB space -> seek (1GB) and write -> it will
>> ZEROUT the 1GB area (which is very time consuming) just for reserving
>> space.
>> 2) The application must always be aware of the SEEK OFFSET. Otherwise,
>> if the file is closed and opened again (append mode) for writing, the
>> write will start from the end of file, which is past the reserved
>> space. So this also results in losing space if the application is not
>> keeping track of the OFFSET.
>> 3) Suppose an expanding truncate of 1GB from user space fails after
>> 256MB of allocation. In that case it has allocated 256MB of blocks,
>> zeroed out all of these blocks and then returned failure, which is
>> not optimal for just allocating space.
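
To make the user-space drawbacks above concrete, here is a minimal sketch of
reserving space by expanding truncate and then appending. The helper names
reserve_by_truncate() and size_after_append() are made up for illustration and
are not part of the patch. On FAT the extension has to allocate and zero real
clusters, as described above; on filesystems with sparse files the same call
may only create a hole.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: reserve `len` bytes from user space by extending
 * the file with ftruncate(). Returns 0 on success, -errno on failure.
 * Note: this would also shrink a file that is already larger than `len`.
 */
int reserve_by_truncate(const char *path, off_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return -errno;

    int ret = 0;
    if (ftruncate(fd, len) < 0)
        ret = -errno;
    if (close(fd) < 0 && ret == 0)
        ret = -errno;
    return ret;
}

/*
 * Hypothetical helper: reopen the file in append mode, write `n` bytes
 * and return the resulting file size, or -1 on error.
 */
off_t size_after_append(const char *path, const void *buf, size_t n)
{
    int fd = open(path, O_WRONLY | O_APPEND);
    if (fd < 0)
        return -1;

    struct stat st;
    int ok = write(fd, buf, n) == (ssize_t)n && fstat(fd, &st) == 0;
    close(fd);
    return ok ? st.st_size : -1;
}
```

With a 4096-byte reservation followed by a 4-byte append, the file ends up
4100 bytes long: the append landed after the reserved area instead of inside
it, which is drawback 2.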
>>
>> While if we make use of FALLOCATE
>> 1) It allows reserving the space in advance without any delay.
>> 2) The space is reserved in advance. So if space is reserved for a
>> 1 hour TV recording, then no other application in the background can
>> cause the recording to fail with "no free space" left, as the space
>> is already pre-allocated. Only the other applications will close.
>> 3) It allows for APPEND write to continue smoothly without actually
>> keeping track of the file state, offset.
>> 4) Initially, when the disk is not fragmented, it allows the
>> possibility to get contiguous blocks, thus reducing fragmentation
>> for that file.
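
For comparison, the fallocate() flow described above might look like the
sketch below from the application side. This assumes the reservation is made
with FALLOC_FL_KEEP_SIZE, which this message does not state explicitly; the
helper names reserve_keep_size() and append_and_stat() are hypothetical. On a
filesystem without fallocate support the call fails with EOPNOTSUPP.

```c
#define _GNU_SOURCE /* for fallocate() and FALLOC_FL_KEEP_SIZE */
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: reserve `len` bytes for the file without changing
 * its size (assumption: FALLOC_FL_KEEP_SIZE is the mode being discussed).
 * Returns 0 on success, -errno on failure.
 */
int reserve_keep_size(const char *path, off_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return -errno;

    int ret = 0;
    if (fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, len) < 0)
        ret = -errno;
    if (close(fd) < 0 && ret == 0)
        ret = -errno;
    return ret;
}

/*
 * Hypothetical helper: reopen in append mode, write `n` bytes and return
 * the resulting file size, or -1 on error. No offset tracking is needed
 * because the file size was left untouched by the reservation.
 */
off_t append_and_stat(const char *path, const void *buf, size_t n)
{
    int fd = open(path, O_WRONLY | O_APPEND);
    if (fd < 0)
        return -1;

    struct stat st;
    int ok = write(fd, buf, n) == (ssize_t)n && fstat(fd, &st) == 0;
    close(fd);
    return ok ? st.st_size : -1;
}
```

Here a 4-byte append to a freshly reserved file leaves the size at 4 bytes, so
the data goes into the reserved region rather than after it.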
>
> OK.
>
> Should TV recorder make sure it is reserving space with fallocate() for
> each open() (or first open() after mount())?
It is only for first open.

> What is fsck going to do?
> Or how to know fallocated space or corrupted space?
fsck does not know about fallocated space and considers it corrupted
space due to the mismatch between file size and disk usage. So it will
free the allocated clusters, just like the Windows driver.

Fsck output for a 100MB preallocated file:
---------------------------------------------------------------------------------------------------
# fsck.vfat -aw /dev/sdb3
dosfsck 3.0.12, 29 Oct 2011, FAT32, LFN
/falloc_file
Bad short file name ( \000\005\016\013\032\022\013./\000\000).
Auto-renaming it.
Renamed to FSCK0000.001
/falloc_file
File size is 6 bytes, cluster chain length is > 4096 bytes.
Truncating file to 6 bytes.
Free cluster summary wrong (2541163 vs. really 2566762)
Auto-correcting.
Performing changes.
/dev/sdb3: 5 files, 50565/2617327 clusters
--------------------------------------------------------------------------------------------------------
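
The numbers in this output can be cross-checked with a small calculation.
This is only a sketch of the size-versus-chain comparison, not dosfsck's
actual code. It assumes 4096-byte clusters (suggested by the "> 4096 bytes"
message but not stated) and that "100MB" means 100 MiB; the function names
are made up for illustration.

```c
#include <stdint.h>

/* Number of clusters needed to hold `size` bytes (rounded up). */
uint64_t clusters_for_size(uint64_t size, uint32_t cluster_size)
{
    return (size + cluster_size - 1) / cluster_size;
}

/*
 * Clusters that a checker relying only on the file size would release:
 * the chain length minus what the recorded file size actually needs.
 */
uint64_t excess_clusters(uint64_t file_size, uint64_t chain_clusters,
                         uint32_t cluster_size)
{
    uint64_t needed = clusters_for_size(file_size, cluster_size);

    return chain_clusters > needed ? chain_clusters - needed : 0;
}
```

Under those assumptions the preallocated chain is 25600 clusters long, the
6-byte file needs 1 cluster, and 25599 clusters are released. That agrees with
the free cluster summary correction above: 2566762 - 2541163 = 25599.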

>
> Does this break a Linux FAT driver that doesn't know about this
> fallocate()? If so, it sounds like it would be easy to break existing
> drivers.
Yes, it will break Linux drivers without fallocate support. When we
try to write to a fallocated file using old drivers, it will cause a
write error and make the FS read-only.
When fallocate was implemented in other filesystems, was there perhaps a
similar issue and concern?

Thanks OGAWA!
>
> Thanks.
> --
> OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx>
>