Re: [RESEND PATCH 0/4] Implement File-Based optimization functionality

From: Juhyung Park
Date: Thu Nov 03 2022 - 02:11:28 EST


On 11/2/22 17:47, Christoph Hellwig wrote:
> On Wed, Nov 02, 2022 at 01:30:54PM +0800, Jiaming Li wrote:
>> 1) The host let the device know of lba range(s) of interest. Those
>> ranges are typically associated with a specific file. One can
>> obtain it from the iNode of the file and some offset calculations.
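
For reference, the "offset calculations" could look roughly like the sketch below. It assumes the file's extents are already known as (physical byte offset, byte length) pairs relative to the start of the partition, that the logical block size is 4096 bytes, and that the partition's starting LBA is known. The function name, the parameters and the block size are all illustrative assumptions and are not taken from the patch.

```python
LOGICAL_BLOCK_SIZE = 4096  # assumption: device logical block size in bytes


def extents_to_lba_ranges(extents, part_start_lba=0, block_size=LOGICAL_BLOCK_SIZE):
    """Convert (physical_byte_offset, byte_length) extents into inclusive
    (first_lba, last_lba) ranges, merging ranges that touch or overlap.

    Hypothetical helper for illustration only.
    """
    ranges = []
    for phys_off, length in sorted(extents):
        if length <= 0:
            continue  # skip empty extents
        first = part_start_lba + phys_off // block_size
        last = part_start_lba + (phys_off + length - 1) // block_size
        if ranges and first <= ranges[-1][1] + 1:
            # Adjacent or overlapping with the previous range: extend it.
            ranges[-1] = (ranges[-1][0], max(ranges[-1][1], last))
        else:
            ranges.append((first, last))
    return ranges
```

Note that the result is only a snapshot: nothing here stops the filesystem from moving those blocks right after the ranges were computed.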

> This is completely and utter madness. Files are a logic concept, that
> is non-unique (reflinks, snapshot) and can change at any time
> (defragmentation, GC, dedup). Whoever came up with this scheme is on
> crack and the it has no business being in the Linux kernel
>
> NAK.



Is the idea really utter madness? The majority of regular files that may be of interest from the perspective of UFS aren't reflinked or snapshotted (and ext4 and f2fs don't even support either).

Device-side fragmentation is a real issue [1], and it makes more than enough sense to defrag LBAs of interest to improve performance. This is long overdue, unless the block interface itself changes somehow.

The question is how to implement it correctly without creating a mess of mismatched or outdated LBAs as you've mentioned, preferably through file-system integration: if the LBAs in question are indeed reflinked, how do we handle that? If the LBAs are moved or invalidated by defrag or GC, how do we make sure that UFS stays up to date? And so on.

>
> From: lijiaming3 <lijiaming3@xxxxxxxxxx>
>
> add fbo analysis and defrag function
>
> We can send LBA info to the device as a comma separated string. Each
> adjacent pair represents a range:<open-lba>,<close-lba>.
> e.g. The LBA range of the file is 0x1234,0x3456;0x4567,0x5678
> echo 0x1234,0x3456,0x4567,0x5678 > fbo_send_lba
>
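
To spell out what that string format implies, here is a rough sketch of how such input would have to be interpreted: split on commas, take adjacent values as <open-lba>,<close-lba> pairs, and reject anything malformed. This is an illustration only, not the patch's parser; the function name and the checks (even token count, open <= close) are assumptions.

```python
def parse_lba_ranges(text):
    """Parse "open,close,open,close,..." into a list of (open, close) pairs.

    Hypothetical helper; the validation rules are assumptions.
    """
    tokens = [t.strip() for t in text.split(",") if t.strip()]
    if len(tokens) % 2 != 0:
        raise ValueError("expected an even number of LBAs (open,close pairs)")
    # Base 0 lets int() accept the 0x prefix used in the example.
    values = [int(t, 0) for t in tokens]
    ranges = list(zip(values[0::2], values[1::2]))
    for start, end in ranges:
        if start > end:
            raise ValueError(f"open LBA {start:#x} is past close LBA {end:#x}")
    return ranges
```

With the example from the commit message, parse_lba_ranges("0x1234,0x3456,0x4567,0x5678") gives the two ranges (0x1234, 0x3456) and (0x4567, 0x5678). Nothing in the string ties those ranges to a file or to the current state of the filesystem.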

Like, ew. Why would we ever want *userspace* to be able to manipulate this directly?

[1] https://www.usenix.org/conference/atc17/technical-sessions/presentation/hahn - Section 3.3: "For example, even if a file was not fragmented at all in the logical space (DoFL=1), if the file had a DoFP value of 0.5, the I/O throughput became only 48% of that with DoFP=0."