Re: [PATCH v6] fat: Batched discard support for fat

From: Lukas Czerner
Date: Tue May 24 2011 - 06:44:41 EST


On Tue, 24 May 2011, OGAWA Hirofumi wrote:

> Lukas Czerner <lczerner@xxxxxxxxxx> writes:
>
> >> No, no. Userland will know max-length from statvfs, right? So, let's
> >> assume it is 100 (->f_blocks) * 1024 (->f_bsize).
> >
> > You do not need to know the filesystem size to do the discard, it should
> > be adjusted within the kernel. Just specify ULLONG_MAX as a length. See
> > fstrim tool in util-linux-ng.
> >
> >>
> >> Now, userland knows the max length, 102400, ok? Let's start to trim.
> >>
> >> Assume userland wants to trim the whole filesystem. So, userland will
> >> specify
> >>
> >> trim(0, 102400).
> >>
> >> What actually happens in the kernel?
> >>
> >> The current implementation doesn't map blocks. So, in the case of FAT,
> >> it adjusts the start from 0 to 2 * 1024.
> >>
> >> So, it trims between 2048 and 102400. The problem is here. The FS layout
> >> actually spans 2048 to (102400 + 2048). I.e. userland actually has to do
> >>
> >> trim(2048, 102400 + 2048)
> >>
> >> to specify the whole filesystem. How would it know about 2048?
> >
> > You do not need to know anything in userspace. If you want to trim the
> > whole filesystem you just do trim(0, ULLONG_MAX) - which is what fstrim
> > does when you do not specify a range. And you obviously just skip the
> > filesystem metadata, regardless of whether it is at the beginning of the
> > filesystem or in the middle. Just do whatever you need to do within your
> > filesystem.
> >
> > What we do in ext4 is convert the length and start passed in struct
> > fstrim_range into filesystem block units, then get the last
> > allocation group and the block offset within that group (we do the same
> > for the start block), and try to discard free block ranges from the
> > starting block to the last block.
> >
> > It is really not rocket science, and since every filesystem is
> > different and has different internal data structures, it is up to you how
> > to do this. And if you shift a block or two, it really does not matter
> > much, since userland does not know how the filesystem blocks are
> > laid out anyway, nor does it know which are free and which are not.
> >
> > I agree that the interface is a little bit fuzzy, but that is mainly
> > because it is intended to be filesystem independent and we do have a lot
> > of various filesystems, so I wanted it to be as flexible as possible,
> > hence the start and len in bytes.
> >
> > Hope it helped.
>
> No. If you want to trim the whole filesystem periodically in chunks of,
> say, 1GB (IIRC as xfstests does), what do you do? Do we have to trim
> until ULLONG_MAX for each 1GB?
>
> Thanks.
>

What? No, of course not. As I said, just go through 1G worth of filesystem
blocks, skipping metadata. However, we do have a special case where we
adjust start and len according to the first data block (which is nonzero
only with a 1024B block size).

	if (start < first_data_blk) {
		len -= first_data_blk - start;
		start = first_data_blk;
	}

This means that we just skip everything before the first data block
(whatever the first data block is), which is the same as skipping metadata.
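
To make the arithmetic concrete, here is a rough standalone sketch of
that adjustment. It is not the actual ext4 code: the struct blk_range
type, the trim_bytes_to_blocks() name and the total_blks parameter are
all made up for illustration. It assumes start and len arrive in bytes,
as in struct fstrim_range, and it also clamps len so that a request such
as ULLONG_MAX cannot run past the end of the filesystem.

```c
#include <stdint.h>

/* Hypothetical result type: a trim range in filesystem block units. */
struct blk_range {
	uint64_t start;	/* first block to consider for discard */
	uint64_t len;	/* number of blocks, 0 means nothing to do */
};

/*
 * Hypothetical helper (illustration only, not taken from ext4):
 * convert a byte range into block units, clamp it to the end of the
 * filesystem, and skip everything before first_data_blk.
 */
static struct blk_range trim_bytes_to_blocks(uint64_t start_bytes,
					     uint64_t len_bytes,
					     uint64_t blksize,
					     uint64_t first_data_blk,
					     uint64_t total_blks)
{
	struct blk_range r = { 0, 0 };
	uint64_t start, len;

	if (blksize == 0)
		return r;

	/* Bytes to filesystem blocks, rounding down. */
	start = start_bytes / blksize;
	len = len_bytes / blksize;

	/* Range begins past the end of the filesystem: nothing to trim. */
	if (start >= total_blks)
		return r;

	/* Clamp so that a huge len (e.g. ULLONG_MAX) stays in bounds. */
	if (len > total_blks - start)
		len = total_blks - start;

	/* The special case above: skip up to the first data block. */
	if (start < first_data_blk) {
		uint64_t skip = first_data_blk - start;

		if (len <= skip)
			return r;
		len -= skip;
		start = first_data_blk;
	}

	r.start = start;
	r.len = len;
	return r;
}
```

With a 1024B block size, first_data_blk = 1 and 100 blocks in total, a
request of (0, ULLONG_MAX) becomes blocks 1 through 99, and a 1G chunk
that starts past the first data block is passed through unchanged apart
from the conversion to block units.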

-Lukas