RE: [PATCH v17 05/10] fs/ntfs3: Add attrib operations

From: Konstantin Komarov
Date: Mon Jan 18 2021 - 07:07:57 EST


> From: Kari Argillander <kari.argillander@xxxxxxxxx>
> Sent: Monday, January 4, 2021 3:26 AM
> To: Konstantin Komarov <almaz.alexandrovich@xxxxxxxxxxxxxxxxxxxx>
> Cc: linux-fsdevel@xxxxxxxxxxxxxxx; viro@xxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; pali@xxxxxxxxxx; dsterba@xxxxxxx;
> aaptel@xxxxxxxx; willy@xxxxxxxxxxxxx; rdunlap@xxxxxxxxxxxxx; joe@xxxxxxxxxxx; mark@xxxxxxxxxxxxx; nborisov@xxxxxxxx;
> linux-ntfs-dev@xxxxxxxxxxxxxxxxxxxxx; anton@xxxxxxxxxx; dan.carpenter@xxxxxxxxxx; hch@xxxxxx; ebiggers@xxxxxxxxxx;
> andy.lavr@xxxxxxxxx
> Subject: Re: [PATCH v17 05/10] fs/ntfs3: Add attrib operations
>
> On Thu, Dec 31, 2020 at 06:23:56PM +0300, Konstantin Komarov wrote:
> > This adds attrib operations
> >
> > Signed-off-by: Konstantin Komarov <almaz.alexandrovich@xxxxxxxxxxxxxxxxxxxx>
> > ---
> > fs/ntfs3/attrib.c | 2081 +++++++++++++++++++++++++++++++++++++++++++
> > fs/ntfs3/attrlist.c | 463 ++++++++++
> > fs/ntfs3/xattr.c | 1072 ++++++++++++++++++++++
> > 3 files changed, 3616 insertions(+)
> > create mode 100644 fs/ntfs3/attrib.c
> > create mode 100644 fs/ntfs3/attrlist.c
> > create mode 100644 fs/ntfs3/xattr.c
> >
> > diff --git a/fs/ntfs3/attrlist.c b/fs/ntfs3/attrlist.c
>
> > +/*
> > + * al_find_ex
> > + *
> > + * finds the first le in the list which matches type, name and vcn
> > + * Returns NULL if not found
> > + */
> > +struct ATTR_LIST_ENTRY *al_find_ex(struct ntfs_inode *ni,
> > + struct ATTR_LIST_ENTRY *le,
> > + enum ATTR_TYPE type, const __le16 *name,
> > + u8 name_len, const CLST *vcn)
> > +{
> > + struct ATTR_LIST_ENTRY *ret = NULL;
> > + u32 type_in = le32_to_cpu(type);
> > +
> > + while ((le = al_enumerate(ni, le))) {
> > + u64 le_vcn;
> > + int diff;
> > +
> > + /* List entries are sorted by type, name and vcn */
>
> Isn't the name sorted with upcase sort?
>

Hi! You are correct. Will be fixed in v18.

> > + diff = le32_to_cpu(le->type) - type_in;
> > + if (diff < 0)
> > + continue;
> > +
> > + if (diff > 0)
> > + return ret;
> > +
> > + if (le->name_len != name_len)
> > + continue;
> > +
> > + if (name_len &&
> > + memcmp(le_name(le), name, name_len * sizeof(short)))
> > + continue;
>
> So does this compare the name correctly? Is the caller responsible for
> making sure the name is upcased? Or does it even matter?
>
> Also, this checks every name of the right type. Why not use name_cmp, so
> that we know when we have gone past the match? It might be for performance,
> but maybe we could check that every 10 iterations or so.
>

The name check will now be done only for list entries with vcn==0.

> > + if (!vcn)
> > + return le;
> > +
> > + le_vcn = le64_to_cpu(le->vcn);
> > + if (*vcn == le_vcn)
> > + return le;
> > +
> > + if (*vcn < le_vcn)
> > + return ret;
> > +
> > + ret = le;
>
> So we still have the wrong vcn at this point, and we save the entry so that
> we can return it. What happens if we do not find the right one? At least the
> function comment says that we should return NULL if we do not find a matching entry.
>

Can't agree here.
E.g. given list entries with vcn: 0, 67, 89, 110, 137.
For vcn==100 the function will return the entry with vcn 89, since that entry stores the info about vcn==100.
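
To make the intended semantics concrete, here is a minimal sketch of the lookup described above. It is not the ntfs3 code: the attribute list is reduced to a plain array of starting VCNs, and the name find_covering_entry is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified model (assumption): each list entry is represented only by
 * its starting VCN, and the array is sorted in ascending order.
 *
 * Returns the index of the last entry whose starting VCN is <= vcn,
 * or -1 if every entry starts after vcn.
 */
static ptrdiff_t find_covering_entry(const uint64_t *starts, size_t n,
				     uint64_t vcn)
{
	ptrdiff_t ret = -1;
	size_t i;

	assert(starts != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (starts[i] == vcn)
			return (ptrdiff_t)i;	/* exact match */

		if (starts[i] > vcn)
			return ret;		/* went past: previous entry covers vcn */

		ret = (ptrdiff_t)i;		/* best candidate so far */
	}

	return ret;
}
```

With the starting VCNs 0, 67, 89, 110, 137, a lookup for 100 yields index 2 (the entry starting at 89), and the result is -1 only when no entry starts at or before the requested vcn.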

> > + }
> > +
> > + return ret;
> > +}
> > +
> > +/*
> > + * al_find_le_to_insert
> > + *
> > + * finds the first list entry which matches type, name and vcn
>
> This comment seems wrong. The function seems to find the insert point for a
> new le.
>

Thanks for this. Fixed.

> > + * Returns NULL if not found
> > + */
> > +static struct ATTR_LIST_ENTRY *
> > +al_find_le_to_insert(struct ntfs_inode *ni, enum ATTR_TYPE type,
> > + const __le16 *name, u8 name_len, const CLST *vcn)
> > +{
> > + struct ATTR_LIST_ENTRY *le = NULL, *prev;
> > + u32 type_in = le32_to_cpu(type);
> > + int diff;
> > +
> > + /* List entries are sorted by type, name, vcn */
> > +next:
> > + le = al_enumerate(ni, prev = le);
> > + if (!le)
> > + goto out;
> > + diff = le32_to_cpu(le->type) - type_in;
> > + if (diff < 0)
> > + goto next;
> > + if (diff > 0)
> > + goto out;
> > +
> > + if (ntfs_cmp_names(name, name_len, le_name(le), le->name_len, NULL) > 0)
> > + goto next;
>
> Why not go to out if the compare is < 0? As I see it, this totally ignores the
> name and just finds the right vcn (or reaches the next ID) and calls it a day.
>

Will be fixed in v18 as well.

> NAME VCN
> [AAB] [2] <- Looking for the insert point for this.
>
> [AAA] [1]
> [AAB] [1]
> <- This is right point.
> [AAC] [1]
> <- But we tell that insert point is here.
> [AAD] [2]
>
> I might be totally wrong but please tell me what I'm missing.
>
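
For reference, a minimal sketch of an insert-point search that respects all three sort keys and stops as soon as the new key sorts before the current entry. It is a simplified model, not the v18 code: struct entry_key, key_cmp and find_insert_index are hypothetical names, and strcmp() stands in for the upcase-aware ntfs_cmp_names().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flattened sort key of a list entry: type, name, vcn. */
struct entry_key {
	uint32_t type;
	const char *name;	/* assumption: plain C string, not __le16 */
	uint64_t vcn;
};

/* Three-way compare by type, then name, then vcn. */
static int key_cmp(const struct entry_key *a, const struct entry_key *b)
{
	int diff;

	if (a->type != b->type)
		return a->type < b->type ? -1 : 1;

	diff = strcmp(a->name, b->name);
	if (diff)
		return diff;

	if (a->vcn != b->vcn)
		return a->vcn < b->vcn ? -1 : 1;

	return 0;
}

/*
 * Returns the index at which key should be inserted so that the list
 * stays sorted: the index of the first entry that sorts after key,
 * or n if there is no such entry.
 */
static size_t find_insert_index(const struct entry_key *list, size_t n,
				const struct entry_key *key)
{
	size_t i;

	assert(list != NULL || n == 0);
	assert(key != NULL);

	for (i = 0; i < n; i++) {
		/* Stop at the first entry that sorts after the new key. */
		if (key_cmp(key, &list[i]) < 0)
			break;
	}

	return i;
}
```

For the list AAA/1, AAB/1, AAC/1, AAD/2 (all of the same type) and the new key AAB/2, this returns index 2, i.e. the slot between AAB/1 and AAC/1.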
> > + if (!vcn || *vcn > le64_to_cpu(le->vcn))
> > + goto next;
> > +
> > +out:
> > + if (!le)
> > + le = prev ? Add2Ptr(prev, le16_to_cpu(prev->size)) :
> > + ni->attr_list.le;
> > +
> > + return le;
> > +}
>
> There seems to be a lot of linear list searching. Do you think it would be
> beneficial to code a binary or jump search for these? Just asking out of
> interest. It might not help at all; I'm just thinking out loud
> here.
>
> I might try to do that at some point if someone sees the point of it.

It's a nice idea; we would appreciate such a patch. But please keep in mind that
binary search will dramatically outperform linear search only for heavily fragmented files.
By the way, the same idea of replacing linear search with binary search is implemented in
index.c (please refer to NTFS3_INDEX_BINARY_SEARCH).
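
As a rough illustration of what such a patch could look like, here is a minimal sketch of a binary search for the last entry starting at or before a given vcn. It is not taken from index.c: it assumes a sorted array of starting VCNs is already available, whereas the real list entries are variable-sized (they are walked via their size field), so an array of entry offsets would probably have to be built first. The name find_covering_entry_bsearch is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified model (assumption): starts[] holds the starting VCN of each
 * list entry in ascending order.
 *
 * Returns the index of the last entry whose starting VCN is <= vcn,
 * or -1 if every entry starts after vcn. Runs in O(log n).
 */
static ptrdiff_t find_covering_entry_bsearch(const uint64_t *starts, size_t n,
					     uint64_t vcn)
{
	size_t lo = 0, hi = n;	/* search window is [lo, hi) */

	assert(starts != NULL || n == 0);

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (starts[mid] <= vcn)
			lo = mid + 1;	/* answer is at mid or to the right */
		else
			hi = mid;	/* answer is strictly left of mid */
	}

	/* lo is now the number of entries with start <= vcn. */
	return (ptrdiff_t)lo - 1;
}
```

For a file with only a handful of list entries the loop above saves almost nothing over a linear walk, which is why the gain shows up mainly on heavily fragmented files.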

Also, your notes on attrlist.c led us to refactor this file. Thanks once again!