Re: [PATCH -v2] scripts/tags.sh: Add custom sort order

From: Masahiro Yamada
Date: Wed Sep 02 2020 - 12:12:02 EST


On Thu, Sep 3, 2020 at 12:58 AM Masahiro Yamada <masahiroy@xxxxxxxxxx> wrote:
>
> On Fri, Aug 7, 2020 at 2:28 AM <peterz@xxxxxxxxxxxxx> wrote:
> >
> >
> > One long-standing annoyance I have with using vim-tags is that our tags
> > file is not properly sorted. That is, the sorting Exuberant Ctags does
> > is only on the tag itself.
> >
> > The problem with that is that, for example, the tag 'mutex' appears a
> > mere 505 times, 492 of which are structure members. However it is _far_
> > more likely that someone wants the struct definition when looking for
> > the mutex tag than any of those members. However, due to the nature of
> > the sorting, the struct definition will not be first.
> >
> > So add a script that does a custom sort of the tags file, taking the tag
> > kind into account.
> >
> > The kind ordering is roughly: 'type', 'function', 'macro', 'enum', rest.
> >
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> > ---
> > Changes since v1:
> > - removed the need for tags.unsorted by using a pipe
> >
> > With this change 'make tags' is now actually faster than it was before,
> > because there is less sorting.
> >
> > scripts/sort-tags.awk | 79 +++++++++++++++++++++++++++++++++++++++++++++++++++
> > scripts/tags.sh | 11 +++++--
> > 2 files changed, 87 insertions(+), 3 deletions(-)
> >
> > diff --git a/scripts/sort-tags.awk b/scripts/sort-tags.awk
> > new file mode 100755
> > index 000000000000..1eb50406c9d3
> > --- /dev/null
> > +++ b/scripts/sort-tags.awk
> > @@ -0,0 +1,79 @@
> > +#!/usr/bin/awk -f
> > +
> > +# $ ctags --list-kinds
> > +# C
> > +# c classes
> > +# s structure names
> > +# t typedefs
> > +# g enumeration names
> > +# u union names
> > +# n namespaces
> > +
> > +# f function definitions
> > +# p function prototypes [off]
> > +# d macro definitions
> > +
> > +# e enumerators (values inside an enumeration)
> > +# m class, struct, and union members
> > +# v variable definitions
> > +
> > +# l local variables [off]
> > +# x external and forward variable declarations [off]
> > +
> > +BEGIN {
> > +	FS = "\t"
> > +
> > +	sort = "LC_ALL=C sort"
> > +
> > +	# our sort order for C kinds:
> > +	order["c"] = "A"
> > +	order["s"] = "B"
> > +	order["t"] = "C"
> > +	order["g"] = "D"
> > +	order["u"] = "E"
> > +	order["n"] = "F"
> > +	order["f"] = "G"
> > +	order["p"] = "H"
> > +	order["d"] = "I"
> > +	order["e"] = "J"
> > +	order["m"] = "K"
> > +	order["v"] = "L"
> > +	order["l"] = "M"
> > +	order["x"] = "N"
> > +}
> > +
> > +# pass through header
> > +/^!_TAG/ {
> > +	print $0
> > +	next
> > +}
> > +
> > +{
> > +	# find 'kinds'
> > +	for (i = 1; i <= NF; i++) {
> > +		if ($i ~ /;"$/) {
> > +			kind = $(i+1)
> > +			break;
> > +		}
> > +	}
> > +
> > +	# create sort key
> > +	if (order[kind])
> > +		key = $1 order[kind];
> > +	else
> > +		key = $1 "Z";
> > +
> > +	# get it sorted
> > +	print key "\t" $0 |& sort
> > +}
> > +
> > +END {
> > +	close(sort, "to")
> > +	while ((sort |& getline) > 0) {
> > +		# strip key
> > +		sub(/[^[:space:]]*[[:space:]]*/, "")
> > +		print $0
> > +	}
> > +	close(sort)
> > +}
> > +
> > diff --git a/scripts/tags.sh b/scripts/tags.sh
> > index 4e18ae5282a6..51087c3d8b1e 100755
> > --- a/scripts/tags.sh
> > +++ b/scripts/tags.sh
> > @@ -251,8 +251,10 @@ setup_regex()
> >
> > exuberant()
> > {
> > + (
> > +
> > setup_regex exuberant asm c
> > - all_target_sources | xargs $1 -a \
> > + all_target_sources | xargs $1 \
> > -I __initdata,__exitdata,__initconst,__ro_after_init \
> > -I __initdata_memblock \
> > -I __refdata,__attribute,__maybe_unused,__always_unused \
> > @@ -266,12 +268,15 @@ exuberant()
> > -I DEFINE_TRACE,EXPORT_TRACEPOINT_SYMBOL,EXPORT_TRACEPOINT_SYMBOL_GPL \
> > -I static,const \
> > --extra=+fq --c-kinds=+px --fields=+iaS --langmap=c:+.h \
> > + --sort=no -o - \
> > "${regex[@]}"
> >
> > setup_regex exuberant kconfig
> > - all_kconfigs | xargs $1 -a \
> > - --langdef=kconfig --language-force=kconfig "${regex[@]}"
> > + all_kconfigs | xargs $1 \
> > + --langdef=kconfig --language-force=kconfig --sort=no \
> > + -o - "${regex[@]}"
> >
> > + ) | scripts/sort-tags.awk > tags
> > }
> >
> > emacs()
>
>
> Sorry for the long delay.
>
> First, this patch breaks 'make TAGS'
> if 'etags' is a symlink to Exuberant Ctags.
>
>
> masahiro@oscar:~/ref/linux$ etags --version
> Exuberant Ctags 5.9~svn20110310, Copyright (C) 1996-2009 Darren Hiebert
> Addresses: <dhiebert@xxxxxxxxxxxxxxxxxxxxx>, http://ctags.sourceforge.net
> Optional compiled features: +wildcards, +regex
>
> masahiro@oscar:~/ref/linux$ make TAGS
> GEN TAGS
> etags: Warning: include/linux/seqlock.h:738: null expansion of name pattern "\2"
> sed: can't read TAGS: No such file or directory
> make: *** [Makefile:1820: TAGS] Error 2
>
>
>
>
> The reason is the hard-coded ' > tags',
> which is easy to fix.
>
>
>
> But, honestly, I am not super happy about this patch.
>
> Reason 1
> In my understanding, sorting by the tag kind only works
> for ctags. My favorite editor is emacs.
> (Do not get me wrong. I do not intend to start an emacs vs vi war.)
> So, I would rather run 'make TAGS' than 'make tags',
> but this solution would not work for etags because
> etags has a different format.
> So, I'd rather see a more general solution.
>
> Reason 2
> We would end up with messier code, mixing two files and
> two languages (shell and awk).
>
>
>
> When is it useful to tag structure members?
>
> If they are really annoying, why don't we delete them
> instead of moving them to the bottom of the tag file?
>
>
>
> I attached an alternative solution,
> and wrote up my thoughts in the log.
>
> What do you think?
>
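
For reference, the kind-aware ordering discussed above can also be
written as a plain decorate/sort/undecorate pipeline, which avoids the
gawk-only '|&' coprocess. This is only a sketch and not the attached
alternative: the function names, the rank string, and the sample
entries (paths and line numbers) are made up for illustration.

```shell
#!/bin/sh
# Made-up sample in ctags format: tag<TAB>file<TAB>address;"<TAB>kind[<TAB>fields]
sample_tags() {
    printf '!_TAG_FILE_SORTED\t1\t/0=unsorted, 1=sorted, 2=foldcase/\n'
    printf 'mutex\tdrivers/example/a.c\t12;"\tm\tstruct:foo\n'
    printf 'mutex\tinclude/example/mutex.h\t53;"\ts\n'
    printf 'mutex_lock\tkernel/example/mutex.c\t281;"\tf\n'
    printf 'mutex\tdrivers/example/b.c\t40;"\tm\tstruct:bar\n'
}

# Prefix each line with a two-digit rank for its kind, sort by tag and
# then by rank, and finally strip the rank again.
sort_by_kind() {
    tab=$(printf '\t')
    awk -F '\t' '
    BEGIN {
        # kind letters from best to worst; unknown kinds get rank 99
        n = split("c s t g u n f p d e m v l x", k, " ")
        for (i = 1; i <= n; i++)
            rank[k[i]] = sprintf("%02d", i)
    }
    {
        # the kind is the field right after the one ending in ;"
        kind = ""
        for (i = 1; i < NF; i++)
            if ($i ~ /;"$/) { kind = $(i + 1); break }
        r = (kind in rank) ? rank[kind] : "99"
        print r "\t" $0
    }' | LC_ALL=C sort -t "$tab" -k2,2 -k1,1 | cut -f2-
}

sample_tags | sort_by_kind
```

The '!_TAG' header lines need no special case here because '!' sorts
before any identifier character in the C locale. Like the patch, this
only makes sense for the ctags format, not for etags output.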



Sorry, the commit log of the attachment was wrong.

The correct sentence is:

"OK, [3] clearly explained why 'p' is useful, but turned --c-kinds=-px
into --c-kinds=+px. So, 'x' was also (accidentally?) enabled."
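
To judge how much a kind such as 'p', 'x' or 'm' contributes to the
tags file, one can simply count the entries per kind. The following is
a rough sketch under the assumption that the file is in ctags format
with the kind letter right after the address field; count_kinds and the
sample entries are hypothetical names and data, not part of tags.sh.

```shell
#!/bin/sh
# Made-up sample entries covering kinds f, p, x and m.
sample_kinds() {
    printf '!_TAG_FILE_FORMAT\t2\t/extended format/\n'
    printf 'foo\tlib/example/foo.c\t10;"\tf\n'
    printf 'foo\tinclude/example/foo.h\t5;"\tp\n'
    printf 'foo_count\tinclude/example/foo.h\t7;"\tx\n'
    printf 'lock\tinclude/example/foo.h\t20;"\tm\tstruct:foo\n'
    printf 'lock\tinclude/example/bar.h\t31;"\tm\tstruct:bar\n'
}

# Print "kind<TAB>count" for every kind found; reads the named files,
# or stdin when no file is given.
count_kinds() {
    awk -F '\t' '
    /^!_TAG/ { next }                 # skip the header lines
    {
        kind = "?"
        for (i = 1; i < NF; i++)
            if ($i ~ /;"$/) { kind = $(i + 1); break }
        count[kind]++
    }
    END {
        for (k in count)
            print k "\t" count[k]
    }' "$@" | LC_ALL=C sort
}

sample_kinds | count_kinds
```

On a real tree this would be run as 'count_kinds tags' after
'make tags'.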



--
Best Regards
Masahiro Yamada