Re: [PATCH] kbuild: avoid split lines in .mod files

From: Sami Tolvanen
Date: Thu Dec 03 2020 - 13:47:18 EST


Hi Masahiro,

On Thu, Dec 3, 2020 at 9:56 AM Masahiro Yamada <masahiroy@xxxxxxxxxx> wrote:
>
> "xargs echo" is not a safe way to remove line breaks because the input
> may exceed the command line limit and xargs may break it up into
> multiple invocations of echo. This should never happen because
> scripts/gen_autoksyms.sh expects all undefined symbols are placed in
> the second line of .mod files.
>
> One possible way is to replace "xargs echo" with
> "sed ':x;N;$!bx;s/\n/ /g'" or something, but I rewrote the code by
> using awk because it is more readable.
>
> This issue was reported by Sami Tolvanen; in his Clang LTO patch set,
> $(multi-used-m) is no longer an ELF object, but a thin archive that
> contains LLVM bitcode files. llvm-nm prints out symbols for each
> archive member separately, which results a lot of dupications, in some
> places, beyond the system-defined limit.
>
> This problem must be fixed irrespective of LTO, and we must ensure
> zero possibility of having this issue.
>
> Link: https://lkml.org/lkml/2020/12/1/1658
> Reported-by: Sami Tolvanen <samitolvanen@xxxxxxxxxx>
> Signed-off-by: Masahiro Yamada <masahiroy@xxxxxxxxxx>
> ---
>
> scripts/Makefile.build | 12 ++++--------
> 1 file changed, 4 insertions(+), 8 deletions(-)
>
> diff --git a/scripts/Makefile.build b/scripts/Makefile.build
> index ae647379b579..4c058f12dd73 100644
> --- a/scripts/Makefile.build
> +++ b/scripts/Makefile.build
> @@ -252,6 +252,9 @@ objtool_dep = $(objtool_obj) \
> ifdef CONFIG_TRIM_UNUSED_KSYMS
> cmd_gen_ksymdeps = \
> $(CONFIG_SHELL) $(srctree)/scripts/gen_ksymdeps.sh $@ >> $(dot-target).cmd
> +
> +# List module undefined symbols
> +undefined_syms = $(NM) $< | $(AWK) '$$1 == "U" { printf("%s%s", x++ ? " " : "", $$2) }';
> endif
>
> define rule_cc_o_c
> @@ -271,13 +274,6 @@ define rule_as_o_S
> $(call cmd,modversions_S)
> endef
>
> -# List module undefined symbols (or empty line if not enabled)
> -ifdef CONFIG_TRIM_UNUSED_KSYMS
> -cmd_undef_syms = $(NM) $< | sed -n 's/^ *U //p' | xargs echo
> -else
> -cmd_undef_syms = echo
> -endif
> -
> # Built-in and composite module parts
> $(obj)/%.o: $(src)/%.c $(recordmcount_source) $(objtool_dep) FORCE
> $(call if_changed_rule,cc_o_c)
> @@ -285,7 +281,7 @@ $(obj)/%.o: $(src)/%.c $(recordmcount_source) $(objtool_dep) FORCE
>
> cmd_mod = { \
> echo $(if $($*-objs)$($*-y)$($*-m), $(addprefix $(obj)/, $($*-objs) $($*-y) $($*-m)), $(@:.mod=.o)); \
> - $(cmd_undef_syms); \
> + $(undefined_syms) echo; \
> } > $@
>
> $(obj)/%.mod: $(obj)/%.o FORCE

Thanks for the patch! I confirmed that this works with llvm-nm and
bitcode files, but it does still produce plenty of duplicates, even
though they now stay on one line. I'm not sure if the readability of
the .mod file matters though. Please feel free to add:

Reviewed-by: Sami Tolvanen <samitolvanen@xxxxxxxxxx>

Sami