Re: external tool to remove embedded filenames

From: Bhaskar Chowdhury
Date: Fri Oct 02 2020 - 10:50:05 EST


On 11:47 Thu 01 Oct 2020, Joe Perches wrote:
It's rather unnecessary for files to contain their
path/filename in source code comments.

Here's a trivial little script that can remove
embedded filenames in C90-style comments from files.

This requires git.

It does the following types of removals:

remove individual lines like /* filename */ completely
remove filename from /* filename -- comment */, leave /* comment */
remove filename and any trailing ' *\n' from /* filename, leave /*
remove filename from /* filename, leave /*
remove filename from continuation ' * filename -- comment' leave ' * comment'
remove filename and any trailing ' *\n' from continuation ' * filename\n *\n'
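
As a rough illustration of the first two removal types above, here is a minimal sketch that applies them to an in-memory string instead of a tracked file. The sample path and sample text are made up for the example, and the optional linux/ prefix that the script also accepts is left out for brevity.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input: $file and $text are invented for illustration.
my $file = 'drivers/net/foo.c';
my $text = <<'EOT';
/* drivers/net/foo.c */
/* drivers/net/foo.c -- example driver */
int x;
EOT

# Type 1: a comment holding only the filename is removed, newline included.
my $whole = ($text =~ s@/\*[ \t]+\Q$file\E[ \t]*\*/[ \t]*\n@@g);

# Type 2: the filename and its ':' or '-' separator go, the comment stays.
my $lead = ($text =~ s@/\*([ \t]+)\Q$file\E[ \t]*[:-]+[ \t]*@/*$1@g);

print $text;
```

With this input, only the '/* example driver */' comment and the 'int x;' line should remain.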

It seems to work well enough.

It does not handle C99 comments.
No // filename variants are removed.
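
If someone wanted to cover the // variants as well, the substitutions might look roughly like the sketch below. This is not part of the posted script; the patterns, the sample path and the sample text are all assumptions, modelled on the C90 cases.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input, invented for illustration.
my $file = 'drivers/net/foo.c';
my $text = "// drivers/net/foo.c\n"
         . "// drivers/net/foo.c - example driver\n"
         . "int x;\n";

# Assumed pattern: remove individual lines like '// filename' completely.
my $n = ($text =~ s@^[ \t]*//[ \t]*(?:linux/)?\Q$file\E[ \t]*\n@@mg);

# Assumed pattern: remove filename from '// filename -- comment',
# leave '// comment'.
$n += ($text =~ s@^([ \t]*//[ \t]*)(?:linux/)?\Q$file\E[ \t]*[:-]+[ \t]*@$1@mg);

print $text;
```

Lines such as '// SPDX-License-Identifier: ...' are untouched, because they do not start with the file's own path.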

Running it on today's -next gives:

$ perl remove_embedded_filenames.pl
$ git diff --shortstat
2310 files changed, 354 insertions(+), 4239 deletions(-)

It's also possible to give any filename or path
as an argument to the script.

For instance:

$ perl remove_embedded_filenames.pl drivers/net
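
The script below joins its arguments into one string and runs git through backticks, so a path containing a space would be split by the shell. A possible alternative, sketched here with made-up helper names (ls_files_cmd and tracked_files are not in the posted script), is to hand the arguments to git as a list:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the argv for git as a list, so each path stays one argument.
sub ls_files_cmd {
    my (@paths) = @_;
    return ('git', 'ls-files', '--', @paths);
}

# Run git without a shell and return the tracked files, one per element.
# Needs git and a work tree, so it is only defined here, not called.
sub tracked_files {
    my (@paths) = @_;
    open(my $git, '-|', ls_files_cmd(@paths))
        or die "$0: Can't run git ls-files\n";
    chomp(my @files = <$git>);
    close($git);
    return @files;
}

# Show the command that would be run for two example paths.
my @cmd = ls_files_cmd('drivers/net', 'my dir/file.c');
print join('|', @cmd), "\n";
```

The list form of open() passes 'my dir/file.c' to git as a single argument, which the backtick form cannot do without extra quoting.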


#!/usr/bin/perl -w

# script to remove * <filename> comments;
# use: perl remove_embedded_filenames.pl <paths|files>
# e.g.: perl remove_embedded_filenames.pl drivers/net/ethernet/intel

use strict;

my $P = $0;
my $modified = 0;
my $quiet = 0;

sub expand_tabs {
    my ($str) = @_;

    my $res = '';
    my $n = 0;
    for my $c (split(//, $str)) {
        if ($c eq "\t") {
            $res .= ' ';
            $n++;
            for (; ($n % 8) != 0; $n++) {
                $res .= ' ';
            }
            next;
        }
        $res .= $c;
        $n++;
    }

    return $res;
}

my $args = join(" ", @ARGV);
my $output = `git ls-files -- $args`;
my @files = split("\n", $output);

foreach my $file (@files) {
    my $f;
    my $cvt = 0;
    my $text;

    # read the file

    next if ((-d $file));

    open($f, '<', $file)
        or die "$P: Can't open $file for read\n";
    $text = do { local($/) ; <$f> };
    close($f);

    next if ($text eq "");

    # Remove the embedded filenames

    # remove individual lines like /* filename */ completely
    $cvt += $text =~ s@/\*[ \t]+(?:linux\/)?\Q$file\E[ \t]*\*/[ \t]*\n@@g;
    pos($text) = 0;
    # remove filename from /* filename -- comment */, leave /* comment */
    $cvt += $text =~ s@/\*([ \t]+)(?:linux\/)?\Q$file\E[ \t]*[:-]+[ \t]*@/*$1@g;
    pos($text) = 0;
    # remove filename and any trailing ' *\n' from /* filename, leave /*
    $cvt += $text =~ s@/\*([ \t]+)(?:linux\/)?\Q$file\E[ \t]*\n([ \t]*\*[ \t]*\n)*(?:[ \t]*\*)?@/*@g;
    pos($text) = 0;
    # remove filename from /* filename, leave /*
    $cvt += $text =~ s@/\*([ \t]+)(?:linux\/)?\Q$file\E[ \t]*\n@/*@g;
    pos($text) = 0;
    # remove filename from continuation ' * filename -- comment'
    # leave ' * comment'
    $cvt += $text =~ s/([ \t]+)\*([ \t]*)(?:linux\/)?\Q$file\E[ \t]*[:-]+[ \t]*/$1*$2/g;
    pos($text) = 0;
    # remove filename and any trailing ' *\n' from
    # continuation ' * filename\n *\n'
    $cvt += $text =~ s/([ \t]*)\*([ \t]*)(?:linux\/)?\Q$file\E[ \t]*\n([ \t]*\*[ \t]*\n)*//g;
    pos($text) = 0;

    # write the file if something was changed

    if ($cvt > 0) {
        $modified = 1;
        print("$file\n");
        open($f, '>', $file)
            or die "$P: Can't open $file for write\n";
        print $f $text;
        close($f);
    }
}

if ($modified && !$quiet) {
print <<EOT;

Warning: these changes may not be correct.

These changes should be carefully reviewed manually and not combined with
any functional changes.

Compile, build and test your changes.

You should understand and be responsible for all object changes.

Make sure you read Documentation/SubmittingPatches before sending
any changes to reviewers, maintainers or mailing lists.
EOT
}

Joe,

A suggestion: please take those damn EOT lines out of it. They are absolutely
not required. Or did you put them in for your own purposes? I take it this is
not the final product. Anyway, it would be good if they were not there.
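
For what it's worth, the script already has a $quiet variable that nothing ever sets. One way the warning could be made optional instead of removed is sketched below; the --quiet flag and the parse_args helper are assumptions for illustration, not something the posted script provides.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Parse a hypothetical --quiet/-q flag; whatever is left are the
# paths or files that would be handed to git ls-files.
sub parse_args {
    my (@argv) = @_;
    my $quiet = 0;
    GetOptionsFromArray(\@argv, 'quiet|q' => \$quiet)
        or die "usage: $0 [--quiet] [paths|files]\n";
    return ($quiet, @argv);
}

my ($quiet, @paths) = parse_args('--quiet', 'drivers/net');
print "quiet=$quiet paths=@paths\n";
```

With $quiet set this way, the existing 'if ($modified && !$quiet)' test would skip the EOT text when the flag is given.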

Yup, I do like the "individual option" stuff, so you can mess around with a
single thing rather than the whole lot.

~Bhaskar
