Kernel config revisited (was: Re: _My_ turn for a 2.1 wishlist)

Werner Almesberger (almesber@lrc.di.epfl.ch)
Fri, 2 Aug 1996 21:52:37 +0200 (MET DST)


Kai Schulte wrote:
> Why not write a utility program? Hunting through the whole source tree
> once to find all the #ifdefs and check for corresponding #includes and then
> doing a second pass, finding a decent place for insertions and doing the
> actual edit may take quite some time.

I've just written two little Perl scripts to "sanitize" the dependency
graph. The first one walks the source tree and determines which files use
which configuration options. (It also reads the config tree so that it can
filter out unused options.) Then it clusters the options such that each
configuration change affects at most a configurable number of unrelated
files, and writes the clusters to files in include/config (the file names
are chosen automatically, based on which file is the most frequent user of
those options). Finally, it walks the source tree a second time, adding the
new includes after #include <linux/config.h> and throwing out
#include <linux/autoconf.h>.
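
To illustrate the first pass on its own, here is a minimal sketch of the
per-file scan, assuming the file contents are already in memory. The name
scan_source and the sample input are hypothetical and not taken from the
attached script; like the script, it only looks at the first CONFIG_ name on
a line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: given the text of one source file and a hash of the
# option names declared in the config files, return the options the file
# uses and the files it includes.
sub scan_source {
    my ($text, $valid) = @_;
    my (%opts, %incs);
    for my $line (split /\n/, $text) {
        if ($line =~ /^\s*#\s*include\s+([<"])([^<">]+)[>"]/) {
            # <...> includes live under include/, "..." ones are kept as is
            $incs{ $1 eq "<" ? "include/$2" : $2 } = 1;
        }
        elsif ($line =~ /\bCONFIG_(\w+)/) {
            # ignore names that the config files do not declare
            $opts{$1} = 1 if $valid->{"CONFIG_$1"};
        }
    }
    return ([sort keys %opts], [sort keys %incs]);
}

# Made-up example input
my %valid = (CONFIG_INET => 1, CONFIG_PCI => 1);
my $text = join "\n",
    '#include <linux/config.h>',
    '#include "local.h"',
    '#ifdef CONFIG_INET',
    '#endif',
    '#ifdef CONFIG_BOGUS',
    '#endif';
my ($opts, $incs) = scan_source($text, \%valid);
print "options:  @$opts\n";
print "includes: @$incs\n";
```

CONFIG_BOGUS is dropped here because it is not in the (made-up) list of
declared options.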

The second script updates the files in include/config/ from
include/linux/autoconf.h (this takes only a fraction of a second on a
reasonably fast machine). A file is only replaced if one of its settings
actually changed.
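
The "replace only if changed" idea can be sketched as follows. This is a
simplified illustration, not the attached script: update_if_changed is a
hypothetical name, and it compares whole file contents, whereas popcfg.pl
compares individual settings and treats "#undef" like a missing entry.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: replace $path with $content only if it differs, so an
# unchanged header keeps its timestamp and make leaves its users alone.
# Returns 1 if the file was rewritten, 0 otherwise.
sub update_if_changed {
    my ($path, $content) = @_;
    if (open(my $in, "<", $path)) {
        local $/;                       # slurp the whole file
        my $old = <$in>;
        close($in);
        return 0 if defined $old && $old eq $content;
    }
    my $tmp = "$path~";
    open(my $out, ">", $tmp) or die "create $tmp: $!";
    print {$out} $content or die "write $tmp: $!";
    close($out) or die "close $tmp: $!";
    rename($tmp, $path) or die "rename $tmp to $path: $!";
    return 1;
}

# Try it on a scratch directory with a made-up header name
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/net.h";
my $first  = update_if_changed($file, "#define CONFIG_INET 1\n");
my $second = update_if_changed($file, "#define CONFIG_INET 1\n");
my $third  = update_if_changed($file, "#undef CONFIG_INET\n");
print "$first $second $third\n";
```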

The clustering algorithm uses a few simple heuristics to minimize the
number of clusters it generates (one could also consider different goals,
e.g. to put an upper bound on the number of config options covered by a
config file), but I didn't try too hard to make it fast. (The whole
splitting process takes a bit more than one minute on a P5 at 120 MHz.)
Also the set operations on associative arrays are admittedly ugly ...
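
For what it's worth, the set operations in question can be written fairly
compactly with hash references. The sketch below is a simplified pairwise
variant of the merge cost and only meant as an illustration: the function
names and the file lists are made up, and the attached script additionally
accumulates a "badness" count over all merges into one cluster.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sets are represented as hashes whose keys are the members.
sub make_set { my %s = map { $_ => 1 } @_; return \%s; }

# Members of $x that are not in $y.
sub difference {
    my ($x, $y) = @_;
    return [ sort grep { !$y->{$_} } keys %$x ];
}

# Members of either set.
sub union {
    my ($x, $y) = @_;
    my %u = (%$x, %$y);
    return [ sort keys %u ];
}

# Hypothetical cost of putting two options into one cluster: the larger of
# the two one-sided differences, i.e. the number of files that would be
# rebuilt for a change to an option they don't use.
sub merge_cost {
    my ($x, $y) = @_;
    my $only_x = @{ difference($x, $y) };
    my $only_y = @{ difference($y, $x) };
    return $only_x > $only_y ? $only_x : $only_y;
}

# Made-up example: the files using each of two options
my $inet = make_set(qw(net/socket.c net/ipv4/tcp.c net/ipv4/udp.c));
my $ipx  = make_set(qw(net/socket.c net/ipx/af_ipx.c));
my $u    = union($inet, $ipx);
my $cost = merge_cost($inet, $ipx);
print "union: @$u\n";
print "cost:  $cost\n";
```

A merge would then be accepted only while the cost stays within the
configured limit.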

I've made a few small tests and the scripts seem to work reasonably well
(i.e. the kernel compiled :-), but please don't run them on some source
tree you don't have a backup of. The splitter script might also be useful
for analyzing properties like "dead" config options, because it gathers
such information.
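
As a rough idea of what such an analysis might look like, the following
sketch lists options that are declared but not used by any file. The helper
name, the data layout and the option names are assumptions made for the
example; the attached script keeps this information in its own global
tables.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: $declared is a list of option names found in the
# config files, $used_by maps an option name to the list of files using it.
# Returns the declared options that no file uses.
sub dead_options {
    my ($declared, $used_by) = @_;
    return [ sort grep { !@{ $used_by->{$_} || [] } } @$declared ];
}

# Made-up example data
my @declared = qw(INET IPX OLD_DRIVER);
my %used_by  = (
    INET => [ "net/socket.c", "net/ipv4/tcp.c" ],
    IPX  => [ "net/ipx/af_ipx.c" ],
);
my $dead = dead_options(\@declared, \%used_by);
print "dead options: @$dead\n";
```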

I've attached the two Perl scripts (popcfg.pl and splitcfg.pl).

I probably won't have time to play a lot more with these toys, so if
somebody wants to turn them into something useful, please feel
encouraged ...

- Werner

---------------------------------- popcfg.pl ----------------------------------

#!/usr/bin/perl

$CDIR = "include/config";
$AUTOCONF = "include/linux/autoconf.h";

open(FILE,"$AUTOCONF") || die "open $AUTOCONF: $!";
while (<FILE>) { $opt{$1} = $_ if /\bCONFIG_(\w+)/; }
close(FILE);

opendir(DIR,$CDIR) || die "open $CDIR: $!";
for (readdir(DIR)) {
    local ($old) = $CDIR."/".$_;
    next unless -f $old;
    local ($new) = $CDIR."/".$_."~";
    open(IN,"$old") || die "open $old: $!";
    open(OUT,">$new") || die "create $new: $!";
    undef @opts;
    # copy the header comment and collect the option names listed in it
    while (<IN>) {
        print OUT $_ or die "write $new: $!";
        last if /\*\//;
        next if /Auto/;
        chop;
        for (split(" ")) {
            next if $_ eq "" || $_ eq "*";
            push(@opts,$_);
        }
    }
    # remember the old settings
    while (<IN>) { $was{$1} = $_ if /\bCONFIG_(\w+)/; }
    close(IN);
    print OUT "\n" or die "write $new: $!";
    $changed = 0;
    for (@opts) {
        next unless defined $opt{$_};
        print OUT $opt{$_} or die "write $new: $!";
        # "#undef" and "not mentioned" count as the same setting
        delete $opt{$_} if $opt{$_} =~ /#undef/;
        delete $was{$_} if $was{$_} =~ /#undef/;
        $changed = 1 if $opt{$_} ne $was{$_};
    }
    close(OUT) || die "close $new: $!";
    # only replace the file if a setting has changed
    if ($changed) {
        rename($new,$old) || die "rename $new to $old: $!";
    }
    else {
        unlink($new);
    }
}

--------------------------------- splitcfg.pl ---------------------------------

#!/usr/bin/perl

# This script splits include/linux/autoconf.h into smaller header files in
# $CDIR. These header files contain sets of configuration variables such that
# each change to a configuration variable doesn't affect more than $LIMIT
# unrelated .c files. After creating the header files, the affected source
# files are modified to use the new scheme.
#
# Unused configuration variables are automatically discarded and dependencies
# introduced by header files (e.g. #include <linux/fs.h>) are considered.
#
# Note: this script must be started from the top-level directory of the kernel
# source tree. Run make whateverconfig and scripts/popcfg.pl
# afterwards to copy the settings to the new config files.

$LIMIT = 10;
$CDIR = "include/config";
$ICDIR = "config";

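# Terminator for progress messages. The first setting redraws them in place
# on an ANSI terminal; the second one, which takes effect, prints one message
# per line.
$clr = "\e[0J\r";
$clr = "\n";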

sub get_arch
{
    open(FILE,"Makefile") || die "can't open Makefile: $!";
    while (<FILE>) {
        if (/^ARCH = (\S+)/) {
            $ARCH = $1;
            return;
        }
    }
    die "can't determine architecture";
}

sub scan_config
{
    local ($path) = @_;

    print STDERR "Reading $path$clr";
    local (@cmd,@src);
    open(FILE,$path) || die "can't open $path: $!";
    while (<FILE>) {
        chop;
        # join continuation lines
        while (/\\$/) {
            chop;
            $_ .= <FILE>;
            chop;
        }
        s/#.*//;
        s/^\s*//;
        s/\s+/ /g;
        next if $_ eq "";
        # replace blanks inside quotes by "~" so that split keeps them together
        while (/'([^']*) ([^']*)'/) { $_ = $`."'".$1."~".$2."'".$'; }
        while (/"([^"]*) ([^"]*)"/) { $_ = $`.'"'.$1."~".$2.'"'.$'; }
        @cmd = split(" ");
        if ($cmd[0] eq "source") { push(@src,$cmd[1]); }
        elsif ($cmd[0] eq "bool" || $cmd[0] eq "tristate" ||
          $cmd[0] eq "dep_tristate" || $cmd[0] eq "int" || $cmd[0] eq "hex") {
            $valid{$cmd[2]} = 1;
        }
        elsif ($cmd[0] eq "define_bool") {
            $valid{$cmd[1]} = 1;
        }
        elsif ($cmd[0] eq "choice") {
            # $cmd[2] is the quoted list "label~CONFIG_X~label~CONFIG_Y";
            # strip the quotes and mark every second item (the option name)
            $cmd[2] =~ s/^["']//;
            $cmd[2] =~ s/["']$//;
            local ($second) = 0;
            for (split("~",$cmd[2])) {
                next if $_ eq "";
                if ($second) { $valid{$_} = 1; }
                $second = !$second;
            }
        }
        elsif ($cmd[0] ne "if" && $cmd[0] ne "fi" && $cmd[0] ne "else" &&
          $cmd[0] ne "mainmenu" && $cmd[0] ne "mainmenu_name" &&
          $cmd[0] ne "mainmenu_option" && $cmd[0] ne "endmenu" &&
          $cmd[0] ne "comment" && $cmd[0] ne "\$MAKE" && $cmd[0] !~ /=/ &&
          $cmd[0] ne "echo") {
            die "don't grok $cmd[0] (in $path)";
        }
    }
    close(FILE);
    for (@src) { &scan_config($_); }
}

sub scandir
{
    local ($dir) = @_;

    opendir(DIR,"$dir.") || die "can't open $dir: $!";
    print STDERR "Scanning $dir$clr";
    local (@names) = readdir(DIR);
    closedir(DIR);
    for (@names) {
        next if $_ eq "." || $_ eq "..";
        local ($path) = $dir.$_;
        local (@subdirs);
        stat($path) || die "can't stat $path: $!";
        push(@subdirs,$path."/") if -d _;
        if (-f _ && /\.[chsS]$/) {
            local (%t_inc,%t_opt);
            open(FILE,$path) || die "can't open $path: $!";
            while (<FILE>) {
                if (/^\s*#\s*include\s+([<"])([^<"]+)[>"]/) {
                    $t_inc{$1 eq "<" ? "include/$2" : $2} = 1;
                    # paranoia - usually ppl include only once
                }
                elsif (/\bCONFIG_(\w+)/) {
                    $t_opt{$1} = 1 if $valid{"CONFIG_".$1};
                }
            }
            close(FILE);
            next unless keys %t_opt || $t_inc{"include/linux/autoconf.h"};
            delete $t_inc{"include/linux/autoconf.h"};
            push(@files,$path);
            $inc{$path} = join(" ",keys %t_inc) if keys %t_inc;
            $opt{$path} = join(" ",keys %t_opt);
            $file_opt{$path} = $opt{$path};
        }
        for (@subdirs) { &scandir($_); }
    }
}

sub flatten
{
    local ($busy,$progress) = 1;

    while ($busy) {
        $busy = $progress = 0;
        file: for $path (keys %inc) {
            for (split(" ",$inc{$path})) {
                if (defined $inc{$_}) {
                    $busy = 1;
                    next file;
                }
            }
            # okay, this one is fully resolved
            $progress = 1;
            local (%t_opt);
            # merge the file's own options with those of all files it
            # includes; splitting the lists first avoids duplicate entries
            for (split(" ",$opt{$path})) { $t_opt{$_} = 1; }
            for $hdr (split(" ",$inc{$path})) {
                for (split(" ",$opt{$hdr})) { $t_opt{$_} = 1; }
            }
            $opt{$path} = join(" ",keys %t_opt);
            delete $inc{$path};
        }
        die "include loop (".join(" ",keys %inc).")" unless $progress;
    }
}

sub reverse
{
    for $path (keys %opt) {
        for (split(" ",$opt{$path})) {
            if ($path =~ /\.c$/) {
                if (defined $map{$_}) {
                    $map{$_} .= " ".$path;
                }
                else {
                    $map{$_} = $path;
                }
            }
        }
    }
    for (values %map) {
        $_ = join(" ",sort(split(" ",$_)));
    }
}

sub cluster
{
    # options ordered by the number of .c files using them, largest first
    local (@i) =
      sort { split(" ",$map{$b}) <=> split(" ",$map{$a}) } keys(%map);
    # initially, each option is a cluster of its own
    for (keys %map) { $cluster{$_} = $_; }
    for $a (0..$#i) {
        next unless defined($cluster{$i[$a]});
        print STDERR "Aggregating $i[$a] ...$clr";
        # %super is the set of files affected by the growing cluster
        local (%super);
        for (split(" ",$map{$i[$a]})) { $super{$_} = 1; }
        $badness = 0;
        while (1) {
            # find the remaining option that is cheapest to merge
            local ($best);
            local ($best_diff) = $LIMIT+1;
            for $b ($a+1..$#i) {
                next unless defined($cluster{$i[$b]});
                local ($diff,$bad) = (0,$badness);
                local (%this);
                for (split(" ",$map{$i[$b]})) {
                    $this{$_} = 1;
                    $bad ++ unless $super{$_};
                }
                for (keys %super) { $diff++ unless $this{$_}; }
                $diff = $bad if $bad > $diff;
                if ($diff < $best_diff) {
                    $best_diff = $diff;
                    $best = $b;
                }
            }
            last if $best_diff > $LIMIT;
            # merge it and add its files to the cluster's set
            $cluster{$i[$a]} .= " ".$i[$best];
            delete $cluster{$i[$best]};
            for (split(" ",$map{$i[$best]})) {
                if (!$super{$_}) {
                    $super{$_} = 1;
                    $badness++;
                }
            }
        }
    }
}

sub name
{
    local (%inuse);

    for $cluster (keys %cluster) {
        local (%count);
        for $option (split(" ",$cluster{$cluster})) {
            for (split(" ",$map{$option})) {
                $count{$_}++;
            }
        }
        for (sort { $count{$b} <=> $count{$a} } keys %count) {
            local ($n) = $_;
            $n =~ s/^.*\///;
            $n =~ s/\..*//;
            if (!$inuse{$n}) {
                $name{$cluster} = $n;
                $inuse{$n} = 1;
                last;
            }
        }
        $name{$cluster} = $cluster unless defined $name{$cluster};
        for (split(" ",$cluster{$cluster})) {
            $mapped{$_} = $name{$cluster}.".h";
        }
    }
}

sub create
{
    for (keys %cluster) {
        local ($n) = $CDIR."/".$name{$_}.".h";
        print STDERR "Creating $n$clr";
        open(FILE,">$n") || die "can't create $n: $!";
        # "or" instead of "||": the latter would bind to the string printed
        print FILE "/*\n" or die "write $n: $!";
        print FILE " * Automatically generated: don't edit\n" or
          die "write $n: $!";
        print FILE " *\n" or die "write $n: $!";
        local ($line) = " *";
        for (sort split(" ",$cluster{$_})) {
            if (length($line)+length($_) > 75) {
                print FILE "$line\n" or die "write $n: $!";
                $line = " *";
            }
            $line .= " ".$_;
        }
        print FILE "$line\n" or die "write $n: $!";
        print FILE " */\n" or die "write $n: $!";
        close(FILE) || die "can't close $n: $!";
    }
}

sub edit
{
    for $path (@files) {
        print STDERR "Editing $path$clr";
        open(IN,"$path") || die "can't open $path: $!";
        local ($new) = "$path~";
        # Uncomment the next line if you're afraid of file name truncation
        # die "$new exists" if -f $new;
        open(OUT,">$new") || die "can't create $new: $!";
        while (<IN>) {
            if (/^\s*#\s*include\s+([<"])([^<"]+)[>"]/) {
                if ($2 ne "linux/autoconf.h") {
                    print OUT $_ or die "write $new: $!";
                }
                next unless $2 eq "linux/autoconf.h" || $2 eq "linux/config.h";
                # add one include per config file this source file needs
                local (%map);
                undef %map;
                for (split(" ",$file_opt{$path})) {
                    $map{$mapped{$_}} = 1;
                }
                for (keys %map) {
                    print OUT "#include <$ICDIR/$_>\n" or
                      die "write $new: $!";
                }
            }
            else {
                print OUT $_ or die "write $new: $!";
            }
        }
        close(OUT) || die "close $new: $!";
        close(IN);
        rename($new,$path) || die "renaming $new to $path: $!";
    }
}

&get_arch;
&scan_config("arch/$ARCH/config.in");
&scandir("");
print STDERR "Flattening ...$clr";
&flatten;
print STDERR "Building reversed map ...$clr";
&reverse;
&cluster;
&name;
mkdir($CDIR,0755);
&create;
&edit;
print STDERR "$clr";

-- 
  _________________________________________________________________________
 / Werner Almesberger, DI-LRC,EPFL,CH   werner.almesberger@lrc.di.epfl.ch /
/_IN_R_133__Tel_+41_21_693_6621__Fax_+41_21_693_6610_____________________/