[RFC] scripts: kernel-doc: improve parsing for kernel-doc comments syntax

From: Aditya Srivastava
Date: Wed Apr 14 2021 - 15:25:45 EST


Currently kernel-doc does not identify some cases of probable kernel
doc comments, for e.g. pointer used as declaration type for identifier,
space separated identifier, etc.

Some example of these cases in files can be:
i)" * journal_t * jbd2_journal_init_dev() - creates and initialises a journal structure"
in fs/jbd2/journal.c

ii) "* dget, dget_dlock - get a reference to a dentry" in
include/linux/dcache.h

iii) " * DEFINE_SEQLOCK(sl) - Define a statically allocated seqlock_t"
in include/linux/seqlock.h

Also improve identification for non-kerneldoc comments. For e.g.,

i) " * The following functions allow us to read data using a swap map"
in kernel/power/swap.c does follow the kernel-doc like syntax, but the
content inside does not adheres to the expected format.

Improve parsing by adding support for these probable attempts to write
kernel-doc comment.

Suggested-by: Jonathan Corbet <corbet@xxxxxxx>
Link: https://lore.kernel.org/lkml/87mtujktl2.fsf@xxxxxxxxxxxx
Signed-off-by: Aditya Srivastava <yashsri421@xxxxxxxxx>
---
scripts/kernel-doc | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index 888913528185..37665aa41e6b 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -2110,17 +2110,25 @@ sub process_name($$) {
} elsif (/$doc_decl/o) {
$identifier = $1;
my $is_kernel_comment = 0;
- if (/^\s*\*\s*([\w\s]+?)(\(\))?\s*([-:].*)?$/) {
+ my $decl_start = qr{\s*\*};
+ my $fn_type = qr{\w+\s*\*\s*}; # i.e. pointer declaration type, foo * bar() - desc
+ my $parenthesis = qr{\(\w*\)};
+ my $decl_end = qr{[-:].*};
+ if (/^$decl_start\s*([\w\s]+?)$parenthesis?\s*$decl_end?$/) {
$identifier = $1;
- $decl_type = 'function';
- $identifier =~ s/^define\s+//;
- $is_kernel_comment = 1;
}
if ($identifier =~ m/^(struct|union|enum|typedef)\b\s*(\S*)/) {
$decl_type = $1;
$identifier = $2;
$is_kernel_comment = 1;
}
+ elsif (/^$decl_start\s*$fn_type?(\w+)\s*$parenthesis?\s*$decl_end?$/ || # i.e. foo()
+ /^$decl_start\s*$fn_type?(\w+.*)$parenthesis?\s*$decl_end$/) { # i.e. static void foo() - description; or misspelt identifier
+ $identifier = $1;
+ $decl_type = 'function';
+ $identifier =~ s/^define\s+//;
+ $is_kernel_comment = 1;
+ }
$identifier =~ s/\s+$//;

$state = STATE_BODY;
--
2.17.1