Re: [RFC v3] scripts: kernel-doc: fix typedef support for struct/union parsing

From: Aditya
Date: Sun Mar 07 2021 - 02:37:04 EST


On 6/3/21 8:50 pm, Matthew Wilcox wrote:
> On Sat, Mar 06, 2021 at 01:18:38PM +0530, Aditya wrote:
>> On 6/3/21 11:55 am, Lukas Bulwahn wrote:
>>> I agree. That might be a suitable clean-up to keep the code for
>>> functions and struct/union parsing similar in style/spirit.
>>>
>>> Aditya, would you like to create a patch for that?
>>
>> Sure Lukas.
>> I have a doubt though, Can't we use a single expression separated by
>> "|" here, instead of multiple lines? i.e.,
>>
>> $x =~
>> s/__packed|__aligned|____cacheline_aligned_in_smp|____cacheline_aligned|__attribute__\s*\(\([a-z0-9,_\s\(\)]*\)\)\s*//;
>>
>>
>> Probably we could do something similar for dump_function, i.e.,
>> - $prototype =~ s/^static +//;
>> - $prototype =~ s/^extern +//;
>> - $prototype =~ s/^asmlinkage +//;
>> - $prototype =~ s/^inline +//;
>> - $prototype =~ s/^__inline__ +//;
>> - $prototype =~ s/^__inline +//;
>> - $prototype =~ s/^__always_inline +//;
>> - $prototype =~ s/^noinline +//;
>>
>> + $prototype =~
>> s/^(?:static|extern|asmlinkage|__?inline__?|__always_inline|noinline) +//;
>> And so on for other regexps.
>>
>> What do you think?
>
> I think there's a tradeoff between speed / compactness and readability.
> As someone who doesn't know perl particularly well, I can look at the
> series of lines and say "Oh, it's stripping out these unwanted things".
> Your one line, while undoubtedly more efficient, is considerably less
> easy to understand.
>
> Maybe there's another way to do it that's more efficient while not
> sacrificing the readability?
>
> Also, would your suggestion work for 'static inline void foo(void)'?
> I think it needs to remove multiple occurrences of the things in the
> regex.

Ah, I get it.

But maybe that's what the ?: on the beginning is for?
>

No, "?:" is just to use this regex for matching, without capturing.
So, the regex will just remove any of those 'starting' occurrences,
consequently, "static inline" occurrence will probably not be removed.
I think the reason for using multiple lines for substitution in
dump_function is for the same reason, ie, subsequent substitution.

But for dump_struct, it is probably not desired, i.e., subsequent
substitution/removal. Also, for the same reason, using it with
dump_struct may cause unwelcomed discrepancies, or cause the user to
understand that here multiple substitutions are required, but is not so.
So, I think that we should use a single expression substitution for
dump_struct, at best. But am not sure if that is required, as
currently also we are not capturing this part of the regex, as well
as, matching only at certain position of the definition expression.

But I am curious to what others think about this.
Will be happy to work on this patch :)

Thanks
Aditya