Re: [PATCH 0/3] Fix dt-validate issues on qemu dtbdumps due to dt-bindings

From: Conor.Dooley
Date: Mon Aug 15 2022 - 17:18:56 EST


Any takers on trashing my regex? Otherwise I'll just submit
a v2 with the regex and it can be shat on there instead :)

On 09/08/2022 19:36, Conor Dooley wrote:
> On 09/08/2022 15:14, Rob Herring wrote:
>> On Mon, Aug 08, 2022 at 10:01:11PM +0000, Conor.Dooley@xxxxxxxxxxxxx wrote:
>>> On 08/08/2022 22:34, Jessica Clarke wrote:
>>>> On Fri, Aug 05, 2022 at 05:28:42PM +0100, Conor Dooley wrote:
>>>>> From: Conor Dooley <conor.dooley@xxxxxxxxxxxxx>
>>>>> The final patch adds some new ISA strings
>>>>> which needs scruitiny from someone with more knowledge about what ISA
>>>>> extension strings should be reported in a dt than I have.
>>>>
>>>> Listing every possible ISA string supported by the Linux kernel really
>>>> is not going to scale...
>>
>> How does the kernel scale? (No need to answer)
>>
>>> Yeah, totally correct there. Case for adding a regex I suppose, but I
>>> am not sure how to go about handling the multi-letter extensions or
>>> if parsing them is required from a binding compliance point of view.
>>> Hoping for some input from Palmer really.
>>
>> Yeah, looks like a regex pattern is needed.
>
> I started pottering away at this but I have arrived at:
> rv64imaf?d?c?h?(_z[imafdqcbvkh]([a-z])*)*$
>
> I suspect that before "h?" there should be more single letter
> extensions added for completeness sake. So then it'd bloat out to:
> rv64imaf?d?q?c?b?v?k?h?(_z[imafdqcbvkh]([a-z])*)*$
>
> I checked a couple different "bad" isa strings against it and
> nothing went up in flames but my regex skills are far from great
> so I'm sure there's better ways to represent this.
>
> Anyways, this pattern is based on my understanding that:
> - the single letter order is fixed & we don't care about things that
> can't even do "ima"
> - the multi letter extensions are all in a "_z<foo>" format where the
> first letter of <foo> is a valid single letter extension
> - we don't care about the e extension from an OS PoV (this could be a
> very flawed take...)
> - after the first two chars, the extension name could be an english
> word (ifencei anyone?) so it's not worth restricting the charset
> - that attempting to validate the contents of the multiletter extensions
> with dt-validate beyond the formatting is a futile, massively verbose
> or unwieldy exercise at best
>
> Some or all of those assumptions could be very very wrong so if {someone,
> anyone} wants to correct me - feel ***more*** than free..
>
> Thanks,
> Conor.
>
> patch would then look like:
>
> diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml b/Documentation/devicetree/bindings/riscv/cpus.yaml
> index d632ac76532e..1e54e7746190 100644
> --- a/Documentation/devicetree/bindings/riscv/cpus.yaml
> +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
> @@ -74,9 +74,7 @@ properties:
> insensitive, letters in the riscv,isa string must be all
> lowercase to simplify parsing.
> $ref: "/schemas/types.yaml#/definitions/string"
> - enum:
> - - rv64imac
> - - rv64imafdc
> + pattern: rv64imaf?d?q?c?b?v?k?h?(_z[imafdqcbvkh]([a-z])*)*$
>
> # RISC-V requires 'timebase-frequency' in /cpus, so disallow it here
> timebase-frequency: false