Re: [PATCH] docs: license-rules.txt: cover SPDX headers on Python scripts

From: Mauro Carvalho Chehab
Date: Thu Sep 05 2019 - 06:50:35 EST


On Thu, 5 Sep 2019 11:27:03 +0200
Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:

> On Thu, Sep 05, 2019 at 06:23:13AM -0300, Mauro Carvalho Chehab wrote:
> > The author of the license-rules.rst file wanted to be very strict
> > with regards to the location of the SPDX header. It says that
> > the SPDX header "shall be added at the first possible line in
> > a file which can contain a comment". Not happy with this already
> > restrictive requirement, it goes further:
> >
> > "For the majority of files this is the first line, except for
> > scripts", opening an exception to have the SPDX header at the
> > second line, if the first line starts with "#!".
> >
> > Well, it turns out that this is too restrictive for Python scripts,
> > and may cause regressions if it is enforced.
> >
> > As mentioned on:
> > https://stackoverflow.com/questions/728891/correct-way-to-define-python-source-code-encoding
> >
> > Python's PEP-263 [1] dictates that a script that needs to default to
> > UTF-8 encoding has to follow this rule:
> >
> > 'Python will default to ASCII as standard encoding if no other
> > encoding hints are given.
> >
> > To define a source code encoding, a magic comment must be placed
> > into the source files either as first or second line in the file'
> >
> > And:
> > 'More precisely, the first or second line must match the following
> > regular expression:
> >
> > ^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)'
> >
> > [1] https://www.python.org/dev/peps/pep-0263/
> >
> > If a script has both "#!" and the charset encoding line, we can't place
> > an SPDX tag without either violating license-rules.rst or breaking the
> > script by making it crash on non-ASCII characters.
> >
> > So, add a short notice saying that, for Python scripts, the SPDX
> > header may be as late as the third line, in order to cover the case
> > where both "#!" and "# .*coding.*UTF-8" lines are found.
> >
> > Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@xxxxxxxxxx>
> > ---
> > Documentation/process/license-rules.rst | 7 +++++--
> > 1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/process/license-rules.rst b/Documentation/process/license-rules.rst
> > index 2ef44ada3f11..5d23e3498b1c 100644
> > --- a/Documentation/process/license-rules.rst
> > +++ b/Documentation/process/license-rules.rst
> > @@ -64,9 +64,12 @@ License identifier syntax
> > possible line in a file which can contain a comment. For the majority
> > of files this is the first line, except for scripts which require the
> > '#!PATH_TO_INTERPRETER' in the first line. For those scripts the SPDX
> > - identifier goes into the second line.
> > + identifier goes into the second line\ [1]_.
> >
> > -|
> > +.. [1] Please notice that Python scripts may also need an encoding rule
> > + as defined on PEP-263, which should be defined either at the first
> > + or the second line. So, for such scripts, the SPDX identifier may
> > + go up to the third line.
> >
> > 2. Style:
> >
>
> If you are going to do this, can you also fix up scripts/spdxcheck.py to
> properly catch this,

Hmm... by default it scans the first 15 lines:

    ap.add_argument('-m', '--maxlines', type=int, default=15,
                    help='Maximum number of lines to scan in a file. Default 15')

So, I guess it won't require any changes.
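
To illustrate why a 15-line window already covers a tag on the third
line, here is a minimal sketch. It is not the actual scripts/spdxcheck.py
code: the helper name find_spdx_line and the plain substring match are
assumptions made only for this example.

```python
import io

def find_spdx_line(fd, maxlines=15):
    """Return the 1-based line number of the first SPDX tag, or None.

    Hypothetical helper: only the first `maxlines` lines are examined.
    """
    for lineno, line in enumerate(fd, start=1):
        if lineno > maxlines:
            break
        if "SPDX-License-Identifier:" in line:
            return lineno
    return None

# A script with "#!", the coding comment, and then the SPDX tag.
script = io.StringIO(
    "#!/usr/bin/env python\n"
    "# -*- coding: utf-8 -*-\n"
    "# SPDX-License-Identifier: GPL-2.0\n"
    "print('hello')\n"
)
print(find_spdx_line(script))
```

With the tag on line 3, the scan finds it well inside the default
window, so moving the tag down by one line should not need a change in
the line limit.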

> as well as fixing up the location of the spdx tag
> line in the file itself?

Good point. I'll write a patch fixing the SPDX location in the three
files where the coding line is in the wrong place.
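
A rough sketch of how such files could be spotted, assuming "wrong
place" means a coding comment that sits past the second line, where
PEP-263 no longer honours it. The regular expression is the one quoted
from PEP-263 in the patch description; the function names are
hypothetical and exist only for this example.

```python
import re

# Regular expression quoted from PEP-263 in the patch description.
CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")

def coding_line_number(lines):
    """Return the 1-based line number of the first coding comment, or None."""
    for lineno, line in enumerate(lines, start=1):
        if CODING_RE.match(line):
            return lineno
    return None

def coding_line_misplaced(lines):
    """True if a coding comment exists but sits past line 2."""
    lineno = coding_line_number(lines)
    return lineno is not None and lineno > 2

# SPDX tag on line 2 pushes the coding comment down to line 3.
bad = [
    "#!/usr/bin/env python",
    "# SPDX-License-Identifier: GPL-2.0",
    "# -*- coding: utf-8 -*-",
]
# Layout allowed by the proposed footnote: coding on line 2, SPDX on line 3.
good = [
    "#!/usr/bin/env python",
    "# -*- coding: utf-8 -*-",
    "# SPDX-License-Identifier: GPL-2.0",
]
print(coding_line_misplaced(bad), coding_line_misplaced(good))
```

Since the sketch matches any comment containing "coding:" or "coding=",
it is best fed only the first few lines of each file to avoid false
hits further down.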

>
> thanks,
>
> greg k-h



Thanks,
Mauro