Re: [RFC 0/2] Add a new translation tool scripts/trslt.py

From: Federico Vaga
Date: Tue Apr 13 2021 - 19:27:38 EST


Hi,

Yes, you are touching on a good point where things can be improved. I admit that
I have only had a very quick look at the code so far, so perhaps I'm missing
something. However, let me give you my two cents based on what I usually do.

I do not like the idea of adding tags to the file and having tools to modify it.
I would prefer to keep the text as clean as possible.

Instead, what can be done without manipulating the text file is something like
this:

# Take the commit ID of the last time a document was translated
LAST_TRANS=$(git log -n 1 --oneline Documentation/translations/<lang>/<path-to-file> | cut -d " " -f 1)

# Take the history of the same file in the main Documentation tree
git log --oneline $LAST_TRANS..doc/docs-next Documentation/<path-to-file>

This will give you the list of commits that changed <path-to-file>, and that
probably need to be translated. The problem with this approach is that by the
time you submit a translation, other people may have changed the very same
files. The correctness of this approach depends on the patch order in docs-next,
and this can't be guaranteed.
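
As a rough sketch, the two commands above could be wrapped in a small Python
helper. The names `run_git` and `pending_commits` are made up for illustration
and are not part of trslt.py; `doc/docs-next` is assumed to be the name of the
upstream ref, as in the commands above.

```python
import subprocess


def run_git(args):
    """Run git with the given arguments and return its stdout as text."""
    result = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def pending_commits(translated, original, upstream="doc/docs-next", git=run_git):
    """List upstream commits that touched `original` after the last commit
    that touched `translated`, as `git log --oneline` lines.

    `git` is injectable so the logic can be exercised without a repository.
    """
    # Abbreviated ID of the last commit that changed the translation
    last = git(["log", "-n", "1", "--format=%h", "--", translated]).strip()
    if not last:
        return []  # the translation has no history yet
    # History of the original document from that commit up to the upstream ref
    log = git(["log", "--oneline", f"{last}..{upstream}", "--", original])
    return log.splitlines()
```

It has the same limitation as the shell version: the result is only as
reliable as the patch order in docs-next.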

So, instead of relying on LAST_TRANS, I rely on a special git branch that acts
as a marker. But this works only for me and not for other translators of the
same language, so you can run into trouble in this case as well.

What we can actually do is to exploit the git commit message to store the tag
you mentioned. Hence, we can get the last ID with something like this:

# Find the tag line in the last commit message and keep only the commit ID
LAST_ID=$(git log -n 1 Documentation/translations/<lang>/<path-to-file> | grep -oE "Translated-on-top-of: commit [0-9a-f]{12}" | cut -d " " -f 3)

The ID we store in the tag does not need to be the commit ID of the last change
to <path-to-file>, but just the commit you were on when you did the
translation. This is because it simplifies the management of this tag when
translating multiple files/patches in a single patch (to avoid spamming the
mailing list with dozens of small patches).
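
To illustrate, here is a minimal Python sketch that pulls such a tag out of a
commit message. The tag name "Translated-on-top-of" is only the proposal made
above, not an existing convention, and the function name is hypothetical.
Accepting IDs from 12 to 40 hex digits is my assumption, so that both
abbreviated and full commit IDs would work.

```python
import re

# Assumed tag format: "Translated-on-top-of: commit <12 to 40 hex digits>"
TAG_RE = re.compile(
    r"^\s*Translated-on-top-of: commit ([0-9a-f]{12,40})\s*$", re.MULTILINE
)


def translated_on_top_of(commit_message):
    """Return the commit ID stored in the tag, or None if the tag is absent."""
    match = TAG_RE.search(commit_message)
    return match.group(1) if match else None
```

The input would be the message of the last commit that touched the translated
file, for example the output of `git log -n 1 --format=%B -- <file>`.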

On Mon, Apr 12, 2021 at 03:04:03PM +0800, Wu XiangCheng wrote:
Hi all,

This set of patches aims to add a new translation tool - trslt.py, which
can track which version of the source files a translation corresponds to.

For a long time, kernel documentation translations have lacked a way to track
the version of the source files they correspond to. If you translate a file and
then someone updates the source file, there will be a problem. It's hard to know
which version the existing translation corresponds to, and even harder to sync
them.

The common way now is to check the date, but this is not exactly accurate,
especially for documents that are often updated. Some translators write the
corresponding commit ID in the commit log for reference; this is a good
approach, but still a little troublesome.

Thus, the purpose of ``trslt.py`` is to add a new annotation tag to the file
to indicate the corresponding version of the source file::

.. translation_origin_commit: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

The script will automatically copy the file and generate the tag when creating
a new translation, and give update suggestions based on those tags when
updating translations.

For more details, please read the doc in [Patch 2/2].

Still needs work:
- improve verbose mode
- test on more Python 3.x versions
- only Linux is supported for now; Mac OS still needs testing, and Windows is
  not supported due to '\'

Any suggestion is welcome!

Thanks!

Wu XiangCheng (2):
scripts: Add new translation tool trslt.py
docs: doc-guide: Add document for scripts/trslt.py

 Documentation/doc-guide/index.rst |   1 +
 Documentation/doc-guide/trslt.rst | 233 ++++++++++++++++++++++++++
 scripts/trslt.py                  | 267 ++++++++++++++++++++++++++++++
 3 files changed, 501 insertions(+)
 create mode 100644 Documentation/doc-guide/trslt.rst
 create mode 100755 scripts/trslt.py

--
2.20.1