Re: Internationalizing Linux

Bruce Korb (korb@datadesign.com)
Tue, 08 Dec 1998 08:09:20 -0800


Riley Williams wrote:
> Although English is my native tongue, and the only one I'm anywhere
> near fluent in, I'd like to see this as well. As well as the bonus of
> users seeing messages in their own language, there is the bonus that
> the kernel will probably shrink due to all the messages moving out of
> it, if it's done correctly.

I actually consider that the primary benefit.

> Personally, I'd see this as the ideal job for a userland daemon, but
> there is one slight problem with that: How to handle the messages
> before the said daemon starts. Any suggestions, Alan?

His response was interesting. Looks like it would work.
My thought was that a temporary event log would be made available
even before kmem_alloc so the events could be emitted almost
on the first line of boot code. Eventually (actually, timewise, not
very much later), the console display daemon starts up and
formats the backlog of messages. There may (ok, would) be some
messages that would need to bypass this. (Like, "insert system floppy",
for example.) And panics, of course.
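
A minimal sketch of the early event log idea, assuming a statically sized ring buffer so that nothing has to be allocated before the allocator is up. Every name here (early_log_emit, early_log_drain, the slot and argument counts) is hypothetical and not an existing kernel interface. The bypass path for urgent messages and panics is not modelled.

```c
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical, for illustration only. */
#define EARLY_LOG_SLOTS 64  /* fixed capacity, so no allocator is needed */
#define EARLY_LOG_ARGS  4   /* raw arguments kept per event */

struct early_event {
    uint16_t group;                 /* subsystem (group) number */
    uint16_t event;                 /* event number within that group */
    long     args[EARLY_LOG_ARGS];  /* unformatted arguments */
};

static struct early_event early_log[EARLY_LOG_SLOTS];
static size_t early_head;     /* next slot to write */
static size_t early_count;    /* number of valid entries */
static size_t early_dropped;  /* entries overwritten before being drained */

/* Record an event; usable from the earliest boot code since storage is static. */
void early_log_emit(uint16_t group, uint16_t event,
                    const long *args, size_t nargs)
{
    struct early_event *e = &early_log[early_head];
    size_t i;

    e->group = group;
    e->event = event;
    for (i = 0; i < EARLY_LOG_ARGS; i++)
        e->args[i] = (i < nargs) ? args[i] : 0;

    early_head = (early_head + 1) % EARLY_LOG_SLOTS;
    if (early_count < EARLY_LOG_SLOTS)
        early_count++;
    else
        early_dropped++;  /* buffer full: the oldest entry was overwritten */
}

/* Called once the display daemon is running: copy out the backlog,
 * oldest first, and return how many events were copied. */
size_t early_log_drain(struct early_event *out, size_t max)
{
    size_t n = 0;
    size_t tail = (early_head + EARLY_LOG_SLOTS - early_count) % EARLY_LOG_SLOTS;

    while (early_count > 0 && n < max) {
        out[n++] = early_log[tail];
        tail = (tail + 1) % EARLY_LOG_SLOTS;
        early_count--;
    }
    return n;
}

/* Number of events lost to overwriting. */
size_t early_log_dropped(void)
{
    return early_dropped;
}
```

The buffer stores only numbers and raw arguments. Formatting into a human language is left entirely to whatever drains it.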

> However, if this is to be of any use, it would need to be distributed
> with the kernel and supported by all of the subsystems...

Yes. It needs to be as painless as it can possibly be made, too.
That is why I am promoting the use of embedded comments.
Nobody has to go wandering off to another file to set up
the macro. Everything a developer need do can be done
right there in the code.

> > To my mind the constant numbering and also correct handling of
> > positional data are the killer issues.
>
> Neither should be a problem if done correctly.

Another benefit of automated comment extraction: everything
is forced to be consistent and the numbering can be coerced into
being unchanging. Each event would have a unique number
within its group of events. So, an event ID would really consist of
a (group number, event number) pair.
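
One way the group/event pair could look once the extraction tool has emitted its #defines. This is only a sketch: the 16/16 bit split, the EVT_ID helper macros and the SCSI and NET example names are assumptions made for illustration, not output of the actual tool.

```c
#include <stdint.h>

/* Hypothetical layout: group number in the high 16 bits,
 * event number in the low 16 bits. */
#define EVT_ID(group, event) (((uint32_t)(group) << 16) | (uint32_t)(event))
#define EVT_GROUP(id)        ((uint16_t)((id) >> 16))
#define EVT_EVENT(id)        ((uint16_t)((id) & 0xFFFFu))

/* What a generated header might contain. The numbers are assigned by the
 * tool and never change; developers only ever write the names. */
enum { GRP_SCSI = 1, GRP_NET = 2 };

#define SCSI_EVT_TIMEOUT   EVT_ID(GRP_SCSI, 1)
#define SCSI_EVT_RESET     EVT_ID(GRP_SCSI, 2)
#define NET_EVT_LINK_DOWN  EVT_ID(GRP_NET, 1)
```

Because the event number only has to be unique within its group, two subsystems can both use event number 1 without colliding.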

> Here's how I would see it implemented in pseudo-C:

[[knlmsgd daemon code omitted]]

> As regards the language definition files, I see those as standard text
> files, with the relevant code as the first 'word' on each line, and
> the associated text following it, separated by whitespace.

The technique I am proposing is almost that. In a file named
something like "subsystem-event.def", you might have:

altfmt[SUBSYS_EVT_EVENT_NAME] = "alternate formatting string";

and the extraction tool makes the appropriate substitution
when it encounters the "event_name" event for the "subsys" subsystem.
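
The effect of that substitution can be modelled in C as a lookup that prefers the alternate string when one is defined. This is a sketch only: the real substitution would happen when the tables are generated, and the default_fmt table, the event_fmt function and the second event name are hypothetical.

```c
#include <stddef.h>

/* Hypothetical event numbering for one subsystem. */
enum { SUBSYS_EVT_EVENT_NAME, SUBSYS_EVT_OTHER_EVENT, SUBSYS_EVT_COUNT };

/* Format strings as extracted from the comments in the source code. */
static const char *const default_fmt[SUBSYS_EVT_COUNT] = {
    [SUBSYS_EVT_EVENT_NAME]  = "default formatting string",
    [SUBSYS_EVT_OTHER_EVENT] = "another default string",
};

/* Overrides from the definition file; unlisted entries stay NULL. */
static const char *const altfmt[SUBSYS_EVT_COUNT] = {
    [SUBSYS_EVT_EVENT_NAME] = "alternate formatting string",
};

/* Return the alternate string if one exists, else the default.
 * Returns NULL for an event number outside the table. */
const char *event_fmt(int event)
{
    if (event < 0 || event >= SUBSYS_EVT_COUNT)
        return NULL;
    return altfmt[event] ? altfmt[event] : default_fmt[event];
}
```

A translation for another language would then be one more override table, keyed by the same event names.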

> The codes could be just about anything, providing they were retained
> and used in every definition file. Probably a useful format would be
> to have them begin with a mnemonic specifying the kernel subsystem
> they relate to, and follow this by an error number within that
> subsystem.

My plan is to use group (subsystem) names and event names.
Avoid numbers. Too hard to remember. Let the computer remember
and translate them.

> Given the above specification, it would be possible to write a tool
> that took as parameters two language specifications, and as input on
> stdin one or more messages in the first language, and produce on
> stdout a translation of those messages in the second language, and it
> shouldnae care what the languages in question are...

I already have the tool that does all this work.
All that is needed are the appropriate templates
to be driven by these comment strings. The #define
macros are emitted for the driver code, the
format string tables are emitted for the console display
daemon and, if desired, throttling tables can be
generated as well. This would allow for dynamic control
of event emission at run time.
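
A rough idea of what a generated throttling table might look like, assuming a per-event enable flag plus a simple "one in every N" rate limit. The struct layout, the function names and the event names are all hypothetical; the actual templates could generate something quite different.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event numbering for one subsystem. */
enum { SUBSYS_EVT_EVENT_NAME, SUBSYS_EVT_OTHER_EVENT, SUBSYS_EVT_COUNT };

struct evt_throttle {
    bool     enabled;    /* can be flipped at run time */
    uint32_t every_nth;  /* emit one out of every_nth occurrences; 1 = all */
    uint32_t seen;       /* occurrences counted while enabled */
};

/* One entry per event, as a generator might emit it. */
static struct evt_throttle throttle[SUBSYS_EVT_COUNT] = {
    [SUBSYS_EVT_EVENT_NAME]  = { true, 1, 0 },
    [SUBSYS_EVT_OTHER_EVENT] = { true, 3, 0 },
};

/* Decide whether this occurrence of the event should be emitted. */
bool evt_should_emit(int event)
{
    struct evt_throttle *t;

    if (event < 0 || event >= SUBSYS_EVT_COUNT)
        return false;
    t = &throttle[event];
    if (!t->enabled || t->every_nth == 0)
        return false;
    return (t->seen++ % t->every_nth) == 0;
}

/* Run-time control: switch a single event on or off. */
void evt_set_enabled(int event, bool on)
{
    if (event >= 0 && event < SUBSYS_EVT_COUNT)
        throttle[event].enabled = on;
}
```

The driver code would test evt_should_emit() before logging, so a noisy event can be silenced without rebuilding anything.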

> Comments, anybody?

Having global strings in the kernel for yes/no/on/off/etc.
was a cute idea. I liked it.
