Re: Kernel-Messages translation

Kurt Huwig (kurt@huwig.de)
Mon, 16 Jun 1997 00:44:14 +0200 (CEST)


> about: Kernel messages translation project.
>
> --- Wanted: Someone to maintain the code and translation tables.
> --- Offered: The code to "DO IT".
> Wanted: Maintainer.

Did I miss something before my first posting? Was there already a
discussion about this subject going on?

Nevertheless, my suggestions:

There are 3 kinds of kernel-messages:

1. printk( "Foo bar string" );
- most are like this
- simple string replacement

2. printk( "%d\n", value );
- few are like this
- no translation needed

3. printk( "Foo %s bar string", foo_bar_string );
- fewer are like this
- translation may be difficult

If you have any solutions, please show them using these 3 examples, so
that my stupid mind understands them. Also, remember the case where the
English version is the same for different values and the foreign version
differs, or the other way round. Example:

"The [floppy|screen|modem] is defective"

In English, all versions are equal. In German, it's:

"[Die|Der|Das] [Floppy|Bildschirm|Modem] ist defekt"

I guess you cannot handle this with a simple table lookup. You need code
for each of these (certainly rare) cases. I will refer to this problem as
the 'plural-s'.
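
To illustrate the point: a table keyed on the format string alone cannot
pick the article, but per-message code that looks at the inserted noun can.
The following is only a user-space sketch; the struct, the table and the
name format_defective() are made up for this example and are not part of
my patch.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical noun table: each English noun maps to a German noun
 * plus the article that its grammatical gender requires. */
struct noun {
    const char *english;
    const char *article;   /* "Die", "Der" or "Das" */
    const char *german;
};

static const struct noun nouns[] = {
    { "floppy", "Die", "Floppy"     },
    { "screen", "Der", "Bildschirm" },
    { "modem",  "Das", "Modem"      },
};

/* Writes the German sentence for "The <device> is defective" to out.
 * Falls back to the English sentence when the noun is unknown.
 * Returns the number of characters snprintf() would have written. */
int format_defective(char *out, size_t size, const char *device)
{
    size_t i;

    for (i = 0; i < sizeof(nouns) / sizeof(nouns[0]); i++) {
        if (strcmp(nouns[i].english, device) == 0)
            return snprintf(out, size, "%s %s ist defekt",
                            nouns[i].article, nouns[i].german);
    }
    return snprintf(out, size, "The %s is defective", device);
}
```

The extra code is small, but it has to be written by hand for every
message of this kind.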

I remember 3 suggestions for how to translate the messages:

1. (my posted solution) A hook in 'printk()' allows the replacement of the
given string. Replacement is done at run-time via magic.

Advantage:    - no change to printk()-statements needed
              - a module is possible, but not recommended (most errors
                happen at boot time)
              - just a small patch to the current kernel
              - plural-s possible
Disadvantage: - every message exists twice, so the kernel grows by some
                100kB

2. A small program that replaces the strings in the source code. It is
easy to find the 'printk()' statements and relatively easy to replace
only valid C strings and not comments.

Advantage:    - no change to printk()-statements needed
              - kernel is the same size as the English one
Disadvantage: - no further patches possible
              - no plural-s possible
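
A rough sketch of what such a replacement program could do for one source
line. Everything here is an assumption for illustration: the table, the
placeholder German string and the name translate_line(). It only handles
the first string literal after "printk(" on a single line and does not
look at comments or multi-line calls.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical replacement table: English literal (as written in the
 * source, escapes included) and its translation. */
struct entry {
    const char *english;
    const char *german;
};

static const struct entry table[] = {
    { "Foo bar string", "Foo-Bar-Zeichenkette" },   /* placeholder text */
};

/* Returns the translation of a literal of length len, or NULL. */
static const char *lookup(const char *lit, size_t len)
{
    size_t i;

    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (strlen(table[i].english) == len &&
            strncmp(table[i].english, lit, len) == 0)
            return table[i].german;
    }
    return NULL;
}

/* Copies one source line to out, replacing the first string literal
 * after "printk(" if the table knows it. Returns 1 when a replacement
 * was made, 0 when the line was copied unchanged. */
int translate_line(const char *line, char *out, size_t size)
{
    const char *call = strstr(line, "printk(");
    const char *start, *end, *german;

    if (call && (start = strchr(call, '"')) != NULL) {
        start++;                            /* first char of the literal */
        for (end = start; *end && *end != '"'; end++) {
            if (*end == '\\' && end[1])     /* skip escaped characters */
                end++;
        }
        german = (*end == '"') ? lookup(start, (size_t)(end - start)) : NULL;
        if (german) {
            /* prefix up to and including the opening quote,
             * translation, then the closing quote and the rest */
            snprintf(out, size, "%.*s%s%s",
                     (int)(start - line), line, german, end);
            return 1;
        }
    }
    snprintf(out, size, "%s", line);
    return 0;
}
```

Lines of kind 2 ("%d\n") simply have no table entry and pass through
unchanged.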

3. All 'printk()' statements are rewritten to take the strings from a
table. This could happen at compile-time via a macro or at run-time via
a function call. A macro could be written to tell the filename of the
sourcecode:

#define PRINTK( x ) translating_printk( __FILE__, x )

Each string needs an identifier and an entry in a table.

Advantage:    - language is selectable at compile-time (or maybe run-time)
              - kernel is only a little bigger than the English one
Disadvantage: - complete rewrite of all 'printk()'-statements (automatic)
              - programmers need to write 2 statements for each
                'printk()':
                1. define a value for the message
                2. insert the message in a table
              - no plural-s possible
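
For comparison, a user-space sketch of how the table approach might look.
The identifiers (MSG_FOO_BAR, lookup_message(), current_language) and the
German placeholder strings are invented for this example, and vprintf()
stands in for the real printk(). The macro is written with C99 variable
arguments so that it also covers messages with parameters.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical message identifiers: one per printk() string. */
enum msg_id { MSG_FOO_BAR, MSG_FOO_ARG, MSG_COUNT };

enum language { LANG_ENGLISH, LANG_GERMAN, LANG_COUNT };

/* One table per language, indexed by message identifier.
 * The German strings are placeholders for illustration. */
static const char *const messages[LANG_COUNT][MSG_COUNT] = {
    [LANG_ENGLISH] = {
        [MSG_FOO_BAR] = "Foo bar string\n",
        [MSG_FOO_ARG] = "Foo %s bar string\n",
    },
    [LANG_GERMAN] = {
        [MSG_FOO_BAR] = "Foo-Bar-Zeichenkette\n",
        [MSG_FOO_ARG] = "Foo-%s-Bar-Zeichenkette\n",
    },
};

static enum language current_language = LANG_GERMAN;

/* Returns the format string for id in the current language. */
const char *lookup_message(int id)
{
    return messages[current_language][id];
}

/* Stand-in for a table-driven printk(); file is unused here but could
 * select a per-file table, as the __FILE__ macro argument suggests. */
int translating_printk(const char *file, int id, ...)
{
    va_list args;
    int n;

    (void)file;
    va_start(args, id);
    n = vprintf(lookup_message(id), args);
    va_end(args);
    return n;
}

#define PRINTK(...) translating_printk(__FILE__, __VA_ARGS__)
```

A call then reads PRINTK(MSG_FOO_ARG, foo_bar_string); which shows the
cost: the message text is no longer visible at the place where it is
printed.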

IMHO, the 3rd solution is not suitable. I prefer (certainly) the first,
because I thought about the other 2 before I started coding :-).
A compromise may be a combination of 1 & 2: You have a database of strings
and code. Then you can generate

- C-code for solution 1
- patch-files or replacement tables for solution 2
- combinations of these, so replacement tables for the 'normal' strings
and C-code for the plural-s.

I guess some have not seen the change I've made to the current kernel.
Change to 'printk.c':

+static unsigned int (*translator_proc)(const char *, char *,
+                                       unsigned int) = 0;
+static char translator_buf[1024];

-       i = vsprintf(buf + 3, fmt, args); /* hopefully i < sizeof(buf)-4 */
+       if (translator_proc)
+               hashvalue = (*translator_proc)(fmt, translator_buf, 0);
+       else {
+               strcpy(translator_buf, fmt);
+               hashvalue = 0;
+       }
+       i = vsprintf(buf + 3, translator_buf, args);
+       if (hashvalue) { /* translator_proc is a valid pointer */
+               (*translator_proc)(buf + 3, translator_buf, hashvalue);
+               strcpy(buf + 3, translator_buf);
+               i = strlen(buf + 3);
+       }
+       /* hopefully i < sizeof(buf)-4 */

+/*
+ * The translator calls this routine to register the translator procedure
+ * with printk(). proc == NULL, disables the translator. The old translator
+ * is returned.
+ */
+unsigned int (*register_translator(unsigned int (*proc)
+               (const char *, char *, unsigned int)))
+       (const char *, char *, unsigned int)
+{
+       unsigned int (*old_proc)(const char *, char *, unsigned int) =
+               translator_proc;
+       translator_proc = proc;
+       return old_proc;
+}

plus a small change to export the name of the function, so it can be used
by modules. I wrote a small C program that extracts the
printk()-statements and generates code suitable for 'translator'. This is
an example of the output with my translation added (the hash-value is
generated automagically):

case 0xd41e5f7d: /* Console: %ld point font, %ld scans\n */
        trans = "Konsole: %ld Punkt Schrift, %ld Zeilen\n";
        break;
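
For readers who want to try the idea outside the kernel, here is a sketch
of a hash-keyed lookup. The hash below is djb2, used purely as a stand-in:
the hash my generator uses is not shown in this mail, so the values it
produces will not match the case labels above. The names hash_string(),
init_translations() and translate_format() are likewise made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in hash (djb2), NOT the generator's hash. */
static uint32_t hash_string(const char *s)
{
    uint32_t h = 5381;

    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

struct translation {
    const char *english;
    const char *german;
    uint32_t hash;          /* filled in by init_translations() */
};

static struct translation translations[] = {
    { "Console: %ld point font, %ld scans\n",
      "Konsole: %ld Punkt Schrift, %ld Zeilen\n", 0 },
};

#define N_TRANSLATIONS (sizeof(translations) / sizeof(translations[0]))

/* A generator would compute the hashes offline; here it runs once at
 * startup so the example stays self-contained. */
void init_translations(void)
{
    size_t i;

    for (i = 0; i < N_TRANSLATIONS; i++)
        translations[i].hash = hash_string(translations[i].english);
}

/* Returns the translated format string, or fmt itself if unknown.
 * Only hashes are compared, so a collision would pick a wrong string. */
const char *translate_format(const char *fmt)
{
    uint32_t h = hash_string(fmt);
    size_t i;

    for (i = 0; i < N_TRANSLATIONS; i++) {
        if (translations[i].hash == h)
            return translations[i].german;
    }
    return fmt;
}
```

Comparing only the hash means the English string does not have to be kept
in the translator, at the price of a (small) risk of collisions.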

Now an example of a message with refeeding:

case 0x2feca65a: /* Console: %s %s %ldx%ld, %d virtual console%s
                    (max %d)\n */
        if (!hashvalue) {
                trans = "Konsole: %s %s %ldx%ld, %d virtuelle "
                        "Konsole%s (max %d)\n";
                ret = sum;
        } else {
                const char *p = english;

                /* copy the first word, including the trailing space */
                do
                        *(translation++) = *p;
                while (*(p++) != ' ');
                *translation = '\0';
                if (*p == 'c') /* color */
                        strcat(translation, "farbiges");
                else
                        strcat(translation, "schwarz-weisses");
                /* skip the replaced word and append the rest unchanged */
                while (*(++p) != ' ')
                        ;
                strcat(translation, p);
        }
        break;

The first branch of the if-statement is taken on the first call, the
second on the call after the 'vsprintf()'.
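
The two-pass flow can be tried in user space with something like the
following sketch. The translator has the same signature as
translator_proc in the patch; everything else (demo_translator(),
demo_printk(), the MSG_DEFECTIVE value and the example message) is
hypothetical and only meant to show the order of the calls.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define MSG_DEFECTIVE 1u   /* hypothetical hash value for the demo message */

/* Pass 1 (hashvalue == 0): in is the format string; write the format to
 *   use into out and return non-zero to request a second pass.
 * Pass 2 (hashvalue != 0): in is the formatted text; write the final
 *   translation into out. */
static unsigned int demo_translator(const char *in, char *out,
                                    unsigned int hashvalue)
{
    char device[32];

    if (!hashvalue) {
        strcpy(out, in);                    /* keep English for now */
        return strcmp(in, "The %s is defective\n") == 0 ? MSG_DEFECTIVE : 0;
    }
    if (hashvalue == MSG_DEFECTIVE &&
        sscanf(in, "The %31s is defective", device) == 1) {
        if (strcmp(device, "floppy") == 0)
            strcpy(out, "Die Floppy ist defekt\n");
        else if (strcmp(device, "screen") == 0)
            strcpy(out, "Der Bildschirm ist defekt\n");
        else if (strcmp(device, "modem") == 0)
            strcpy(out, "Das Modem ist defekt\n");
        else
            strcpy(out, in);                /* unknown noun: keep English */
        return hashvalue;
    }
    strcpy(out, in);
    return hashvalue;
}

/* User-space imitation of the patched printk() flow; buf must hold at
 * least 1024 bytes, like the fixed buffer in the patch. */
int demo_printk(char *buf, const char *fmt, ...)
{
    char translator_buf[1024];
    unsigned int hashvalue;
    va_list args;

    hashvalue = demo_translator(fmt, translator_buf, 0);
    va_start(args, fmt);
    vsnprintf(buf, 1024, translator_buf, args);
    va_end(args);
    if (hashvalue) {                        /* refeed the formatted text */
        demo_translator(buf, translator_buf, hashvalue);
        strcpy(buf, translator_buf);
    }
    return (int)strlen(buf);
}
```

Messages that need no refeeding return 0 from the first pass and are
formatted only once.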

Ok, enough for today, please comment on this.

Kurt

------------------------------------------------------------
| yes, it runs | Designed for | Microsoft | intel |
| with Netware | Windows 95 | Windows compliant | inside |
------------------------------------------------------------
If you still have problems reading this signature,
get Linux and a REAL cpu!
My eMail address has changed --> kurt@huwig.de