Re: [PATCH] x86/mm/fault: Streamline the fault error_code decoder some more

From: Andy Lutomirski
Date: Thu Dec 06 2018 - 11:39:10 EST





> On Dec 6, 2018, at 7:53 AM, Sean Christopherson <sean.j.christopherson@xxxxxxxxx> wrote:
>
>> On Thu, Dec 06, 2018 at 08:34:09AM +0100, Ingo Molnar wrote:
>> I like your '!' idea, but with a further simplification: how about using
>> '-/+' differentiation and a single character and a fixed-length message.
>>
>> The new output will be lines of:
>>
>> #PF error code: -P -W -U -S -I -K (0x00)
>> ...
>> #PF error code: -P -W +U +S -I -K (0x0c)
>> ...
>> #PF error code: +P +W +U +S +I +K (0x3f)
>>
>> The symbol abbreviations are pretty self-explanatory:
>>
>> P = protection fault (X86_PF_PROT)
>> W = write access (X86_PF_WRITE)
>> U = user-mode access (X86_PF_USER)
>> S = supervisor mode (X86_PF_RSVD)
>> I = instruction fault (X86_PF_INSTR)
>> K = keys fault (X86_PF_PK)
>>
>> Misc notes:
>>
>> - In principle the new text is now short enough to include it in one of
>> the existing output lines, further shortening the oops output - but I
>> havent done that in this patch.
>>
>> - Another question is the ordering of the bits: the symbolic display is
>> actually big endian, while the numeric hexa printout is little endian.
>>
>> I kind of still like it that way, not just because the decoding loop is
>> more natural, but because the bits are actually ordered by importance:
>> the PROT bits is more important than the INSTR or the PK bits - and the
>> more important bits are displayed first.
>
> Hmm, my eyes tend to be drawn to the end of the line, e.g. having PROT
> be the last thing makes it stand out more than being buried in the middle
> of the line. Inverting the ordering between raw and decoded also makes
> it very difficult to correlate the raw value with each bit.
>
>> - Only build-tested the patch and looked at the generated assembly, but
>> it all looks sane enough so will obviously work just fine! ;-)
>
> ...
>
>> /*
>> - * This helper function transforms the #PF error_code bits into
>> - * "[PROT] [USER]" type of descriptive, almost human-readable error strings:
>> + * This maps the somewhat obscure error_code number to symbolic text:
>> + *
>> + * P = protection fault (X86_PF_PROT)
>> + * W = write access (X86_PF_WRITE)
>> + * U = user-mode access (X86_PF_USER)
>> + * S = supervisor mode (X86_PF_RSVD)
>> + * I = instruction fault (X86_PF_INSTR)
>> + * K = keys fault (X86_PF_PK)
>> */
>> -static void err_str_append(unsigned long error_code, char *buf, unsigned long mask, const char *txt)
>> +static const char error_code_chars[] = "PWUSIK";
>> +
>> +/*
>> + * This helper function transforms the #PF error_code bits into " +P -W +U -R -I -K"
>> + * type of descriptive, almost human-readable error strings:
>> + */
>> +static void show_error_code(struct pt_regs *regs, unsigned long error_code)
>
> No need for @regs.
>
>> {
>> - if (error_code & mask) {
>> - if (buf[0])
>> - strcat(buf, " ");
>> - strcat(buf, txt);
>> + unsigned int bit, mask;
>> + char err_txt[6*3+1]; /* Fixed length of 6 bits decoded plus zero at the end */
>
> Assuming the error code bits are contiguous breaks if/when SGX gets added,
> which uses bit 15. Oopsing on an SGX fault should be nigh impossible, but
> it might be nice to be able to reuse show_error_code in other places.
>
> Hardcoding "6" is also a bit painful.
>
>> +
>> + /* We go from the X86_PF_PROT bit to the X86_PF_PK bit: */
>> +
>> + for (bit = 0; bit < 6; bit++) {
>> + unsigned int offset = bit*3;
>> +
>> + err_txt[offset+0] = ' ';
>> +
>> + mask = 1 << bit;
>> + if (error_code & mask)
>> + err_txt[offset+1] = '+';
>> + else
>> + err_txt[offset+1] = '-';
>
> To me, using '!' contrasts better when side-by-side with '+'.
>
>> +
>> + err_txt[offset+2] = error_code_chars[bit];
>> }
>> +
>> + /* Close the string: */
>> + err_txt[sizeof(err_txt)-1] = 0;
>> +
>> + pr_alert("#PF error code: %s (%02lx)\n", err_txt, error_code);
>
> The changelog example has a leading "0x" on the error code. That being
> said, I actually like it without the "0x".
>
> How about printing the raw value before the colon? Having it at the end
> makes it look like extra noise. And for me, seeing the raw code first
> (reading left to right) cue's my brain that it's about to decode some
> bits.
>
> SGX will also break the two digit printing. And for whatever reason four
> digits makes me think "this is an error code!". IIRC the vectoring ucode
> makes a lot of assumptions about the error code being at most 16 bits, so
> in theory four digits is all we'll ever need.
>
> E.g.
>
> [ 0.144247] #PF error code: +P -W -U -S -I -K (01)
> [ 0.144411] #PF error code: +P +W -U -S -I -K (03)
> [ 0.144826] #PF error code: +P +W +U -S -I -K (07)
> [ 0.145252] #PF error code: +P -W +U -S -I +K (25)
> [ 0.145706] #PF error code: -P +W -U -S -I -K (02)
> [ 0.146111] #PF error code: -P -W +U -S -I -K (04)
> [ 0.146521] #PF error code: -P +W +U -S -I -K (06)
> [ 0.146934] #PF error code: -P -W +U -S +I -K (14)
> [ 0.147348] #PF error code: +P -W -U -S +I -K (11)
> [ 0.147767] #PF error code: -P -W -U -S -I -K (00)

>
> vs. (with SGX added as 'G' for testing purposes)
>
> [ 0.158849] #PF error code(0001): +P !W !U !S !I !K !G
> [ 0.159292] #PF error code(0003): +P +W !U !S !I !K !G
> [ 0.159742] #PF error code(0007): +P +W +U !S !I !K !G
> [ 0.160190] #PF error code(0025): +P !W +U !S !I +K !G
> [ 0.160638] #PF error code(0002): !P +W !U !S !I !K !G
> [ 0.161087] #PF error code(0004): !P !W +U !S !I !K !G
> [ 0.161538] #PF error code(0006): !P +W +U !S !I !K !G
> [ 0.161992] #PF error code(0014): !P !W +U !S +I !K !G
> [ 0.162450] #PF error code(0011): +P !W !U !S +I !K !G
> [ 0.162667] #PF error code(8001): +P !W !U !S !I !K +G
> [ 0.162667] #PF error code(8003): +P +W !U !S !I !K +G
> [ 0.162667] #PF error code(8007): +P +W +U !S !I !K +G
> [ 0.162667] #PF error code(8025): +P !W +U !S !I +K +G
> [ 0.162667] #PF error code(8002): !P +W !U !S !I !K +G
> [ 0.162667] #PF error code(8004): !P !W +U !S !I !K +G
> [ 0.162667] #PF error code(8006): !P +W +U !S !I !K +G
> [ 0.162667] #PF error code(8014): !P !W +U !S +I !K +G
> [ 0.162667] #PF error code(8011): +P !W !U !S +I !K +G
> [ 0.162667] #PF error code(0000): !P !W !U !S !I !K !G
>

Please donât. The whole reason I added the decoding was to make it easy to read without a cheat sheet. This is incomprehensible without reference to the code, and Iâm familiar with it to begin with.

How about:

#PF error code: 0001 [PROT read kernel]

#PF error code: 0001 [PROT WRITE kernel]

#PF error code: 0001 [PROT read kernel]

#PF error code: 8011 [PROT INSTR kernel SGX]

This has no noise from unset bits except that we add lowercase âreadâ or âkernelâ as appropriate. Even âkernelâ seems barely necessary.