Andi Kleen wrote:
> Hidetoshi Seto wrote:
>> Without disabling, what can we do on MCE with no bank?
> Nothing, but is it really worth adding a special case?

If the question were:
 - is it really worth supporting this special environment,
   "MCE-capable but no MCE banks"?
then I'd like to say no.
So I suggested disabling MCE in this uncertain environment.
Otherwise we will end up adding more code for special cases...

>> I found that do_machine_check() does nothing if banks==0 ... it is better
>> to let system to panic with "Machine check from unknown source"?
> IMHO yes. In this case the system must be very confused and panic is the
> best you can do. Otherwise it won't do anything interesting anyways.

Agreed, but this is also a special case.
Regardless of the real number of banks, a confused system could fail to
get the value from memory... Hmm, in theory the MCE handler must be
implemented carefully, but I bet the confused value will not always be 0,
... is it worth doing?

>>>> Hum, I suppose the line for CPU 0 was slightly different from others,
>>>> because SHD means "this bank is shared bank and controlled by other".
>>>> Maybe:
>>>>   CPU 0 MCA banks CMCI:0 CMCI:1 CMCI:2 CMCI:3 CMCI:5 ... CMCI:21
>>>> But I agree that we could some work for this messages...
>>>> Is it better to change the message level to debug from info?
>>> Can be made INFO yes, but I would prefer not removing them
>>> from the dmesg for now.
>>> Perhaps they could be also compressed a bit like SRAT.
>> Like SRAT? I could not catch the meaning ... For example?
> See the recent patches from David Rientjes in the same original thread.

I found it, thanks.
So I suppose your idea is like:
CPU 0 MCA banks CMCI:{0-3,5-9,12-21} POLL:{4,10,11}
CPU 1 MCA banks SHD:{0,1,6-9,12-21} CMCI:{2,3,5} POLL:{4,10,11}
right?
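
For illustration, here is a minimal user-space sketch in C of how such a compressed list could be built from a per-bank state array. It is not taken from the kernel or from any posted patch: the enum bank_state values and the helpers append() and format_bank_ranges() are hypothetical names. It assumes, following the two example lines above, that runs of three or more banks are written as "a-b" while runs of two are written as "a,b".

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical per-bank ownership states; the names are illustrative only. */
enum bank_state { BANK_CMCI, BANK_POLL, BANK_SHD };

/* Append formatted text at buf[pos]; never writes past buf[len - 1]. */
static size_t append(char *buf, size_t len, size_t pos, const char *fmt, ...)
{
    va_list ap;
    int n;

    if (pos + 1 >= len)
        return pos;                     /* no room left for more text */
    va_start(ap, fmt);
    n = vsnprintf(buf + pos, len - pos, fmt, ap);
    va_end(ap);
    if (n < 0)
        return pos;
    pos += (size_t)n;
    return pos < len ? pos : len - 1;   /* clamp if the output was truncated */
}

/*
 * Write "LABEL:{...}" for every bank whose state equals `want`.
 * Returns the number of characters written, or 0 if no bank matches
 * (buf is then an empty string).
 */
static size_t format_bank_ranges(char *buf, size_t len, const char *label,
                                 const enum bank_state *state, int nbanks,
                                 enum bank_state want)
{
    size_t pos = 0;
    int i = 0;

    if (len == 0)
        return 0;
    buf[0] = '\0';

    while (i < nbanks) {
        int start, end;

        if (state[i] != want) {
            i++;
            continue;
        }
        /* Extend the run while the next bank has the same state. */
        start = i;
        while (i + 1 < nbanks && state[i + 1] == want)
            i++;
        end = i++;

        if (pos == 0)
            pos = append(buf, len, pos, "%s:{", label);
        else
            pos = append(buf, len, pos, ",");

        if (end == start)
            pos = append(buf, len, pos, "%d", start);
        else if (end == start + 1)
            pos = append(buf, len, pos, "%d,%d", start, end);
        else
            pos = append(buf, len, pos, "%d-%d", start, end);
    }
    if (pos)
        pos = append(buf, len, pos, "}");
    return pos;
}
```

Calling format_bank_ranges() once per state and joining the non-empty results with spaces would give one line per CPU in the style shown above.
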
IMHO the format I suggested is easier to read, as long as the number of
banks is not too large.
CPU 0 MCA banks map : CCCC PCCC CCPP CCCC CCCC CC
CPU 1 MCA banks map : ssCC PCss ssPP ssss ssss ss
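
The map form could be produced along the following lines. This is only a user-space sketch in C, not kernel code: enum bank_state and format_bank_map() are hypothetical names, and the letter legend (C = CMCI, P = POLL, s = SHD) is inferred from comparing the two maps above with the range lists, not stated explicitly anywhere.

```c
#include <stddef.h>

/* Hypothetical per-bank ownership states; the names are illustrative only. */
enum bank_state { BANK_CMCI, BANK_POLL, BANK_SHD };

/*
 * Render one letter per bank, with a space after every fourth bank.
 * Assumed legend: 'C' = CMCI, 'P' = POLL, 's' = SHD (shared bank).
 * Returns the number of characters written, excluding the trailing NUL.
 */
static size_t format_bank_map(char *buf, size_t len,
                              const enum bank_state *state, int nbanks)
{
    size_t pos = 0;
    int i;

    if (len == 0)
        return 0;

    for (i = 0; i < nbanks; i++) {
        char c;

        /* Need room for an optional separator, one letter and the NUL. */
        if (pos + 3 > len)
            break;
        if (i > 0 && i % 4 == 0)
            buf[pos++] = ' ';

        switch (state[i]) {
        case BANK_CMCI: c = 'C'; break;
        case BANK_POLL: c = 'P'; break;
        case BANK_SHD:  c = 's'; break;
        default:        c = '?'; break;
        }
        buf[pos++] = c;
    }
    buf[pos] = '\0';
    return pos;
}
```

With 22 banks the result is 27 characters long, so the line width stays fixed no matter how the banks are distributed among the three states.
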
Thanks,
H.Seto