> Oh, yes, I just realized that mce_end() released other

> __memory_failure() handling calls some routines, such
> as is_free_buddy_page(), which needs to acquire the spin
> lock, zone->lock. How can we guarantee that other CPUs
> haven't acquired the lock when receiving #mc broadcast
> and entering #mc handlers ?

By the time I call __memory_failure() - the other cpus have
been released from mce handler - so they are back executing
normal code.

But Chen Gong's earlier comments made me look again at entry_64.S
code - and I realized that I missed seeing code in the return
path from do_machine_check() that switched from MCE stack to
regular kernel stack before processing TIF_MCE_NOTIFY.

> Why do you plan to switch out of machine check stack while

I may go back and re-visit a path that I looked at to change
do_machine_check from "void" return to "unsigned long" and have
it return the address for the "AR" case and "0" otherwise.
Then we could switch out of machine check stack to non-mce
context to call __memory_failure(). When I looked at this
before, the entry_64.S path looked plausible. The 32-bit
path looked to be painful (too many macros in entry_32.S).
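
To make the return-value idea concrete, here is a rough user-space sketch in C. It is not kernel code and not a patch: every mock_* name, the severity enum and the struct are made up for illustration, and the real stack switch would live in the entry_64.S assembly, which is only marked by a comment here.

```c
/* Hypothetical severity classes, for illustration only. */
enum mock_severity { MOCK_SEV_NONE, MOCK_SEV_AO, MOCK_SEV_AR };

/* Hypothetical summary of what the handler found. */
struct mock_mce {
    enum mock_severity severity;
    unsigned long addr;          /* address reported for the error */
};

static unsigned long mock_recovered_addr;  /* last address passed to recovery */
static int mock_recovery_calls;            /* how often recovery was invoked */

/* Stand-in for __memory_failure(); it only records that it was called. */
static void mock_memory_failure(unsigned long addr)
{
    mock_recovered_addr = addr;
    mock_recovery_calls++;
}

/*
 * Stand-in for do_machine_check() with an "unsigned long" return
 * instead of "void": the address for the "AR" case, 0 otherwise.
 */
static unsigned long mock_do_machine_check(const struct mock_mce *m)
{
    if (m->severity == MOCK_SEV_AR)
        return m->addr;
    return 0;
}

/* Stand-in for the entry path that calls the handler. */
static void mock_mce_entry(const struct mock_mce *m)
{
    unsigned long addr = mock_do_machine_check(m);

    /* The switch off the machine check stack would happen here. */

    if (addr)
        mock_memory_failure(addr);
}
```

Calling mock_mce_entry() with an "AR" record hands its address to mock_memory_failure(); any other record returns 0 and recovery is skipped. Note that the sketch inherits the assumption that 0 is never a valid address for the "AR" case.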
-Tony