[RFC] kmemcheck: TODO for stack tracking

From: Vegard Nossum
Date: Fri Nov 28 2008 - 06:17:08 EST


Hi,

Here's a plan for how to do stack tracking with kmemcheck. It is not
entirely trivial, but as far as I can see, it SHOULD be possible.
Please let me know if you can spot any fallacies or other problems.
I've probably missed something...

/*
* TODO for stack tracking in kmemcheck:
*
* 1. Make kernel run at CPL = 1
*
* This includes (I guess) changing the DPL of most segment and gate
* descriptors in the GDT and IDT, and probably the IOPL. Are there any CPU
* features which always require CPL = 0 to work? Paging requires no change,
* as the U/S flag only distinguishes CPL = 0, 1, 2 (supervisor) from CPL = 3
* (user).
*
* 2. Modify TSS to use separate stacks for CPL = 0 and CPL = 1
* 3. Install a Call Gate for Page Faults in the GDT with DPL = 0
* 4. Change IDT entry for #PF to point to Call Gate in GDT
*
* Now when a #PF occurs in kernel mode, the CPU will look up the IDT entry for
* #PF. It points to our Call Gate in the GDT, which has a different privilege
* level, so the CPU will look up the new stack to use in the TSS. SS, ESP, CS,
* and EIP are saved on the new stack, so page_fault() will have to take care
* of handling the extra SS/ESP parameters. Observe that the old stack has not
* been touched by the CPU at all (touching it would lead to a #DF, Double
* Fault, which is unrecoverable). Observe also that none of the interrupted
* task's registers have been modified. Now the CPU transfers control to
* page_fault(), which must save all registers, etc. as usual.
*
* do_page_fault() must NOT be allowed to enable interrupts, otherwise we
* could take interrupts that would use the new stack. If such an interrupt
* handler takes another page fault, the CPU will already be at CPL = 0 and no
* stack switch will occur!
*
* I think we need to make the kernel switch stacks on ALL interrupts. When
* the CPU is interrupted, it will attempt to push CS/EIP on the current
* stack. If the PTE of the current stack is non-present, a Page Fault will be
* generated (not a Double Fault!). However, we have no way to tell if the #PF
* was generated by an interrupt.
*
* 5. Implement support for PUSHA/POPA instruction handling in kmemcheck. No
* extra support will be needed for IRET, as interrupts must not be allowed
* to occur when the stack is located in a non-present page.
*
* Note that it is possible to track POPF/IRET instructions (even though they
* modify EFLAGS and the Trap Flag), because the CPU does the right thing and
* raises the Debug Exception based on the previous setting of TF.
*
* 6. The kernel stack tracer would need to be modified to understand stack
* changes/boundaries.
*/
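
To make step 1 a bit more concrete, here is a rough user-space sketch (not
kernel code) of the bookkeeping involved: rewriting the RPL of a selector,
rewriting the DPL of a segment descriptor, and the privilege part of the
paging check. The bit positions are the usual 32-bit x86 ones; the function
names are made up for illustration, and the paging check deliberately ignores
R/W, CR0.WP and everything else that is not the U/S flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Segment selector layout: index[15:3] | TI[2] | RPL[1:0]. */
uint16_t selector_with_rpl(uint16_t sel, unsigned rpl)
{
	assert(rpl <= 3);
	return (uint16_t)((sel & ~0x3u) | rpl);
}

/* In an 8-byte segment descriptor, the DPL field is bits 46:45. */
uint64_t descriptor_with_dpl(uint64_t desc, unsigned dpl)
{
	assert(dpl <= 3);
	return (desc & ~(3ULL << 45)) | ((uint64_t)dpl << 45);
}

unsigned descriptor_dpl(uint64_t desc)
{
	return (unsigned)((desc >> 45) & 3);
}

#define PTE_PRESENT 0x1u	/* P flag, bit 0 */
#define PTE_USER    0x4u	/* U/S flag, bit 2 */

/*
 * Privilege part of the paging check only: CPL 0, 1 and 2 are all
 * "supervisor" and may touch any present page; CPL 3 needs U/S set.
 */
bool page_privilege_ok(uint32_t pte, unsigned cpl)
{
	assert(cpl <= 3);
	if (!(pte & PTE_PRESENT))
		return false;
	if (cpl < 3)
		return true;
	return (pte & PTE_USER) != 0;
}
```

For example, a flat ring-0 code descriptor 0x00cf9a000000ffff becomes
0x00cfba000000ffff with DPL = 1, and a supervisor-only page stays accessible
at CPL = 1, which is why the page tables themselves need no change.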

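The stack-switch rule that steps 2-4 rely on can be modelled in a few lines
as well. Again this is only a sketch under my own assumptions: struct
tss_model, struct cpu_model and enter_handler() are hypothetical names, the
TSS is reduced to its three inner-ring stack pointers, and the frame the CPU
pushes is not modelled at all. It only shows when the CPU picks a fresh stack
and when it keeps going on the current one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Cut-down TSS: one stack pointer each for CPL 0, 1 and 2. */
struct tss_model {
	uint32_t esp[3];
};

struct cpu_model {
	unsigned cpl;
	uint32_t esp;
};

/*
 * Model entry into a handler that runs at target_cpl.
 * Returns true if the CPU switched to a stack taken from the TSS,
 * false if it stayed on the current stack.
 */
bool enter_handler(struct cpu_model *cpu, const struct tss_model *tss,
		   unsigned target_cpl)
{
	assert(cpu->cpl <= 3);
	/* A gate never transfers control to a less privileged level. */
	assert(target_cpl <= cpu->cpl);

	/* Same privilege level: no stack switch, current stack is used. */
	if (target_cpl == cpu->cpl)
		return false;

	/* More privileged level: load the stack for that level from the TSS. */
	cpu->esp = tss->esp[target_cpl];
	cpu->cpl = target_cpl;
	return true;
}
```

With the kernel at CPL = 1 and the #PF handler at CPL = 0, the first call
switches to esp[0] and leaves the faulting stack alone. A second call while
already at CPL = 0 returns false, which is exactly the nested-fault case that
makes enabling interrupts in do_page_fault() dangerous.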

Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036