Re: Kernel virtual memory?

David S. Miller (davem@jenolan.rutgers.edu)
Thu, 7 Aug 1997 23:32:29 -0400


Date: Fri, 8 Aug 1997 02:29:10 +0100 (BST)
From: Mark Hemment <markhe@nextd.demon.co.uk>

A major problem with threading the Memory Management code (not
just with moving pages) is the bits in a pte which can be changed
asynchronously by the h/w. For most archs, these attributes are
'referenced' and 'dirty'. Any kernel locking cannot avoid these
races [as they are changed without entering kernel-space].

Really? No locking can avoid these problems? Surely you jest.

I was able to fix both these problems in 15 minutes of work, and it
changed very little of the kernel and did not require cross-calls in
95% of the cases, even under high load. It worked like this in
pseudocode:

swap_out() {
        int allow_others = 0;
again:
        for_each_candidate_task(tsk) {
                int others_dirty;

                spin_lock(&scheduler_lock);
                /* Is this mm live on any cpu other than ours? */
                others_dirty = (tsk->mm->cpu_vm_mask !=
                                (1UL << smp_processor_id()));
                if(!allow_others && others_dirty)
                        goto next;
                /* swap_out_task() drops scheduler_lock itself */
                swap_out_task(tsk, others_dirty);
                continue;
        next:
                spin_unlock(&scheduler_lock);
        }
        /* Second pass only: now take tasks that need a cross-call */
        if(no_progress_made && !allow_others) {
                allow_others = 1;
                goto again;
        }
}

swap_out_task(tsk, others_dirty) {
        /* Stop the other cpus only if they can touch this mm */
        if(others_dirty)
                smp_capture();
        if(good_idea_to_swap(pte)) {
                flush_cache...();
                set_pte(...);
                flush_tlb...();
                /* Release only what we captured */
                if(others_dirty)
                        smp_release();
                spin_unlock(&scheduler_lock);
                free_up_the_page(); /* could sleep */
                return;
        }
        if(others_dirty)
                smp_release();
        spin_unlock(&scheduler_lock);
}

All straightforward, maybe 15 or 16 lines of changes to the generic
code. Don't rewrite what ain't broke.
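
For anyone who wants to poke at the mask test outside the kernel, here
is a minimal user-space sketch of just that part. The names mm_sketch,
others_dirty() and should_scan() are made up for illustration and are
not kernel interfaces; only the cpu_vm_mask comparison is taken from
the pseudocode above. Note that, as in the pseudocode, an empty mask
also counts as "others dirty".

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-mm cpu bitmask: bit n is set
 * while cpu n may hold translations for this address space. */
struct mm_sketch {
        unsigned long cpu_vm_mask;
};

/* True unless the mask is exactly "this cpu and nobody else", i.e.
 * some other cpu's h/w could still set referenced/dirty bits. */
static bool others_dirty(const struct mm_sketch *mm, int this_cpu)
{
        assert(this_cpu >= 0 &&
               (unsigned)this_cpu < sizeof(unsigned long) * CHAR_BIT);
        return mm->cpu_vm_mask != (1UL << this_cpu);
}

/* First pass (allow_others false) skips anything that would need a
 * cross-call; the second pass (allow_others true) takes everything. */
static bool should_scan(const struct mm_sketch *mm, int this_cpu,
                        bool allow_others)
{
        return allow_others || !others_dirty(mm, this_cpu);
}
```

With a mask of only bit 2 set, should_scan() on cpu 2 is true on the
first pass; add bit 0 to the mask and the task is skipped until the
second pass, which is the only time a capture of the other cpus is
needed.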

Later,
David "Sparc" Miller
davem@caip.rutgers.edu