Re: process creation time increases linearly with shmem

From: Nick Piggin
Date: Thu Aug 25 2005 - 09:29:30 EST


Ray Fucillo wrote:
> Nick Piggin wrote:
>
>> fork() can be changed so as not to set up page tables for
>> MAP_SHARED mappings. I think that has other tradeoffs like
>> initially causing several unavoidable faults reading
>> libraries and program text.
>>
>> What kind of application are you using?
>
> The application is a database system called Caché. We allocate a
> large shared memory segment for database cache, which in a large
> production environment may realistically be 1+GB on 32-bit platforms
> and much larger on 64-bit. At these sizes fork() is taking hundreds
> of milliseconds, which can become a noticeable bottleneck for us.
> This performance characteristic seems to be unique to Linux vs other
> Unix implementations.



As Andi said, hugepages might be a very nice feature for you guys
to look into. Beyond your immediate fork problem, they could also
give a performance increase through reduced TLB pressure.

Anyway, the attached patch is something you could try testing. If
you do so, then I would be very interested to see performance results.

Thanks,
Nick

--
SUSE Labs, Novell Inc.

Index: linux-2.6/kernel/fork.c
===================================================================
--- linux-2.6.orig/kernel/fork.c 2005-08-04 15:24:36.000000000 +1000
+++ linux-2.6/kernel/fork.c 2005-08-26 00:20:50.000000000 +1000
@@ -256,7 +256,6 @@ static inline int dup_mmap(struct mm_str
 		 * Note that, exceptionally, here the vma is inserted
 		 * without holding mm->mmap_sem.
 		 */
-		spin_lock(&mm->page_table_lock);
 		*pprev = tmp;
 		pprev = &tmp->vm_next;

@@ -265,8 +264,11 @@ static inline int dup_mmap(struct mm_str
 		rb_parent = &tmp->vm_rb;

 		mm->map_count++;
-		retval = copy_page_range(mm, current->mm, tmp);
-		spin_unlock(&mm->page_table_lock);
+		if (!(file && (tmp->vm_flags & VM_SHARED))) {
+			spin_lock(&mm->page_table_lock);
+			retval = copy_page_range(mm, current->mm, tmp);
+			spin_unlock(&mm->page_table_lock);
+		}

 		if (tmp->vm_ops && tmp->vm_ops->open)
 			tmp->vm_ops->open(tmp);