Re: Avoiding *mandatory* overcommit...

From: Steve Thompson (stevet@sendon.net)
Date: Mon Apr 03 2000 - 22:39:50 EST


Quoting Linda Walsh (law@sgi.com):
> Marco Colombo wrote:
> Just because a bug is documented still doesn't mean it is not a bug. Oh yeah,
> it's not a bug...it's a *feature*. A proper feature operates under the principle
> of "least surprise". Most users will expect malloc to allocate a real, usable
> pointer. If the pointer is not usable, they will consider it broken. The fact
> that some arcane design describes how this really isn't a bug is irrelevant.
> The user did not *ask* for such behavior. Most likely they want the behavior of
> getting back a *usable* pointer. Please -- don't try to argue that users really
> don't care if malloc returns a bogus pointer. It won't fly.

I don't want to come down on a particular side of this debate as I do not
consider myself knowledgeable enough to know exactly what the correct solution
is. However, I am writing a program which I expect to be able to use most of
the address space available to my process if necessary. Considering the OOM
killer's behavior, I _can_ write a reliable program but only at the expense
of the rest of the system.

As far as I can see, I need to do something like the following to ensure that
I get what I ask for:

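#include <errno.h>
#include <setjmp.h>
#include <signal.h>
#include <sys/mman.h>

#ifdef OOM_KILLER
static jmp_buf avl_jmp_buf;

/* SIGBUS handler: jump back into avl_vm_create_area() to report failure */
static void avl_oom_handler(int sig)
{
    longjmp(avl_jmp_buf, sig);
}
#endif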

struct avl_vm *avl_vm_create_area(size_t hint)
{
#ifdef OOM_KILLER
#ifdef __linux__
    struct sigaction new, old;
    struct avl_node *p;    /* walks the nodes while pre-touching the area */
#endif
#endif
    struct avl_vm *a=NULL;
    size_t len=hint * sizeof(struct avl_node);

    if(!(a=sys_malloc(sizeof(struct avl_vm))))
        return(NULL);

    a->vm=a->fl=NULL;

    if(!len)
        len=AVL_VM_UNIT;

    a->vm=mmap(NULL, len, (PROT_READ|PROT_WRITE), (MAP_PRIVATE|MAP_ANONYMOUS), -1, 0);
    if(a->vm == MAP_FAILED) {
        a->vm=NULL;
        goto avl_vm_create_area_abt;
    }

    a->len=len;

    if(!(a->fl=sys_malloc(sizeof(struct avl_vm_free_list))))
        goto avl_vm_create_area_abt;

    a->fl->start=a->vm;
    a->fl->end=((char *)a->vm + (len - sizeof(struct avl_node)));
    a->fl->next=NULL;

#ifdef OOM_KILLER
#ifdef __linux__
    /*
        Survival of the fittest
    */
    sigaction(SIGBUS, NULL, &old);

    if(setjmp(avl_jmp_buf)) {
        sigaction(SIGBUS, &old, NULL); /* Just in case */
        errno=ENOMEM;
        goto avl_vm_create_area_abt;
    }

    new.sa_handler=avl_oom_handler;
    sigemptyset(&new.sa_mask);
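    /* longjmp() may not restore the signal mask, so do not let SIGBUS be
       blocked while the handler runs */
    new.sa_flags=SA_NODEFER;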
    sigaction(SIGBUS, &new, NULL);

    /* Write to the nodes, last to first, so the pages are faulted in now */
    p=a->fl->end;
    while(p > (struct avl_node *)a->vm)
        (p--)->right=0;

    sigaction(SIGBUS, &old, NULL);
#endif
#endif

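    return(a);

avl_vm_create_area_abt:
    /* Undo whatever was set up before the failure. sys_free() is assumed
       here to be the counterpart of sys_malloc(); it is not shown above. */
    if(a->vm && a->vm != MAP_FAILED)
        munmap(a->vm, len);
    if(a->fl)
        sys_free(a->fl);
    sys_free(a);
    return(NULL);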
}

This is, incidentally, untested (uncompiled, really), but I expect it to work
as designed. I know that, where overcommit is enabled, this scheme will stop
my process from being killed if it is my own process that causes a severe OOM
condition. However, if another process provides the "straw" which causes the
OOM killer to start reaping, my scheme will not work.

As things stand, I cannot think of a way to make my possibly large process
more robust in this situation. While I could use mmap() on a file to get gobs
of VM, this still doesn't address the issue that (and correct me if I am
wrong) I cannot get more than 2GB of VM for a given process.
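
The file-backed route mentioned above might look roughly like the sketch
below. file_backed_area is a hypothetical helper, not part of any existing
code. ftruncate() only sets the file length, so disk blocks may still be
allocated lazily and a write into the mapping can fail later if the
filesystem fills up. Nothing here changes the per-process address-space
limit.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Map len bytes backed by the file at path, creating or resizing it.
   Returns the mapping, or NULL on failure. */
void *file_backed_area(const char *path, size_t len)
{
    void *mem;
    int fd = open(path, O_RDWR | O_CREAT, 0600);

    if (fd < 0)
        return NULL;

    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return NULL;
    }

    /* MAP_SHARED: dirty pages are written back to the file, not to swap */
    mem = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                /* the mapping stays valid after close() */

    return mem == MAP_FAILED ? NULL : mem;
}
```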

I find this somewhat odd considering that I know that I can configure and use
8GB of swap among different processes. As an aside, 8GB seems an odd figure
for a 32-bit architecture, but I suppose there is a logical reason that the
kernel's overall effective system address space appears to be 33 or 34 bits.
To really fantasize, I wonder if there isn't some sort of MMU magic which
could be used to make a process's usable address space 4GB (since the
system-wide VM pointers don't necessarily need to use the 12 least-significant
bits to track 4096-byte pages). But I realize that this has nothing to do with
the problem of VM allocation to user processes.
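
The arithmetic behind those figures can be checked with a few lines of C.
The helper names bits_for and page_count are made up for this sketch, and
sizes are assumed to be powers of two.

```c
#include <assert.h>
#include <stdint.h>

/* Number of address bits needed to cover n bytes; n must be a power of two. */
unsigned bits_for(uint64_t n)
{
    unsigned bits = 0;

    assert(n != 0 && (n & (n - 1)) == 0);
    while (n > 1) {
        n >>= 1;
        bits++;
    }
    return bits;
}

/* Number of pages in an address space of addr_bits bits (addr_bits < 64). */
uint64_t page_count(unsigned addr_bits, uint64_t page_size)
{
    return (UINT64_C(1) << addr_bits) / page_size;
}
```

With these, bits_for(8GB) is 33, a 4096-byte page accounts for 12 offset
bits, and a 32-bit address space holds 2^20 such pages, so a page number
fits in 20 bits.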

So, to get back to the issues, does the kernel need to be fixed wrt VM
allocation (or the behaviour of the OOM killer) or is there something which I
can do as a user to make my memory allocations more reliable?

Regards,

Steve

-- 
Does anyone know of a good recipe for bean dip?
