Re: ioremap()/PCI sickness in 2.4.18-rc2 (FIXED ALMOST)

From: Jeff V. Merkey (jmerkey@vger.timpanogas.org)
Date: Wed Feb 20 2002 - 16:54:49 EST


The following information is submitted regarding this problem:

I have corrected the prefetch allocation problem by reducing
the prefetch address space size from 512 MB down to 32 MB. The
failing code is in get_vm_area(). This bug effectively reduces
the ability of Linux over other OS's to run scalable SCI based
applications with large numbers of nodes if there are large numbers
of nodes in a cluster and these nodes have more than 1 GB of memory
in each system.

My RAID/FS application does not need a huge prefetch address space so I
can get around this problem easily, but some of the SCI applications do,
and this problem will relegate Linux to a back seat status for supercomputer
applications that use this technology if Linux in unable to map larger
regions of memory. I would propose that the maintainer of
vmalloc.c look at using 48 bit PTE entries or some other solution
as a way to alloc larger virtual address frames when the system has
a lot of physical memory. It's seems pretty lame to me for a machine
with 2 GB of physical memory not to have at lest 256 MB of address space
left over for address mapping.

Offending code attached. Please advise as to what a proposed solution
could be if any is possible with this problem. I am happy to adjust
the Dolphin IRM driver behavior to accomodate Linux but I think some
larger clusters (i.e. > 1000 nodes) may not work with Linux is there's
not enough address space to map remotely across the cluster.

Jeff

struct vm_struct * get_vm_area(unsigned long size, unsigned long flags)
{
        unsigned long addr;
        struct vm_struct **p, *tmp, *area;

        area = (struct vm_struct *) kmalloc(sizeof(*area), GFP_KERNEL);
        if (!area)
                return NULL;
        size += PAGE_SIZE;
        addr = VMALLOC_START;
        write_lock(&vmlist_lock);
        for (p = &vmlist; (tmp = *p) ; p = &tmp->next) {
           
===============> we barf here since the size + addr wraps

                if ((size + addr) < addr)
                        goto out;

                if (size + addr <= (unsigned long) tmp->addr)
                        break;
                addr = tmp->size + (unsigned long) tmp->addr;
                if (addr > VMALLOC_END-size)
                        goto out;
        }
        area->flags = flags;
        area->addr = (void *)addr;
        area->size = size;
        area->next = *p;
        *p = area;
        write_unlock(&vmlist_lock);
        return area;

out:
        write_unlock(&vmlist_lock);
        kfree(area);
        return NULL;
}

On Wed, Feb 20, 2002 at 11:00:04AM -0700, Jeff V. Merkey wrote:
>
>
> David,
>
> Someone had a thought that perhaps the Serverworks chipset is mapping
> addresses above the 4GB boundry. Any thoughts on how to get around
> this problem?
>
> Jeff
>
>
> On Wed, Feb 20, 2002 at 09:30:34AM -0800, David S. Miller wrote:
> > From: Jeff Garzik <jgarzik@mandrakesoft.com>
> > Date: Wed, 20 Feb 2002 12:26:12 -0500
> >
> > type abuse aside, and alpha bugs aside, this looks ok... what is the
> > value of as->msize?
> >
> > Jeff and Jeff, the problem is one of two things:
> >
> > 1) when you have ~2GB of memory the vmalloc pool is very small
> > and this it the same place ioremap allocations come from
> >
> > 2) the BIOS or Linus is not assigning resources of the device
> > properly, or it simple can't because the available PCI MEM space
> > with this much memory is too small
> >
> > I note that one of the resources of the card is 16MB or so.
> > -
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at http://www.tux.org/lkml/
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Feb 23 2002 - 21:00:27 EST