Re: 2.4.19pre1aa1

From: Daniel Phillips (phillips@bonn-fries.net)
Date: Tue Mar 05 2002 - 19:09:20 EST


On March 5, 2002 01:41 pm, Rik van Riel wrote:
> On Tue, 5 Mar 2002 arjan@fenrus.demon.nl wrote:
> > In article <20020305005215.U20606@dualathlon.random> you wrote:
> >
> > > I don't see how per-zone lru lists are related to the kswapd deadlock.
> > > as soon as the ZONE_DMA will be filled with filedescriptors or with
> > > pagetables (or whatever non pageable/shrinkable kernel datastructure you
> > > prefer) kswapd will go mad without classzone, period.
> >
> > So does it with class zone on a scsi system....
>
> Furthermore, there is another problem which is present in
> both 2.4 vanilla, -aa and -rmap.
>
> Suppose that (1) we are low on memory in ZONE_NORMAL and
> (2) we have enough free memory in ZONE_HIGHMEM and (3) the
> memory in ZONE_NORMAL is for a large part taken by buffer
> heads belonging to pages in ZONE_HIGHMEM.
>
> In that case, none of the VMs will bother freeing the buffer
> heads associated with the highmem pages and kswapd will have
> to work hard trying to free something else in ZONE_NORMAL.
>
> Now before you say this is a strange theoretical situation,
> I've seen it here when using highmem emulation. Low memory
> was limited to 30 MB (16 MB ZONE_DMA, 14 MB ZONE_NORMAL)
> and the rest of the machine was HIGHMEM. Buffer heads were
> taking up 8 MB of low memory, dcache and inode cache were a
> good second with 2 MB and 5 MB respectively.
>
>
> How to efficiently fix this case ? I wouldn't know right now...
> However, I guess we might want to come up with a fix because it's
> a quite embarrassing scenario ;)

There's the short term fix - hack the vm - and the long term fix:
get rid of buffers. A buffer does three jobs at the moment:

  1) cache the physical block number
  2) io handle for a file block
  3) data handle for a file block, including locking
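
To make the three jobs concrete, here is a rough sketch of one object
carrying all of them at once. It is an illustration only, not the
kernel's struct buffer_head: every name in it (toy_buffer, toy_map,
toy_submit_read, toy_end_io) is hypothetical, and the "io" is just
flag bookkeeping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a buffer. NOT the kernel's
 * struct buffer_head; it only shows three jobs sharing one object. */
struct toy_buffer {
    /* Job 1: cache the physical block number */
    unsigned long blocknr;
    bool mapped;

    /* Job 2: io handle for a file block */
    bool io_in_flight;

    /* Job 3: data handle for a file block, including locking */
    char *data;
    size_t size;
    bool locked;
    bool uptodate;
};

/* Job 1: remember where the block lives on disk. */
void toy_map(struct toy_buffer *b, unsigned long blocknr)
{
    assert(b != NULL);
    b->blocknr = blocknr;
    b->mapped = true;
}

/* Job 2 (leaning on job 3's lock): start a read.
 * Returns false if the block is unmapped or the buffer is locked. */
bool toy_submit_read(struct toy_buffer *b)
{
    assert(b != NULL);
    if (!b->mapped || b->locked)
        return false;
    b->locked = true;
    b->io_in_flight = true;
    return true;
}

/* Completion: the data is now valid, so drop the lock. */
void toy_end_io(struct toy_buffer *b)
{
    assert(b != NULL && b->io_in_flight);
    b->io_in_flight = false;
    b->uptodate = true;
    b->locked = false;
}
```

A caller would toy_map() the buffer, toy_submit_read() it, and then
toy_end_io() it on completion. The point of the sketch is only that
the three groups of fields are separable in principle.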

The physical block number could be moved either into the struct
page - which is undesirable since it wastes space for pages that
don't have physical blocks - or my preferred solution, move it into
the page cache radix tree.
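
A minimal sketch of the second option, assuming a toy two-level
radix tree keyed by page index. It is not the kernel's page cache
code; all names (toy_root, toy_slot, toy_insert, toy_block_of,
TOY_NO_BLOCK) are made up for illustration. Each slot holds the page
pointer and the cached physical block side by side, so slots (and
the block field) exist only for indexes that were actually inserted.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_BITS     6                    /* index bits consumed per level */
#define TOY_FANOUT   (1UL << TOY_BITS)    /* 64 entries per node */
#define TOY_MASK     (TOY_FANOUT - 1)
#define TOY_NO_BLOCK (~0UL)               /* "no physical block cached" */

/* One slot: the cached page plus its physical block number. */
struct toy_slot {
    void *page;
    unsigned long blocknr;
};

struct toy_leaf {
    struct toy_slot slots[TOY_FANOUT];
};

/* Two levels only, so indexes 0..4095. Leaves are allocated lazily. */
struct toy_root {
    struct toy_leaf *leaves[TOY_FANOUT];
};

/* Find the slot for index; allocate its leaf when create is nonzero. */
struct toy_slot *toy_lookup(struct toy_root *root, unsigned long index,
                            int create)
{
    struct toy_leaf **leafp;
    unsigned long i;

    assert(root != NULL && index < TOY_FANOUT * TOY_FANOUT);
    leafp = &root->leaves[index >> TOY_BITS];
    if (*leafp == NULL) {
        if (!create)
            return NULL;
        *leafp = malloc(sizeof **leafp);
        if (*leafp == NULL)
            return NULL;
        for (i = 0; i < TOY_FANOUT; i++) {
            (*leafp)->slots[i].page = NULL;
            (*leafp)->slots[i].blocknr = TOY_NO_BLOCK;
        }
    }
    return &(*leafp)->slots[index & TOY_MASK];
}

/* Cache a page together with its block number. 0 on success, -1 on OOM. */
int toy_insert(struct toy_root *root, unsigned long index, void *page,
               unsigned long blocknr)
{
    struct toy_slot *slot = toy_lookup(root, index, 1);

    if (slot == NULL)
        return -1;
    slot->page = page;
    slot->blocknr = blocknr;
    return 0;
}

/* Return the cached block for index, or TOY_NO_BLOCK if none is known. */
unsigned long toy_block_of(struct toy_root *root, unsigned long index)
{
    struct toy_slot *slot = toy_lookup(root, index, 0);

    return slot ? slot->blocknr : TOY_NO_BLOCK;
}
```

The lookup that finds the page also finds the block number, with no
detour through a separate buffer object. A page with no physical
block can be inserted with TOY_NO_BLOCK.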

For (2) we have a whole flock of solutions on the way. I guess
bio does the job quite nicely as Andrew Morton demonstrated last
week.

For (3), my idea is to generalize the size of the object referred
to by struct page so that it can match the filesystem block size.
This is still in the research stage, and there are a few issues I'm
looking at, but the more I look the more practical it seems. How
nice it would be to get rid of the page->buffers->page tangle, for
one thing.

--
Daniel


