Re: CONFIG_BIGMEM and rawio

Andrea Arcangeli (andrea@suse.de)
Thu, 2 Sep 1999 12:26:03 +0200 (CEST)


On Thu, 2 Sep 1999, Manfred wrote:

>Im reading the BIGMEM codes, and it seems that BIGMEM and rawio are
>incompatible.
>Is that correct?

The thing is not black and white. It can be compatible or incompatible, it
only depends about the internal details of the blockdevice driver and on
which memory you are going to do raw-io. It only depends on the source and
on the destination of the raw-IO. If both the source and the destination
are OK, then you can do raw-io fine also with BIGMEM enabled. The current
code is just completly stable, you can't harm the kernel trying to do
raw-io on the wrong place. If the source and the destination are not
compatible, then you'll get an -EIO from read(2) or write(2).

For example if you know for sure that the blockdevice where you want to
write in raw-io mode does DMA driven I/O, then you can just now do raw-io
from or to bigmemory _fine_ on _such_ blockdevice using 2.3.16. The only
thing you have to do to experiment this, is to comment out these lines in
buffer.c of 2.3.16:

if (map && PageBIGMEM(map)) {
err = -EIO;
goto error;
}

then you'll be able to do raw-io everywhere as you can do with bigmem
disabled.

WARNING: after removing the above lines you must make sure to not do
raw-io from or to bigmemory with blockdevices that are _not_ using DMA or
you'll get in troubles. So if you are unsure don't remove such lines,
_never_.

And you can _safely_ use raw-io in a _stock_ 2.3.16 with bigmem enabled
too; you won't risk to crash the kernel but you'll get a graceful -EIO in
the _worst_ case. You can get -EIO only trying to do raw-io on _shm_ (as
big-databases) or _anonymous_ userspace memory. If you want to do raw-io
from/to framebuffer memory, that will work just fine also with bigmem
enabled.

>Are there any plans to change that?

Right now the above check is obviously right and safe but it's strict.
Maybe it worth to add a per-blockdevice bitflag to allow raw-io on bigmem
pages with devices that uses DMA (such devices won't need any internal
change at all to work with bigmem pages, since with DMA only userspace and
the hardware device (and not the kernel) will touch the bigmem memory).
BTW, with such per-blockdevice bitflag we could do also swapin/swapout on
DMA devices more efficiently when possible (the swapout thing is really
not interesting though).

NOTE: the above is what IMO is _doable_ but that doesn't mean it worth to
do that. I would like to get some more comment about this raw-io/bigmem
issue.

I think you can find other informations about this issue reading my last
email I sent to Stephen and the all bigmem thread I started some week ago.
All emails have linux-kernel in CC.

>Btw, why do you use the full buddy algorithm with 10 lists for bigmem?

Yes.

>currently, only single pages are allocated, so 1 or 2 lists should be
>sufficient.

I think it would be a bad idea on the long run to use only 1 or 2 lists.
Right now one list would be ok, but it would be really dirty. Maybe in the
future we may want to use bigmem memory for other things and I don't want
to break the GFP_BIGMEM just to save some CPU cycle.

Andrea

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/