Mmap device performance

Daryll Strauss (daryll@d2.com)
Wed, 10 Dec 1997 09:29:43 -0800


I've been working with the 3Dfx graphics cards. They are PCI cards. The
way that an application/driver talks to these cards is by memory mapping
the card and then writing triangle data into the region. My driver
provides an mmap interface which maps the correct region on the card. To
do this I slightly modified the mmap code from /dev/mem so that it only
maps the correct region.
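
A minimal user-space sketch of the kind of mapping described above,
assuming a POSIX system. The names map_card_region and push_words, and
the offset argument, are hypothetical illustrations and not the actual
driver or library interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>

/* Map `len` bytes of an already-open device file, starting at the
 * page-aligned `offset` (hypothetical), shared and read/write.
 * Returns NULL on failure. */
volatile uint32_t *map_card_region(int fd, off_t offset, size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, offset);
    return (p == MAP_FAILED) ? NULL : (volatile uint32_t *)p;
}

/* Copy `count` 32-bit words into the mapped region, one store per word.
 * The volatile qualifier keeps the compiler from dropping or merging
 * the stores. */
void push_words(volatile uint32_t *dst, const uint32_t *src, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        dst[i] = src[i];
}
```

The caller opens the device node read/write, passes the descriptor to
map_card_region, and then hands each block of triangle data to
push_words.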

On a Pentium box this seems to perform OK, but not great. The
performance is slightly (a few percent) slower than the Windows
code. I've got a simple application that just pushes a single triangle
to the board a bunch of times in a loop. At the application level, the
call to draw a triangle is just a jmp to some hand-written assembly
code, so the applications are essentially identical.

On the PPro and PII boxes the performance is roughly half what it is
under Windows. Since the application is identical, and the interface to
the card is simply writing to a memory-mapped region, it seems that
the degradation must be happening somewhere in the kernel layers. I'm
currently mapping the region shared with read/write access.

A few other facts. This is with kernel versions 2.0.28 through 2.0.32.
The card is fast back-to-back capable, and /proc/pci confirms that.
Someone suggested that maybe I wasn't taking advantage of that.
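
For reference, a small sketch of how the two fast back-to-back bits in
PCI configuration space can be decoded. In the PCI specification the
"capable" flag is bit 7 of the status register (offset 0x06) and the
"enable" flag is bit 9 of the command register (offset 0x04). The macro
and function names here are hypothetical, and whether the enable bit has
anything to do with the slowdown is not established.

```c
#include <stdint.h>

#define FBB_STATUS_CAPABLE 0x0080 /* status register, bit 7 */
#define FBB_COMMAND_ENABLE 0x0200 /* command register, bit 9 */

/* Nonzero if the device reports fast back-to-back capability. */
int fbb_capable(uint16_t status)
{
    return (status & FBB_STATUS_CAPABLE) != 0;
}

/* Nonzero if fast back-to-back transactions are enabled. */
int fbb_enabled(uint16_t command)
{
    return (command & FBB_COMMAND_ENABLE) != 0;
}
```

A device can report the capability in its status register while the
enable bit in a command register is still clear.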

I'm fairly new to this level of hardware interfacing, so perhaps I've
missed something obvious. Any suggestions from you gurus out there?

Thanks,
- |Daryll