Re: Mmap device performance

Jeremy Fitzhardinge
Fri, 12 Dec 1997 01:02:07 +1100

Daryll Strauss wrote:
> On a Pentium box this seems to perform OK, but not great. The
> performance is slightly (a few percent) slower than the Windows
> code. I've got a simple application that just pushes a single triangle to the
> board a bunch of times in a loop. At the application level, the call to
> draw a triangle is just a jmp to some hand written assembly code, so the
> applications are essentially identical.
> On the PPro and PII boxes the performance is roughly half what it is
> under Windows. Since the application is identical, and the interface to
> the card is simply writing to a memory-mapped region, it seems that
> the degradation must be happening somewhere in the kernel layers. I'm
> currently mapping the region shared with read/write access.

I wonder if you're getting TLB thrashing? Does the software touch a
wide range of addresses? It could be that the Win 95 driver is using a
single 4MB page mapping, while under Linux it's getting a whole pile of
4KB pages, each of which will need its own TLB entry. If your memory
accesses bounce all over the place in virtual memory, you could be
constantly evicting the mappings in your TLB. Is it possible to use the
4MB page extension for mapping the hardware into a process address space?
Does X use it for mapping in video cards?
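
To make the entry-count argument concrete, here is a rough sketch that counts misses for scattered accesses under the two page sizes. Everything in it is an assumption for illustration: the 4MB region size, the 64-entry fully associative LRU TLB, and the function names are made up, and no real Pentium or PPro TLB is modelled.

```python
from collections import OrderedDict
import random

PAGE_4K = 4 * 1024          # ordinary x86 page size
PAGE_4M = 4 * 1024 * 1024   # large page size with the 4MB page extension


def pages_needed(region_bytes, page_bytes):
    """Number of pages (and so distinct TLB entries) covering a region."""
    return -(-region_bytes // page_bytes)  # ceiling division


def count_tlb_misses(addresses, page_bytes, tlb_entries):
    """Count misses in a toy fully associative LRU TLB.

    The replacement policy and associativity are assumptions; real
    hardware TLBs differ.
    """
    tlb = OrderedDict()
    misses = 0
    for addr in addresses:
        page = addr // page_bytes
        if page in tlb:
            tlb.move_to_end(page)        # mark as most recently used
        else:
            misses += 1
            tlb[page] = True
            if len(tlb) > tlb_entries:
                tlb.popitem(last=False)  # evict least recently used
    return misses


def scattered_addresses(region_bytes, count, seed=0):
    """Hypothetical access pattern: uniformly random offsets in the region."""
    rng = random.Random(seed)
    return [rng.randrange(region_bytes) for _ in range(count)]
```

A hypothetical 4MB region needs pages_needed(PAGE_4M, PAGE_4K) = 1024 entries with 4KB pages but only one with a 4MB page. Feeding scattered_addresses(PAGE_4M, 10000) through count_tlb_misses with 64 entries gives a single miss for the 4MB mapping, while the 4KB mapping misses on most accesses.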

On the other hand, if the Windows driver is using 4MB pages, it could do
so on the plain Pentium too, so you'd see a similar speed effect there.
Except that servicing a TLB miss might be relatively more expensive on a
PPro/PII - this would be consistent with your Pentium being a little
slower, but the PPro/PII being much slower.
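
The relative-cost point can be sketched with a simple average-cost model. The function name and all numbers used with it are hypothetical; nothing here is a measured figure for either CPU.

```python
def slowdown(miss_rate, miss_penalty_cycles, hit_cycles=1.0):
    """Ratio of average access cost with TLB misses to the miss-free cost.

    miss_rate is the fraction of accesses that miss the TLB, and
    miss_penalty_cycles is the extra cost of servicing one miss.
    Both are inputs you would have to measure; none are known here.
    """
    average_cycles = hit_cycles + miss_rate * miss_penalty_cycles
    return average_cycles / hit_cycles
```

With the same miss rate on both machines, the slowdown grows with the miss penalty relative to the hit cost: slowdown(0.5, 2) is 2.0, while slowdown(0.5, 20) is 11.0. So an identical access pattern could cost a few percent on one CPU and much more on another if the second pays more per miss.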

I gather the PII/PPro has a comprehensive set of registers for
monitoring performance problems like this, including a count of TLB
misses, cache misses, pipeline bubbles and so on. Maybe you can use
them to get a grip on what's happening. I think the Pentium only has
a cycle counter, but that would at least tell you whether your code is
taking longer than it should (though you know that already, I guess).