Profiling/finding bottleneck in kernel code

B. James Phillippe (bryan@terran.org)
Tue, 30 Mar 1999 16:44:21 -0800 (PST)


Greetings,

I have been working on a project for the last several months to improve
performance of a (dated) IPSec stack for Linux 2.0. One of the steps I
took was to move to hardware encryption, using Rainbow's hardware
accelerator based on the Hifn 7711 encryption chip. Rainbow provided a
Linux driver for their card, but it worked synchronously: the kernel had
to busy-wait for the completion interrupt on every operation. I made an
asynchronous version in which the request is simply submitted or queued,
and the ISR then resumes IPSec packet processing from the bottom-half
handler. This allows several packets to be received and queued during
the bh operation. The design seemed quite sound to me, so you can
imagine my surprise at finding that the performance figures were lower
than for synchronous mode, or even software-only mode! :( I know there
must be a fairly straightforward problem in the mechanics of the setup,
but I'm not sure how to profile it. Is there a good technique the experts
can recommend for finding where most of the time is being spent in the routine?
I am tempted to printk jiffy counts at key locations (save the values, then
print them later so the printk doesn't interfere). Any other ideas?
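
To make the save-then-print idea concrete, here is a minimal sketch in
plain C. It is only an illustration under stated assumptions, not code
from the driver: the names trace_mark, trace_delta, trace_dump and
TRACE_MAX are hypothetical, and the caller supplies the timestamp. In
the kernel the stamp would be jiffies (which on i386 typically ticks at
HZ=100, so probably too coarse for per-packet timing) or a CPU cycle
counter, and the dump would go through printk rather than fprintf.

```c
#include <assert.h>
#include <stdio.h>

#define TRACE_MAX 64            /* hypothetical capacity; size to taste */

/* One checkpoint: a label plus whatever tick count the caller read. */
struct trace_rec {
    const char   *tag;
    unsigned long stamp;
};

static struct trace_rec trace_buf[TRACE_MAX];
static int trace_len;

/*
 * Record a checkpoint. Only two stores and an increment, so it is cheap
 * enough for the hot path; nothing is printed here. Records beyond
 * TRACE_MAX are silently dropped.
 */
static void trace_mark(const char *tag, unsigned long stamp)
{
    if (trace_len < TRACE_MAX) {
        trace_buf[trace_len].tag = tag;
        trace_buf[trace_len].stamp = stamp;
        trace_len++;
    }
}

/*
 * Ticks elapsed between checkpoint i-1 and checkpoint i. Unsigned
 * subtraction stays correct if the counter wraps once in between.
 */
static unsigned long trace_delta(int i)
{
    assert(i > 0 && i < trace_len);
    return trace_buf[i].stamp - trace_buf[i - 1].stamp;
}

/*
 * Print every interval after the measured path has finished, then reset
 * the buffer. Returns the number of intervals printed.
 */
static int trace_dump(FILE *out)
{
    int i, n = 0;

    assert(out != NULL);
    for (i = 1; i < trace_len; i++, n++)
        fprintf(out, "%s -> %s: %lu ticks\n",
                trace_buf[i - 1].tag, trace_buf[i].tag, trace_delta(i));
    trace_len = 0;
    return n;
}
```

The intended use would be a trace_mark("submit", stamp) where the
request is queued, another in the ISR, and another at the start and end
of the bottom half, with trace_dump called from somewhere that is not
time-critical. The largest interval then points at the stage to look at
first.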

thanks,
-bp

--
B. James Phillippe		. bryan@terran.org
Software Engineer, WGT Inc.	. http://www.terran.org/~bryan
