[PATCH] Add kallsyms_lookup() result cache

From: Brent Casavant
Date: Fri Jun 18 2004 - 15:18:34 EST

Next message: Rik van Riel: "Re: Stop the Linux kernel madness"
Previous message: Ian Morgan: "Re: [linux-usb-devel] Re: USBDEVFS_RESET deadlocks USB bus."
Next in thread: Martin Josefsson: "Re: [PATCH] Add kallsyms_lookup() result cache"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 2.6 based systems, the top command utilizes /proc/[pid]/wchan to
determine WCHAN symbol name information. This information is provided
by the kernel function kallsyms_lookup(), which expands a stem-compressed
list of kernel symbols in order to provide a symbol name. The decompression
is relatively slow, performing a lot of throw-away work in the form of
strncpy() invocations whose results are overwritten immediately. On systems
with many processes, this results in the kernel consuming a large amount of
time on top's behalf performing the decompression. For example, on a
large SGI Altix system with 4500 current tasks, there is approximately
8 seconds worth of work to be done with each top screen update.

The following patch adds a result cache to kallsyms_lookup(), cacheing
previously decompressed symbol names and using them on subsequent
kallsyms_lookup() calls. With this change, top with a 1 second update
now consumes only about 60% of a CPU on the previously mentioned sytem,
as opposed to the prior 800%. A trivial test program which repeatedly
opens/reads/closes /proc/1/wchan improves from 160 iterations per second
to 30000.

The cache size of 32 items was chosen experimentally. 16 was too few
to cache all the results on even a small system with approximatley 100
tasks, and 64 was too large for all but the most extreme cases. Cacheing
was chosen instead of improving the expansion algorithm due to the large
number of strlen() invocations which would have been necessary with that
solution.

Further work obviously needs to be done for top itself to reduce CPU
utilization even further, but this is a large first step. Some quick
experiments indicate that the slow path has now moved to other areas
of code, which will be addressed in due course.

Signed-off-by: Brent Casavant <bcasavan@xxxxxxx>

--- linux.orig/kernel/kallsyms.c 2004-03-16 14:13:30.000000000 -0600
+++ linux/kernel/kallsyms.c 2004-06-18 14:28:17.000000000 -0500
@@ -56,6 +56,16 @@
return module_kallsyms_lookup_name(name);
}

+/* Cache of uncompressed symbol names from kallsyms_lookup() */
+#define NAMECACHE_SIZE 32
+static int namecache_index;
+static struct {
+ unsigned long addr;
+ unsigned long nextsym;
+ char name[127];
+} namecache[NAMECACHE_SIZE];
+static rwlock_t namecache_lock;
+
/* Lookup an address. modname is set to NULL if it's in the kernel. */
const char *kallsyms_lookup(unsigned long addr,
unsigned long *symbolsize,
@@ -74,6 +84,22 @@
unsigned long symbol_end;
char *name = kallsyms_names;

+ /* Search for a cached response */
+ read_lock(&namecache_lock);
+ for (i = 0; i < NAMECACHE_SIZE; i++) {
+ if (namecache[i].addr &&
+ namecache[i].addr <= addr &&
+ addr < namecache[i].nextsym) {
+ strncpy(namebuf, namecache[i].name, 127);
+ *symbolsize = namecache[i].nextsym - namecache[i].addr;
+ *modname = NULL;
+ *offset = addr - namecache[i].addr;
+ read_unlock(&namecache_lock);
+ return namebuf;
+ }
+ }
+ read_unlock(&namecache_lock);
+
/* They're sorted, we could be clever here, but who cares? */
for (i = 0; i < kallsyms_num_syms; i++) {
if (kallsyms_addresses[i] > kallsyms_addresses[best] &&
@@ -99,6 +125,15 @@
*symbolsize = symbol_end - kallsyms_addresses[best];
*modname = NULL;
*offset = addr - kallsyms_addresses[best];
+
+ /* Cache the results */
+ write_lock(&namecache_lock);
+ namecache[namecache_index].addr = kallsyms_addresses[best];
+ namecache[namecache_index].nextsym = symbol_end;
+ strncpy(namecache[namecache_index].name, namebuf, 127);
+ namecache_index = (namecache_index + 1) % NAMECACHE_SIZE;
+ write_unlock(&namecache_lock);
+
return namebuf;
}

--
Brent Casavant bcasavan@xxxxxxx Forget bright-eyed and
Operating System Engineer http://www.sgi.com/ bushy-tailed; I'm red-
Silicon Graphics, Inc. 44.8562N 93.1355W 860F eyed and bushy-haired.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Rik van Riel: "Re: Stop the Linux kernel madness"
Previous message: Ian Morgan: "Re: [linux-usb-devel] Re: USBDEVFS_RESET deadlocks USB bus."
Next in thread: Martin Josefsson: "Re: [PATCH] Add kallsyms_lookup() result cache"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]