2.6.15.4: NFS-related BUG in radix_tree_tag_set()

From: Max Kellermann
Date: Thu Mar 02 2006 - 08:04:42 EST


Hi,

After a kernel upgrade from 2.6.11 to 2.6.15.4, we have been experiencing
crashes on all four of our web servers. These web servers obtain their
data via NFSv3 from a NetApp server. The servers were under heavy load -
mostly reading, but also a lot of writing to NFS.

Hardware: Compaq ProLiant with two (physical) Xeon 2.4 CPUs, 4 GB
memory, Broadcom Tigon3 network interfaces. Kernel config is appended
to this mail.

After one of the crashes, an administrator took a screenshot
(http://www.duempel.org/~max/linux/nfs_radix_tree_crash.png) and
rebooted. Unfortunately, part of the stack trace is missing (the
console shows only 25 lines), and I had no access to the KDB console.
I am currently waiting for the next crash to happen so I can provide
more information.

The BUG_ON() failed in lib/radix-tree.c:372:

    slot = slot->slots[offset];
    BUG_ON(slot == NULL);

I believe the calls missing from the stack trace are
nfs_mark_request_dirty(), nfs_flush_one(), nfs_flush_list(),
nfs_flush_inode().

That would mean that the entry at req->wb_index was somehow removed
from nfsi->nfs_page_tree, maybe by another thread on another CPU? I
see that the spinlock nfsi->req_lock is only held for very short
periods - is it possible that another CPU tries to flush the same NFS
write request while it is still being handled by the first CPU?
Is there any other explanation?
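
To illustrate the suspected sequence (insert, removal through some other
path, then a tag operation on the same index), here is a minimal userspace
sketch. It is not the kernel's code: the toy_* names, the fixed two-level
layout and the 4-slot fanout are assumptions made up for this example, and
where the kernel has BUG_ON(slot == NULL) the sketch simply returns -1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy tree: fixed height 2, 4 slots per node, indices 0..15. */
#define TOY_SHIFT  2
#define TOY_SLOTS  (1u << TOY_SHIFT)
#define TOY_MASK   (TOY_SLOTS - 1)
#define TOY_HEIGHT 2
#define TOY_MAX    ((1ul << (TOY_HEIGHT * TOY_SHIFT)) - 1)

struct toy_node {
    unsigned int tags;           /* one tag bit per slot */
    void *slots[TOY_SLOTS];      /* child nodes, or items at the leaf level */
};

struct toy_root {
    struct toy_node *rnode;
};

/* Store item at index, allocating the root and leaf nodes on demand. */
static int toy_insert(struct toy_root *root, unsigned long index, void *item)
{
    struct toy_node *leaf;
    unsigned int top = (index >> TOY_SHIFT) & TOY_MASK;

    if (index > TOY_MAX)
        return -1;
    if (root->rnode == NULL)
        root->rnode = calloc(1, sizeof(*root->rnode));
    if (root->rnode == NULL)
        return -1;
    if (root->rnode->slots[top] == NULL)
        root->rnode->slots[top] = calloc(1, sizeof(struct toy_node));
    leaf = root->rnode->slots[top];
    if (leaf == NULL)
        return -1;
    leaf->slots[index & TOY_MASK] = item;
    return 0;
}

/* Remove the item at index (nodes are never freed in this sketch). */
static void toy_delete(struct toy_root *root, unsigned long index)
{
    struct toy_node *leaf;

    if (index > TOY_MAX || root->rnode == NULL)
        return;
    leaf = root->rnode->slots[(index >> TOY_SHIFT) & TOY_MASK];
    if (leaf == NULL)
        return;
    leaf->slots[index & TOY_MASK] = NULL;
    leaf->tags &= ~(1u << (index & TOY_MASK));
}

/*
 * Walk from the root to the item, setting the tag bit at every level.
 * Returns 0 on success and -1 if the walk runs into an empty slot,
 * which is the condition the quoted BUG_ON() checks for.
 */
static int toy_tag_set(struct toy_root *root, unsigned long index)
{
    struct toy_node *slot = root->rnode;
    int level;

    if (index > TOY_MAX || slot == NULL)
        return -1;
    for (level = TOY_HEIGHT - 1; level >= 0; level--) {
        unsigned int offset = (index >> (level * TOY_SHIFT)) & TOY_MASK;

        slot->tags |= 1u << offset;
        slot = slot->slots[offset];
        if (slot == NULL)
            return -1;           /* item is no longer in the tree */
    }
    return 0;
}
```

Calling toy_insert() for index 5 and then toy_tag_set() on it returns 0;
if toy_delete() runs on index 5 in between, the same toy_tag_set() call
returns -1. In the sketch the removal is an explicit call; in the scenario
above it would have to come from another CPU between two short lock-held
sections.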

Max

Attachment: .config.gz
Description: Binary data