next-20130206 x86_64- high CPU usage, spinlock issue

From: Valdis Kletnieks
Date: Mon Feb 18 2013 - 16:13:21 EST


Over the weekend, I had a number of occurrences of my laptop becoming
unresponsive for periods of up to several minutes. gkrellm monitors showed
near 100% system time for both CPUs (and the way X and other userspace programs
were behaving was consistent with them being starved for CPU for extended
periods - at one point gkrellm was unable to update the display for over 4
minutes). The problem would hit, everything would get sluggish/stop for
anywhere from 15 seconds or so to several minutes - and then it would just
as mysteriously go away for no apparent reason, only to return a bit later...

The only other clue I was able to get was running 'perf top' to see where
the CPU was going - anytime the kernel was in this state, I'd see the top
two entries were consistently delay_tsc and do_raw_read_lock:

25.29% [kernel] [k] delay_tsc
23.47% [kernel] [k] do_raw_read_lock
8.01% [kernel] [k] ext4_es_reclaim_extents_count
7.44% [kernel] [k] ftrace_likely_update
2.80% [kernel] [k] lock_release
2.79% [kernel] [k] do_raw_read_unlock
2.65% [kernel] [k] lock_acquire

15.33% [kernel] [k] delay_tsc
14.02% [kernel] [k] do_raw_read_lock
6.80% libc-2.17.so [.] re_search_internal
5.55% libmagic.so.1.0.0 [.] 0x000000000000af9a
5.16% [kernel] [k] ftrace_likely_update
4.79% [kernel] [k] ext4_es_reclaim_extents_count

22.94% [kernel] [k] delay_tsc
19.84% [kernel] [k] do_raw_read_lock
6.92% [kernel] [k] ext4_es_reclaim_extents_count
6.79% [kernel] [k] ftrace_likely_update
2.62% libc-2.17.so [.] re_search_internal

I'm not sure why ext4_es-reclaim_extents_count appears a lot - the workload
that was running at the time was seeing some memory pressure due to a large
Firefox process, and a 'find | xargs file' command causing a lot of inode and
data read I/O, but there shouldn't have been a lot of reclaims of extents as
not much was being written anyplace?

I can try to reproduce the issue - any suggestions how to efficiently tell
who/what is doing the contending in do_raw_read_lock?

Attachment: pgp00000.pgp
Description: PGP signature