Re: Intel graphics CPU usage - SDVO detect bogosity?

From: Andy Lutomirski
Date: Mon Aug 16 2010 - 00:12:54 EST


Linus Torvalds wrote:
I started wondering why 'top' was showing an otherwise idle system as
having a load average of 0.5+, and worker threads constantly using the
CPU.

So I did a system-wide profile, and got the attached output (look at
it in a really wide terminal).

There seems to be something _seriously_ wrong with i915 SDVO detect.
This is on an Apple Mac Mini (hey, your favorite problem child!), and
apparently it spends 20% of its non-idle CPU time just doing udelay's
for the i2c SDVO connection detection.

You might be hitting the infamous hotplug storm [1]. The symptoms vary by kernel version.

2.6.34 and before: udevadm monitor shows craploads of events and, as long as X is running, X keeps reprobing the outputs, which (depending on the particular bug) can suck CPU in the i2c code or cause more hotplug events. It also makes X oddly laggy.
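
A rough way to check for a storm on such a kernel is to count drm uevents over a short window. The Python below is only a sketch: it assumes the monitor prints kernel events as "KERNEL[timestamp] action devpath (subsystem)", and the helper name and sample lines are made up for illustration, not taken from a real capture.

```python
import re

# Assumed line shape from `udevadm monitor --kernel`:
#   KERNEL[1234.567890] change   /devices/.../drm/card0 (drm)
EVENT_RE = re.compile(
    r"^KERNEL\[(?P<ts>[0-9.]+)\]\s+(?P<action>\S+)\s+(?P<path>\S+)"
    r"\s+\((?P<subsys>[^)]+)\)\s*$"
)


def count_drm_events(lines):
    """Count kernel uevents whose subsystem is 'drm' (hypothetical helper)."""
    count = 0
    for line in lines:
        match = EVENT_RE.match(line)
        if match and match.group("subsys") == "drm":
            count += 1
    return count


# Invented sample lines, only to show the expected shape of the input.
SAMPLE = [
    "KERNEL[100.000001] change   /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)\n",
    "KERNEL[100.250000] add      /devices/virtual/input/input7 (input)\n",
    "KERNEL[100.500002] change   /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)\n",
    "monitor will print the received events for:\n",
]

if __name__ == "__main__":
    print(count_drm_events(SAMPLE))
```

Feeding it a few seconds of real monitor output from an idle desktop should give a count near zero; a steady stream of drm change events would point at the storm.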

2.6.35 and newer: The kernel is smart enough to probe outputs itself before telling X, so the events never hit userspace. But things still can get a bit laggy.

Anyone know why merely *reading* /sys/class/drm/whatever/status causes the output to get probed? (I see it in the code, but I have no idea why that code's still there after most of the rest of the hotplug code got cleaned up in 2.6.36.)
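
For anyone poking at this from userspace: since reading a status file triggers a probe, a script that wants to enumerate connectors should list the files without opening them. A minimal Python sketch, with a hypothetical helper name; the sysfs root is a parameter so it can be tried against a scratch directory first.

```python
import glob
import os


def connector_status_paths(sysfs_root="/sys/class/drm"):
    """Return the per-connector status file paths, sorted.

    Only the directory is listed; the files are never opened, so this
    should not itself cause the outputs to be probed.
    """
    return sorted(glob.glob(os.path.join(sysfs_root, "*", "status")))


if __name__ == "__main__":
    for path in connector_status_paths():
        print(path)
```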

Once I find some free time, I plan on trying to at least fix the issue that causes this bug for me. (It's apparently quite nontrivial due to silliness in the way dock/undock (!) works on some laptops.)


[1] for example: http://www.mail-archive.com/intel-gfx@xxxxxxxxxxxxxxxxxxxxx/msg00921.html


That sounds a bit wrong, doesn't it?

I don't know how recent this is - it might have been going on for some
time without me noticing. It's the wife's computer, and the same thing
doesn't seem to happen on my Core i5 desktop.

Any ideas? Any information I can give about the machine?

If I'm right, the outputs of intel_bios_dumper and intel_bios_reader could be instructive (both are in intel-gpu-tools).

You could also try intel_reg_write 0x61110 0x0 and see if the problem stops (at least until a suspend/resume cycle). That command turns off output hotplug on the card, which has the side effect that the kernel will stop acting on bogus interrupts.

--Andy