Re: Kernel 2.6.30 and udevd problem

From: Jesse Barnes
Date: Thu Jul 02 2009 - 12:36:04 EST


On Thu, 2 Jul 2009 08:18:58 +0200
Alberto Gonzalez <alberto6674@xxxxxxxxx> wrote:

> On Wednesday 01 July 2009 19:22:14 Jesse Barnes wrote:
> > On Wed, 1 Jul 2009 09:09:03 +0200
> >
> > Alberto Gonzalez <alberto6674@xxxxxxxxx> wrote:
> > > On Tuesday 30 June 2009 18:08:35 Jesse Barnes wrote:
> > > > On Tue, 30 Jun 2009 04:46:38 +0100 (IST)
> > > >
> > > > Dave Airlie <airlied@xxxxxxxx> wrote:
> > > > > > On Sunday 28 June 2009 16:28:44 Kay Sievers wrote:
> > > > > > > If there isn't something else running which acts on
> > > > > > > uevents that trigger drm events, which I wouldn't expect,
> > > > > > > it seems like a drm kernel problem.
> > > > > >
> > > > > > Ok, thanks for looking at it. I'll sum up the problem for
> > > > > > DRM people:
> > > > > >
> > > > > > The problem started after upgrading to 2.6.30. At some
> > > > > > point, udevd starts to use a lot of CPU time. It happens
> > > > > > randomly, but it seems easier to trigger when running
> > > > > > something graphics intensive (glxgears, gtkperf,
> > > > > > tuxracer..).
> > > > > >
> > > > > > Killing udevd and starting it with the --debug switch
> > > > > > throws up this when the problem starts:
> > > > >
> > > > > I've added jbarnes to the list,
> > > > >
> > > > > > Jesse, are we sending events yet? What for?
> > > >
> > > > Right now we just send uevents at hotplug time, so maybe one of
> > > > our hotplug interrupt bits is getting stuck, resulting in a
> > > > continuous stream of events as we generate other interrupts
> > > > (which would happen when running 3D apps for example).
> > > >
> > > > There's a DRM_DEBUG statement in drivers/gpu/drm/i915/i915_irq.c
> > > > under the if (I915_HAS_HOTPLUG(dev)) { check, if you make it
> > > > into DRM_ERROR we can see which one is getting stuck.
> > >
> > > I am afraid I'll need a bit more guidance here. I guess this means
> > > patching the kernel. Would it be possible to get a test patch
> > > against 2.6.30? And then after patching and compiling, how should
> > > I debug it?
> >
> > Here's a patch against git; I think it should apply to 2.6.30 as
> > well.
> >
> > I'll just need your dmesg output from when the problem is occurring
> > (if I'm right this patch should flood your logs).
>
> Thanks, the patch applied to 2.6.30 and when the problem starts I get
> these two lines repeated all the time in dmesg:
>
> [drm:i915_driver_irq_handler] *ERROR* hotplug event received, stat 0x10200300
> [drm:i915_driver_irq_handler] *ERROR* hotplug event received, stat 0x18200300
>
> Is this enough or should I provide something else?

Interesting. That hotplug status shouldn't end up generating any
uevents. Can you try this patch to see what's going on?

--
Jesse Barnes, Intel Open Source Technology Center

diff --git a/drivers/gpu/drm/drm_sysfs.c b/drivers/gpu/drm/drm_sysfs.c
index 85ec31b..d4c6d82 100644
--- a/drivers/gpu/drm/drm_sysfs.c
+++ b/drivers/gpu/drm/drm_sysfs.c
@@ -456,6 +456,8 @@ void drm_sysfs_hotplug_event(struct drm_device *dev)
 	char *event_string = "HOTPLUG=1";
 	char *envp[] = { event_string, NULL };
 
+	WARN(1, "hotplug uevent\n");
+
 	DRM_DEBUG("generating hotplug event\n");
 
 	kobject_uevent_env(&dev->primary->kdev.kobj, KOBJ_CHANGE, envp);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 228546f..29e1e4b 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -353,10 +353,12 @@ irqreturn_t i915_driver_irq_handler(DRM_IRQ_ARGS)
 	    (iir & I915_DISPLAY_PORT_INTERRUPT)) {
 		u32 hotplug_status = I915_READ(PORT_HOTPLUG_STAT);
 
-		DRM_DEBUG("hotplug event received, stat 0x%08x\n",
-			  hotplug_status);
-		if (hotplug_status & dev_priv->hotplug_supported_mask)
+		if (hotplug_status & dev_priv->hotplug_supported_mask) {
+			DRM_DEBUG("hotplug event received: 0x%08x\n",
+				  hotplug_status &
+				  dev_priv->hotplug_supported_mask);
 			schedule_work(&dev_priv->hotplug_work);
+		}
 
 		I915_WRITE(PORT_HOTPLUG_STAT, hotplug_status);
 		I915_READ(PORT_HOTPLUG_STAT);
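
For anyone following along, the check in the i915 hunk above can be
tried in userspace with the two status values from the dmesg output.
This is only a rough sketch: EXAMPLE_SUPPORTED_MASK and the helper names
are made up for illustration, and the real mask is whatever
dev_priv->hotplug_supported_mask holds on the affected hardware, which
is not shown in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical mask, chosen arbitrarily for illustration (bit 11 only).
 * The real value is dev_priv->hotplug_supported_mask and depends on
 * the hardware.
 */
#define EXAMPLE_SUPPORTED_MASK 0x00000800u

/* Status bits that survive the supported-mask filter. */
static uint32_t hotplug_relevant_bits(uint32_t status, uint32_t supported_mask)
{
	return status & supported_mask;
}

/* Non-zero when the handler would call schedule_work() for this status. */
static int hotplug_would_schedule(uint32_t status, uint32_t supported_mask)
{
	return hotplug_relevant_bits(status, supported_mask) != 0;
}

/* Bits that differ between two status readings. */
static uint32_t hotplug_changed_bits(uint32_t a, uint32_t b)
{
	return a ^ b;
}

/* Print the filter result for the two values reported in dmesg. */
static void hotplug_demo(void)
{
	const uint32_t a = 0x10200300u;
	const uint32_t b = 0x18200300u;

	/* The two readings differ in bit 27 only. */
	assert(hotplug_changed_bits(a, b) == 0x08000000u);

	printf("0x%08x -> schedule: %d\n", (unsigned)a,
	       hotplug_would_schedule(a, EXAMPLE_SUPPORTED_MASK));
	printf("0x%08x -> schedule: %d\n", (unsigned)b,
	       hotplug_would_schedule(b, EXAMPLE_SUPPORTED_MASK));
}
```

Calling hotplug_demo() from a main() shows whether either value would
pass the filter under the assumed mask. With the real mask substituted,
a non-zero result would mean the handler schedules hotplug work for
that status.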