Re: System slow down from udev

From: Rafael J. Wysocki
Date: Thu May 30 2013 - 10:43:34 EST


On Thursday, May 30, 2013 04:34:51 PM Rafael J. Wysocki wrote:
> [Adding CC to Toshi Kani just in case he has an idea.]
>
> On Wednesday, May 29, 2013 06:55:33 PM Yinghai Lu wrote:
> > On Wed, May 29, 2013 at 4:55 PM, Rafael J. Wysocki <rjw@xxxxxxx> wrote:
> > > On Wednesday, May 29, 2013 03:49:38 PM Yinghai Lu wrote:
> > >> On Wed, May 29, 2013 at 2:34 PM, Rafael J. Wysocki <rjw@xxxxxxx> wrote:
> > >> > On Wednesday, May 29, 2013 01:13:46 PM Yinghai Lu wrote:
> > >> >> On Wed, May 29, 2013 at 4:29 AM, Rafael J. Wysocki <rjw@xxxxxxx> wrote:
> > >> >> > On your systems the processor driver is built-in. Any chance to build it as
> > >> >> > a module and see if that helps?
> > >> >>
> > >> >> If CONFIG_ACPI_PROCESSOR is not set in the config,
> > >> >> the boot gets back to normal speed.
> > >> >
> > >> > Well, if it is not set at all, there won't be problems with it. :-)
> > >> >
> > >> > I've tested my linux-next branch on OpenSUSE 11.3 both with the processor
> > >> > driver built in and modular and I'm not able to reproduce the issue you're
> > >> > seeing.
> > >> >
> > >> > Moreover, I'm not sure if user space is involved here at all, because the
> > >> > problem triggers for you when all of the relevant kernel code is non-modular.
> > >> >
> > >> > With the processor driver enabled, when the slowdown happens, are the systems
> > >> > usable enough to get some debug info out of them?
> > >>
> > >> please check the bootchart data.
> > >>
> > >> It looks like it takes 200s if there is no acpi_processor ...
> > >> Otherwise it takes 800s or more.
> > >
> > > Well, something's fishy for sure.
> > >
> > > To my eyes it looks like we're getting lots of notifications related to the
> > > processor driver and that generates a lot of workqueue load.
> > >
> > > Can you please get /proc/interrupts from both cases and the output of
> > > "find /sys/firmware/acpi/interrupts/ -print -exec cat {} \;"?
>
> Thanks for the info!
>
> > sca05-0a818ce5:~/g5_acpi_driver # find /sys/firmware/acpi/interrupts/ -print -exec cat {} \;
> > /sys/firmware/acpi/interrupts/
> > cat: /sys/firmware/acpi/interrupts/: Is a directory
> > /sys/firmware/acpi/interrupts/sci
> > 0
> > /sys/firmware/acpi/interrupts/error
> > 0
> > /sys/firmware/acpi/interrupts/gpe00
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe01
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe02
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe03
> > 0 disabled
> > /sys/firmware/acpi/interrupts/gpe04
> > 0 disabled
> > /sys/firmware/acpi/interrupts/gpe05
> > 0 disabled
> > /sys/firmware/acpi/interrupts/gpe06
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe07
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe08
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe09
> > 0 disabled
> > /sys/firmware/acpi/interrupts/gpe10
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe11
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe12
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe13
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe14
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe15
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe16
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe0A
> > 0 enabled
> > /sys/firmware/acpi/interrupts/gpe17
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe0B
> > 0 disabled
> > /sys/firmware/acpi/interrupts/gpe18
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe0C
> > 0 disabled
> > /sys/firmware/acpi/interrupts/gpe19
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe0D
> > 0 disabled
> > /sys/firmware/acpi/interrupts/gpe0E
> > 0 disabled
> > /sys/firmware/acpi/interrupts/gpe20
> > 0 disabled
> > /sys/firmware/acpi/interrupts/gpe0F
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe21
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe22
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe23
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe24
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe25
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe26
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe1A
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe27
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe1B
> > 0 disabled
> > /sys/firmware/acpi/interrupts/gpe28
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe1C
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe29
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe1D
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe1E
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe30
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe1F
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe31
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe32
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe33
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe34
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe35
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe36
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe2A
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe37
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe2B
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe38
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe2C
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe39
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe2D
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe2E
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe2F
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe3A
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe3B
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe3C
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe3D
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe3E
> > 0 invalid
> > /sys/firmware/acpi/interrupts/gpe3F
> > 0 invalid
> > /sys/firmware/acpi/interrupts/sci_not
> > 0
> > /sys/firmware/acpi/interrupts/ff_pmtimer
> > 0 invalid
> > /sys/firmware/acpi/interrupts/ff_rt_clk
> > 0 disabled
> > /sys/firmware/acpi/interrupts/gpe_all
> > 0
> > /sys/firmware/acpi/interrupts/ff_gbl_lock
> > 0 enabled
> > /sys/firmware/acpi/interrupts/ff_pwr_btn
> > 0 enabled
> > /sys/firmware/acpi/interrupts/ff_slp_btn
> > 0 invalid
>
> OK, no GPEs. Interesting.
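
For anyone repeating this check on a machine with many GPEs, the same
information can be boiled down to the counters that are actually nonzero.
The following is only a rough sketch: it assumes each file holds a count
optionally followed by a status, as in the output above, and the function
names are made up for illustration.

```python
from pathlib import Path


def read_acpi_interrupts(root="/sys/firmware/acpi/interrupts"):
    """Return {name: (count, status)} for every counter file under root.

    Assumes the "count [status]" layout seen in the quoted output,
    e.g. "0 invalid" or just "0". Returns {} if root is not a directory.
    """
    root = Path(root)
    if not root.is_dir():
        return {}
    counters = {}
    for entry in sorted(root.iterdir()):
        if not entry.is_file():
            continue
        fields = entry.read_text().split()
        if not fields:
            continue
        # First field is the event count, the rest (if any) is the status.
        counters[entry.name] = (int(fields[0]), " ".join(fields[1:]))
    return counters


def nonzero_counters(counters):
    """Keep only the counters that have fired at least once."""
    return {name: value for name, value in counters.items() if value[0] > 0}
```

Calling nonzero_counters(read_acpi_interrupts()) on the system above should
give an empty dict, since every counter in the listing reads 0.
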
>
> > > Also please send the output of "ls -l /sys/devices/system/cpu/cpu*" with the
> > > processor driver present.
> >
> > sca05-0a818ce5:~/g5_acpi_driver # ls -l /sys/devices/system/cpu/cpu*
> > /sys/devices/system/cpu/cpu0:
> > total 0
> > drwxr-xr-x 6 root root 0 May 30 20:09 cache
> > drwxr-xr-x 5 root root 0 May 30 20:09 cpuidle
> > -r-------- 1 root root 4096 May 30 20:09 crash_notes
> > -r-------- 1 root root 4096 May 30 20:09 crash_notes_size
> > lrwxrwxrwx 1 root root 0 May 30 20:09 driver -> ../../../../bus/cpu/drivers/processor
> > lrwxrwxrwx 1 root root 0 May 30 20:09 firmware_node -> ../../../LNXSYSTM:00/LNXCPU:00
> > lrwxrwxrwx 1 root root 0 May 30 20:09 node0 -> ../../node/node0
> > drwxr-xr-x 2 root root 0 May 30 20:09 power
> > lrwxrwxrwx 1 root root 0 May 30 20:03 subsystem -> ../../../../bus/cpu
> > drwxr-xr-x 2 root root 0 May 30 20:09 thermal_throttle
> > drwxr-xr-x 2 root root 0 May 30 20:09 topology
> > -rw-r--r-- 1 root root 4096 May 30 20:03 uevent
> >
> > /sys/devices/system/cpu/cpu1:
> > total 0
> > drwxr-xr-x 6 root root 0 May 30 20:09 cache
> > drwxr-xr-x 5 root root 0 May 30 20:09 cpuidle
> > -r-------- 1 root root 4096 May 30 20:09 crash_notes
> > -r-------- 1 root root 4096 May 30 20:09 crash_notes_size
> > lrwxrwxrwx 1 root root 0 May 30 20:09 driver -> ../../../../bus/cpu/drivers/processor
> > lrwxrwxrwx 1 root root 0 May 30 20:09 firmware_node -> ../../../LNXSYSTM:00/LNXCPU:01
> > lrwxrwxrwx 1 root root 0 May 30 20:09 node0 -> ../../node/node0
> > -rw-r--r-- 1 root root 4096 May 30 20:09 online
> > drwxr-xr-x 2 root root 0 May 30 20:09 power
> > lrwxrwxrwx 1 root root 0 May 30 20:03 subsystem -> ../../../../bus/cpu
> > drwxr-xr-x 2 root root 0 May 30 20:09 thermal_throttle
> > drwxr-xr-x 2 root root 0 May 30 20:09 topology
> > -rw-r--r-- 1 root root 4096 May 30 20:03 uevent
>
> [Skipping a number of analogous items.]
>
> Well, it seems to initialize correctly at least.
>
> > /sys/devices/system/cpu/cpuidle:
> > total 0
> > -r--r--r-- 1 root root 4096 May 30 20:10 current_driver
> > -r--r--r-- 1 root root 4096 May 30 20:10 current_governor_ro
>
> Well, this shows that my previous suspicion regarding notifications wasn't
> justified, as there are none of them, apparently.
>
> Also the CPUs' directory structures in sysfs look correct to me. The
> driver binds to the devices it is supposed to bind to and acpi_bind_one()
> works as expected. Hmm.
>
> Let's check whether thermal throttling is going on. Please send the output of:
> $ find /sys/devices/system/cpu/ -name core_throttle_count -print -exec cat {} \;
> $ find /sys/devices/system/cpu/ -name package_throttle_count -print -exec cat {} \;
>
> from the affected systems.
>
> I'll try to dig deeper locally in the meantime.
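
If the raw find output is too long to eyeball on a box with many CPUs, the
throttle counters can be collected per CPU instead. This is a sketch only:
it assumes the counters live in each cpuN/thermal_throttle directory (the
directory visible in the ls output above), and the helper names are
hypothetical.

```python
from pathlib import Path

THROTTLE_FILES = ("core_throttle_count", "package_throttle_count")


def throttle_counts(root="/sys/devices/system/cpu", names=THROTTLE_FILES):
    """Return {cpu: {counter_name: count}} for each cpuN under root.

    Assumes the layout cpuN/thermal_throttle/<counter_name>, with each
    file holding a single integer.
    """
    counts = {}
    for name in names:
        pattern = f"cpu[0-9]*/thermal_throttle/{name}"
        for path in sorted(Path(root).glob(pattern)):
            cpu = path.parts[-3]  # .../cpuN/thermal_throttle/<name>
            counts.setdefault(cpu, {})[name] = int(path.read_text().strip())
    return counts


def throttled(counts):
    """Keep only the CPUs that report at least one throttle event."""
    return {cpu: c for cpu, c in counts.items()
            if any(value > 0 for value in c.values())}
```

An empty result from throttled(throttle_counts()) would suggest that thermal
throttling is not what slows these systems down.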

Actually, I think I know what the problem is, but I need some more time to
debug it. Fortunately, I'm able to see some symptoms. :-)

Thanks,
Rafael


--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.