Re: [PATCH] [CPUFREQ] fix race condition in store_scaling_governor

From: Américo Wang
Date: Thu May 13 2010 - 05:09:25 EST


On Thu, May 13, 2010 at 01:58:14AM +0200, Andrej Gelenberg wrote:
>Hi,
>
>On 05/13/2010 12:00 AM, Andrew Morton wrote:
>>
>>Looks sane, I guess.
>>
>>I am afraid of moving all those functions inside
>>cpufreq_governor_mutex. Not for any specific reason, apart from a long
>>history of nasty deadlocks with cpufreq global locks :(
>>
>>Has this change been well-tested with lockdep enabled?
>
>It prevents at least the kernel panic and the warnings from sysfs,
>but causes a deadlock. I can confirm the bug in 2.6.33-ARCH (latest
>stable kernel in Arch Linux):
>


Well, this is not a panic, it is just a WARNING.


>------------[ cut here ]------------
>WARNING: at fs/sysfs/dir.c:487 sysfs_add_one+0xc5/0x150()
>Hardware name: 287655G
>sysfs: cannot create duplicate filename
>'/devices/system/cpu/cpu0/cpufreq/ondemand'
>Modules linked in: cpufreq_conservative cpufreq_ondemand powernow_k8
>freq_table joydev radeon ttm drm_kms_helper snd_seq_dummy uvcvideo
>drm videodev rfkill i2c_algo_bit snd_seq_oss v4l1_compat usb_storage
>v4l2_compat_ioctl32 snd_seq_midi_event led_class snd_seq
>snd_seq_device nvram snd_hda_codec_conexant snd_hda_intel video
>snd_pcm_oss snd_mixer_oss output snd_hda_codec snd_hwdep snd_pcm
>snd_timer snd ohci_hcd soundcore shpchp ehci_hcd ac wmi battery sg
>thermal processor button snd_page_alloc psmouse i2c_piix4 edac_core
>pci_hotplug r8169 usbcore mii edac_mce_amd serio_raw i2c_core k8temp
>evdev pcspkr rtc_cmos rtc_core rtc_lib ext4 mbcache jbd2 crc16 cryptd
>aes_x86_64 aes_generic xts gf128mul dm_crypt dm_mod sd_mod ahci
>libata scsi_mod
>Pid: 3136, comm: test_cpu.sh Tainted: G W 2.6.33-ARCH #1
>Call Trace:
> [<ffffffff810529f6>] warn_slowpath_common+0x76/0xb0
> [<ffffffff81052a8c>] warn_slowpath_fmt+0x3c/0x40
> [<ffffffff81187f45>] sysfs_add_one+0xc5/0x150
> [<ffffffff81188033>] create_dir+0x63/0xc0
> [<ffffffff811880a6>] sysfs_create_subdir+0x16/0x20
> [<ffffffff8118950a>] internal_create_group+0x5a/0x190
> [<ffffffff8118966e>] sysfs_create_group+0xe/0x10
> [<ffffffffa056fcfc>] cpufreq_governor_dbs+0xac/0x3e0 [cpufreq_ondemand]
> [<ffffffff810788bd>] ? notifier_call_chain+0x4d/0x70
> [<ffffffff81293f25>] __cpufreq_governor+0xf5/0x1e0
> [<ffffffff812954ec>] __cpufreq_set_policy+0x13c/0x180
> [<ffffffff812958f8>] store_scaling_governor+0xe8/0x220
> [<ffffffff81296240>] ? handle_update+0x0/0x10
> [<ffffffff811cb7ba>] ? kobject_get+0x1a/0x30
> [<ffffffff81295382>] store+0x62/0x90
> [<ffffffff81186820>] sysfs_write_file+0xe0/0x160
> [<ffffffff81121576>] vfs_write+0xb6/0x190
> [<ffffffff8103175d>] ? do_page_fault+0x15d/0x320
> [<ffffffff811218ac>] sys_write+0x4c/0x80
> [<ffffffff81009f02>] system_call_fastpath+0x16/0x1b
>---[ end trace 939cd7811bc2accf ]---
>

Hmm, so two processes enter store_scaling_governor() at
the same time. One takes mutex_lock(&dbs_mutex) while the
other blocks on it; when the first one leaves via
mutex_unlock(&dbs_mutex), the other one enters and repeats
the same sysfs_create_group() call, hence the duplicate
filename warning.

Yeah, makes sense, but I am still not sure if we could
reuse this cpufreq_governor_mutex...
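
To make the interleaving concrete, here is a rough userspace sketch in
Python (not kernel code, and not what the patch does). Only dbs_mutex and
the "ondemand" directory name come from the trace above; Policy,
create_group, governor_start_unguarded, governor_start_guarded and
run_two_writers are made-up names, and the early-return check in the
guarded variant is just an assumed way to avoid the second creation.

```python
import threading

# Plays the role of dbs_mutex from the trace; everything else is illustrative.
dbs_mutex = threading.Lock()


class Policy:
    """Hypothetical stand-in for one cpufreq policy and its sysfs directory."""

    def __init__(self):
        self.sysfs_entries = set()
        self.duplicate_warnings = 0
        self.governor_started = False


def create_group(policy, name):
    """Mimic sysfs refusing to create a name that already exists."""
    if name in policy.sysfs_entries:
        policy.duplicate_warnings += 1  # "cannot create duplicate filename"
        return False
    policy.sysfs_entries.add(name)
    return True


def governor_start_unguarded(policy):
    # The lock only serializes the callers: the second one waits, then
    # runs the very same body and asks for the same directory again.
    with dbs_mutex:
        create_group(policy, "ondemand")


def governor_start_guarded(policy):
    # Assumed fix for illustration: re-check the state under the lock
    # and return early if another writer already started the governor.
    with dbs_mutex:
        if policy.governor_started:
            return
        create_group(policy, "ondemand")
        policy.governor_started = True


def run_two_writers(start):
    """Run two concurrent 'writers' against one policy and return it."""
    policy = Policy()
    writers = [threading.Thread(target=start, args=(policy,)) for _ in range(2)]
    for w in writers:
        w.start()
    for w in writers:
        w.join()
    return policy


if __name__ == "__main__":
    print(run_two_writers(governor_start_unguarded).duplicate_warnings)  # 1
    print(run_two_writers(governor_start_guarded).duplicate_warnings)    # 0
```

With two writers the unguarded variant always records exactly one
duplicate, whichever thread wins the lock, because the mutex orders the
two calls but does not stop the second one from running. Whether a check
like the guarded one belongs under dbs_mutex or under
cpufreq_governor_mutex is the part I am unsure about.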

Thanks.