Re: [PATCH] sysfs poll should keep the poll rule of normal regular file.

From: Neil Brown
Date: Wed Apr 08 2009 - 20:16:01 EST


On Wednesday April 8, kosaki.motohiro@xxxxxxxxxxxxxx wrote:
>
> Currently, the following test program never finishes.
>
> % ruby -e '
> Thread.new { sleep }
> File.read("/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies")
> '
>
> strace exposes the reason.
>
> ...
> open("/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies", O_RDONLY|O_LARGEFILE) = 3
> ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbf9fa6b8) = -1 ENOTTY (Inappropriate ioctl for device)
> fstat64(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
> _llseek(3, 0, [0], SEEK_CUR) = 0
> select(4, [3], NULL, NULL, NULL) = 1 (in [3])
> read(3, "1400000 1300000 1200000 1100000 1"..., 4096) = 62
> select(4, [3], NULL, NULL, NULL
>
>
> This is because the Ruby (the scripting language) VM assumes that a select()
> system call on a regular file never blocks. (POSIX guarantees it.)
> But sysfs_poll() doesn't follow this rule, although a sysfs file can always be read and written.

It would be nice to include a reference to where POSIX (or SUS)
guarantees it - though I suspect you are right.

I have one piece of code that this would break, but it isn't released
yet and it is never too late to fix things.

However it should be pointed out that /proc/mounts has exactly the
same problems (/proc/mdstat doesn't - I guess I was lucky enough to
get that one right). So we should "fix" /proc/mounts
(fs/proc/base.c:mounts_poll) at the same time. Assuming that won't
break anything.

Al: do you have an opinion about changing mounts_poll to always
report 'readable' to poll? What would break?
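
As a rough illustration of what should keep working: a watcher that
asks poll() only for POLLPRI (POLLERR is always reported, requested or
not) would not start spinning if the file also became permanently
'readable'. This is a sketch only. wait_for_change() is a hypothetical
name, and it assumes the change is signalled with POLLERR and/or
POLLPRI, as in the sysfs patch quoted below. Code that requests POLLIN
to detect changes is what would break.

```c
#include <poll.h>

/*
 * Wait until the kernel flags a change on fd.
 * Returns 1 on POLLPRI/POLLERR, 0 on timeout, -1 if poll() failed.
 */
int wait_for_change(int fd, int timeout_ms)
{
	struct pollfd pfd;
	int n;

	pfd.fd = fd;
	pfd.events = POLLPRI;	/* deliberately not POLLIN */
	pfd.revents = 0;

	n = poll(&pfd, 1, timeout_ms);
	if (n < 0)
		return -1;

	return (pfd.revents & (POLLPRI | POLLERR)) ? 1 : 0;
}
```

On a plain regular file this times out and returns 0, since only
read/write readiness is ever reported there.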

NeilBrown

>
> This patch restores proper poll behavior to sysfs.
> Applications that poll /sys/block/md*/md/sync_action, and other applications sensitive to
> sysfs updates, can still use POLLERR and POLLPRI.
>
>
>
> Cc: Neil Brown <neilb@xxxxxxx>
> Cc: Greg Kroah-Hartman <gregkh@xxxxxxx>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx>
> --
> fs/sysfs/file.c | 4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
> index 289c43a..4a302f8 100644
> --- a/fs/sysfs/file.c
> +++ b/fs/sysfs/file.c
> @@ -446,11 +446,11 @@ static unsigned int sysfs_poll(struct file *filp, poll_table *wait)
>  	if (buffer->event != atomic_read(&od->event))
>  		goto trigger;
>
> -	return 0;
> +	return DEFAULT_POLLMASK;
>
>   trigger:
>  	buffer->needs_read_fill = 1;
> -	return POLLERR|POLLPRI;
> +	return DEFAULT_POLLMASK|POLLERR|POLLPRI;
>  }
>
>  void sysfs_notify_dirent(struct sysfs_dirent *sd)
>