Re: [PATCH] [RFC] List per-process file descriptor consumption when hitting file-max

From: Alexander Shishkin
Date: Sun Oct 11 2009 - 08:18:04 EST


2009/7/30 <Valdis.Kletnieks@xxxxxx>:
> On Wed, 29 Jul 2009 19:17:00 +0300, Alexander Shishkin said:
>> Is there anything dramatically wrong with this one, or could someone please review this?
>
>
>> +		for_each_process(p) {
>> +			files = get_files_struct(p);
>> +			if (!files)
>> +				continue;
>> +
>> +			spin_lock(&files->file_lock);
>> +			fdt = files_fdtable(files);
>> +
>> +			/* we have to actually *count* the fds */
>> +			for (count = i = 0; i < fdt->max_fds; i++)
>> +				count += !!fcheck_files(files, i);
>> +
>> +			printk(KERN_INFO "=> %s [%d]: %d\n", p->comm,
>> +					p->pid, count);
>
> 1) Splatting out 'count' without a hint of what it is isn't very user friendly.
> Consider something like "=> %s[%d]: open=%d\n" instead, or add a second line
> to the 'VFS: file-max' printk to provide a header.
Fair enough.
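Something along these lines, maybe (untested sketch that folds both of your
suggestions in, reusing the existing "VFS: file-max" message as the header):

	printk(KERN_INFO "VFS: file-max limit %d reached, per-process fd usage:\n",
	       get_max_files());
	...
	printk(KERN_INFO "=> %s [%d]: open=%d\n", p->comm, p->pid, count);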

> 2) What context does this run in, and what locks/scheduling considerations
> are there? On a large system with many processes running, this could conceivably
> wrap the logmsg buffer before syslog has a chance to get scheduled and read
> the stuff out.
That's a good point.

> 3) This can be used by a miscreant to spam the logs - consider a program
> that does open() until it hits the limit, then goes into a close()/open()
> loop to repeatedly bang up against the limit. Every 2 syscalls by the
> abuser could get them another 5,000+ lines in the log - an incredible
> amplification factor.
>
> Now, if you fixed it to only print out the top 10 offending processes, it would
> make it a lot more useful to the sysadmin, and a lot of those considerations go
> away, but it also makes the already N**2 behavior even more expensive...
That's a good idea. I think some kind of rate-limiting can be applied here too.
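Roughly what I mean by rate-limiting, as a sketch (the ratelimit state and the
helper name are mine, not anything in the tree; interval/burst picked arbitrarily):

	#include <linux/ratelimit.h>

	/* allow at most one per-process dump every 30 seconds */
	static DEFINE_RATELIMIT_STATE(fd_dump_rs, 30 * HZ, 1);

	static void report_fd_consumers(void)
	{
		if (!__ratelimit(&fd_dump_rs))
			return;

		/* the for_each_process() counting loop from the patch goes here */
	}

That way a close()/open() loop only buys the abuser one dump per interval
instead of one per failed open().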

> At that point, it would be good to report some CPU numbers by running an abusive
> program that repeatedly hit the limit, and be able to say "Even under full
> stress, it only used 15% of a CPU on a 2.4GHz Core2" or similar...
I'll see what I can do.
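For the numbers, I'm thinking of something like the following as the abusive
test program (userspace, not part of the patch; RLIMIT_NOFILE has to be raised
above fs/file-max, or several copies run in parallel, for the global limit to
actually be reached):

	#include <stdio.h>
	#include <errno.h>
	#include <fcntl.h>
	#include <unistd.h>

	int main(void)
	{
		int fd, last = -1;

		/* eat fds until open() starts failing */
		while ((fd = open("/dev/null", O_RDONLY)) >= 0)
			last = fd;

		if (errno != ENFILE && errno != EMFILE)
			perror("open");

		/*
		 * Free one slot and immediately grab it back, forever; even
		 * when the re-open fails, it still bangs against file-max.
		 */
		for (;;) {
			close(last);
			last = open("/dev/null", O_RDONLY);
		}
	}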
Thanks for your comments and ideas!

Regards,
--
Alex