Re: [oss-security] [patch] proc.5: tell how to parse /proc/*/stat correctly

From: Alejandro Colomar
Date: Wed Dec 28 2022 - 14:58:04 EST


Hi all,

On 12/28/22 20:24, Shawn Webb wrote:
On Wed, Dec 28, 2022 at 01:02:35PM -0500, Demi Marie Obenour wrote:
On Wed, Dec 28, 2022 at 12:25:17PM -0500, Shawn Webb wrote:
On Wed, Dec 28, 2022 at 11:47:25AM -0500, Demi Marie Obenour wrote:
On Wed, Dec 28, 2022 at 10:24:58AM -0500, Shawn Webb wrote:
On Tue, Dec 27, 2022 at 04:44:49PM -0800, Lyndon Nerenberg (VE7TFX/VE6BBM) wrote:
Dominique Martinet writes:

But, really, I just don't see how this can practically be said to be parsable...

In its current form it never will be. The solution is to place
this variable-length field last. Then you can "cut -d ' ' -f 51-"
to get the command+args part (assuming I counted all those fields
correctly ...)

Of course, this breaks backwards compatability.

It would also break forwards compatibility in the case new fields
needed to be added.

The only solution would be a libxo-style feature wherein a
machine-parseable format is exposed by virtue of a file extension.

Examples:

1. /proc/pid/stats.json
2. /proc/pid/stats.xml
3. /proc/pid/stats.yaml_shouldnt_be_a_thing

A binary format would be even better. No risk of ambiguity.

I think the argument I'm trying to make is to be flexible in
implementation, allowing for future needs and wants--that is "future
proofing".

Linux should not have an XML, JSON, or YAML serializer. Linux already
does way too much; let’s not add one more thing to the list.

Somewhat agreed. I think formats like JSON provide a good balance
between machine parseable and human readable.
a
As I described earlier, though, when it comes to concepts like procfs
and sysfs, I have a bias towards abandoning them in favor of sysctl.
If sysctl nodes were to be used, no new serialization formats would
need to be implemented--and developers would also use a safter method
of system and process inspection and manipulation.


Just a comment as someone who is reading without much understanding of the contents of /prod/pid/stat:

If organization of the data in the file is a problem, and the format starts to matter, maybe it's a hint that there are too many different contents, and could be split into different files, each one with its own formatting rules. I'll suggest that maybe a set of files, maybe contained in a common directory stats.d, is what you're looking for?

Binary format is not of my preference, since most user-space tools work with the standard interface, that is, text.

Cheers,

Alex

--
<http://www.alejandro-colomar.es/>

Attachment: OpenPGP_signature
Description: OpenPGP digital signature