Re: Pathnames in /proc -- choose a quoting convention

Peter Swain (swine@softway.com.au)
Tue, 23 Jun 1998 18:09:18 +1000 (EST)


H.Peter Anvin raised an interesting point wrt disambiguating filenames
embedded in /proc/*...

hpa> There are a number of /proc files which contain filenames, including
hpa> /proc/mounts. With the new dentry stuff these can be made
hpa> kernel-enforced reliable, which is a Good Thing[TM]. However, these
hpa> are typically delimitered by whitespace or newlines, which isn't good,

that's a v.brave change to make, it will niggle many paranoid people.
but it's a GoodThing [TM].
and almost nothing will notice!

currently /proc/*/environ (and others?) use the \0-terminated convention, and i like it.

i can see this exploding into flames, so i'll piss on the fire before it starts:

we could ignore the problem, leaving /proc/* cosmetic but unreliable, or
we could quote things consistently in (exactly one of) a number of ways:
- space\040in\040name
- space\ in\ name
- 'space in name'
- space%20in%20name
- {space in name}
- space in name\0
each of these is friendly to some class of userland tools
and a pain to others, and (except for the last) to the kernel.
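
to give a feel for the "one-liner" each quoted form needs, here is a minimal
sketch of a decoder for the backslash-octal form. it assumes the escape is a
backslash followed by exactly three octal digits (so a space is \040);
unescape_octal() is a made-up name for illustration, not an existing kernel
or libc function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Decode backslash-octal escapes (e.g. "\040" for a space) in place.
 * Hypothetical helper: assumes exactly three octal digits per escape,
 * first digit 0-3 so the value fits in one byte. Anything else is
 * copied through unchanged. Returns the decoded length.
 */
static size_t unescape_octal(char *s)
{
    char *out = s;
    const char *in = s;

    assert(s != NULL);
    while (*in) {
        /* && short-circuits, so we never read past the terminating NUL */
        if (in[0] == '\\' &&
            in[1] >= '0' && in[1] <= '3' &&
            in[2] >= '0' && in[2] <= '7' &&
            in[3] >= '0' && in[3] <= '7') {
            *out++ = (char)(((in[1] - '0') << 6) |
                            ((in[2] - '0') << 3) |
                             (in[3] - '0'));
            in += 4;
        } else {
            *out++ = *in++;
        }
    }
    *out = '\0';
    return (size_t)(out - s);
}
```

note the catch: a name containing \000 decodes to an embedded NUL, and a
literal backslash in a name has to be escaped too, which is exactly the sort
of obscure case that goes wrong.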

adopting any of them will require a set of tools for parsing the quoted names,
rendering them palatable to bash,perl,tcl,emacs,C,.... [(N-1) little tools]
Yeah, it's a one-liner in most things, but that's the kind of quich hack that
goes subtly wrong in .03 of cases (like my spelling above).

The format all toolsets have in common (internally, if not user-visible),
and have all confronted before, is C's null-terminated strings,
as hpa suggested.

it makes handling some things more difficult, but it is the easiest to handle
in the common extension language of all open-source tools (C).

If sed/grep/awk/perl/emacs/tcl/favourite_tool gets confused,
just deconfuse them. use the src, luke.
most (all?) of the above have been fixed.

So when, upon a rainy midnight, you discover your favourite tool_or_script
can't hack the new format, there are 2 classes of error to be made while
adding the capability to the underlying null-shy tool:
- C fragment failing to parse the filename field into null-term'd string;
- C fragment failing to wrap it up as a favourite_toolset-friendly string.
the first is trivial, so is broken never-or-always, not just on obscure cases.
the 2nd is a standard idiom in any toolset.
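
a rough sketch of both fragments, assuming the /proc file has already been
read into a buffer of \0-terminated fields. next_field() and sh_quote() are
hypothetical names, and sh single-quoting is just one example of a
"toolset-friendly" wrapping; substitute whatever your tool wants.

```c
#include <assert.h>
#include <string.h>

/*
 * Fragment 1: return the next \0-terminated field in buf[0..len) and
 * advance *pos past its terminator. Returns NULL at end of buffer, or
 * if the trailing data has no terminator (treated as malformed).
 */
static const char *next_field(const char *buf, size_t len, size_t *pos)
{
    const char *start, *end;

    if (*pos >= len)
        return NULL;
    start = buf + *pos;
    end = memchr(start, '\0', len - *pos);
    if (end == NULL)
        return NULL;
    *pos = (size_t)(end - buf) + 1;
    return start;
}

/*
 * Fragment 2: wrap name as a single-quoted sh word. An embedded '
 * becomes '\'' (close quote, escaped quote, reopen quote).
 * Returns the quoted length, or 0 if out is too small.
 */
static size_t sh_quote(const char *name, char *out, size_t outsz)
{
    size_t n = 0;
    const char *p;

    assert(name != NULL && out != NULL);
    /* worst case: every byte expands to 4, plus two quotes and a NUL */
    if (outsz < 4 * strlen(name) + 3)
        return 0;
    out[n++] = '\'';
    for (p = name; *p; p++) {
        if (*p == '\'') {
            memcpy(out + n, "'\\''", 4);
            n += 4;
        } else {
            out[n++] = *p;
        }
    }
    out[n++] = '\'';
    out[n] = '\0';
    return n;
}
```

the first fragment has no special cases to get wrong, since \0 cannot appear
inside a filename; all the per-tool fiddliness lives in the second.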

^..^ go for it -- use \0 as god intended!
(qp) just don't get carried away, using \0\0 as eof.

(dare i suggest that /proc/* examine reader's /proc/self/env/LC_LOCALE,
and return {} quoting for LC_LOCALE=tcl, and etc....
in a different universe, with a html_content_negotiation_filesys, maybe.
is it unix? no.
what's the best underlying convention to base such wild polymorphism on? \0
)
