/proc field delimiter, etc

Clayton Weaver (cgweav@eskimo.com)
Fri, 30 Oct 1998 14:18:38 -0800 (PST)


I would suggest a file called "/proc/binary/meta". The first byte is 0
for little-endian, 0xff for big-endian, which cannot be confused even
if you remote mount /proc on a machine with a different host byte order.
The next three bytes are the version number without "." characters,
starting at 100 perhaps. So the second version is 101, the third is 102,
i.e. "implied dots".
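
A user-space reader for such a header could be about this small. This is
only a sketch: it assumes the three version bytes are ASCII digits (the
text above does not pin down the encoding), and the names proc_meta and
parse_proc_meta are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical decoded form of the 4-byte /proc/binary/meta header. */
struct proc_meta {
    int big_endian;  /* 1 if the first byte was 0xff, 0 if it was 0x00 */
    int version;     /* 100, 101, 102, ... ("implied dots") */
};

/*
 * Decode the header in buf[0..3].
 * Assumption: the version is stored as three ASCII digits.
 * Returns 0 on success, -1 if the header is short or malformed.
 */
int parse_proc_meta(const unsigned char *buf, size_t len,
                    struct proc_meta *out)
{
    size_t i;

    if (len < 4)
        return -1;

    if (buf[0] == 0x00)
        out->big_endian = 0;
    else if (buf[0] == 0xff)
        out->big_endian = 1;
    else
        return -1;              /* neither of the two legal flag values */

    out->version = 0;
    for (i = 1; i <= 3; i++) {
        if (buf[i] < '0' || buf[i] > '9')
            return -1;
        out->version = out->version * 10 + (buf[i] - '0');
    }
    return 0;
}
```

Since 0x00 and 0xff read the same in either byte order, the flag can be
checked before anything else in the file is interpreted.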

Could the schema just hold byte offsets to the binary fields?

meminfo {
# memtot memused memfree memshared buffers cache
# swaptot swapused swapfree
0 %lu 4 %lu 8 %lu 12 %lu 16 %lu 20 %lu 24 %lu 28 %lu 32 %lu
# while(*c != (char) '}') ...
}
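
To show that a schema line of this shape is cheap to read, here is a
rough sketch that splits it into (offset, format) pairs with sscanf().
The struct and function names are hypothetical, and it assumes each
format token is short (at most 7 characters) and starts with '%'.

```c
#include <stdio.h>
#include <stddef.h>

/* One (offset, format) pair from a schema line such as "0 %lu 4 %lu". */
struct schema_field {
    unsigned long offset;  /* byte offset of the field in the binary file */
    char fmt[8];           /* printf-style conversion, e.g. "%lu" or "%s" */
};

/*
 * Parse up to `max` pairs from `line` and return how many were found.
 * Parsing stops at the first token that does not fit the pattern,
 * which also covers '#' comment lines and the closing '}'.
 */
size_t parse_schema_line(const char *line, struct schema_field *out,
                         size_t max)
{
    size_t n = 0;
    int used = 0;

    while (n < max &&
           sscanf(line, " %lu %7s%n",
                  &out[n].offset, out[n].fmt, &used) == 2) {
        if (out[n].fmt[0] != '%')
            break;              /* second token is not a conversion */
        line += used;           /* continue after the pair just read */
        n++;
    }
    return n;
}
```

Fed the meminfo line above, this yields nine pairs with offsets 0
through 32 in steps of 4.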

For a binary layout like the above one could just hard code the offsets
in user-space software, which would already have to know the field
order for labelling a table, but what if the field value is something
like the contents of /proc/cmdline?

cmdline {
# "boot_image options" root_device_num
0 %s (off_t to root device number) %lu # device numbers probably
# aren't %lu, but you get the idea
}

The offset to the device number would have to be calculated at read
time; one couldn't hard-code it in the schema and distribute it to other
developers.
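
As an illustration of that calculation, the sketch below finds where the
field after a string begins. It assumes the string is NUL-terminated and
that the next field follows it immediately with no padding; neither is
settled above, and offset_after_string is a made-up name.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the offset of the field that follows the NUL-terminated
 * string starting at str_off, or (size_t)-1 if str_off is out of
 * range or no terminator is found inside the buffer.
 * Assumption: no padding between the string and the next field.
 */
size_t offset_after_string(const unsigned char *buf, size_t len,
                           size_t str_off)
{
    const unsigned char *nul;

    if (str_off >= len)
        return (size_t)-1;

    nul = memchr(buf + str_off, '\0', len - str_off);
    if (nul == NULL)
        return (size_t)-1;

    return (size_t)(nul - buf) + 1;   /* first byte after the NUL */
}
```

The cost is a scan of the string on every read, which is exactly what a
fixed-offset schema avoids.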

One can put the off_t to the next field in the 4 bytes before each field
in the binary files (the last one gets a zero off_t), so that the length
of the current field can be calculated from the current position
(SEEK_CUR) and that offset, allowing for the 4 bytes the offset itself
occupies. If we know a file never has variable length octet sets in it,
no problem, we know where the offsets are a priori. Otherwise, we use
the off_t delimiters.
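
Here is one way the off_t delimiters could be walked in user space. It
is a sketch of one possible reading, not a settled format: it assumes
each 4-byte prefix holds the absolute offset of the next field's prefix,
that a zero prefix means the last field runs to the end of the buffer,
and that the byte order comes from the meta flag. All names are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define PREFIX_SIZE 4   /* bytes used by each offset prefix */

/* Decode a 4-byte unsigned value in the given byte order. */
static uint32_t get_u32(const unsigned char *p, int big_endian)
{
    if (big_endian)
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}

/*
 * Examine the field whose prefix starts at `pos`.
 * On success stores where the field data starts, how long it is, and
 * the position of the next prefix (0 if this was the last field).
 * Returns 0 on success, -1 if the prefix is truncated or inconsistent.
 */
int next_field(const unsigned char *buf, size_t len, size_t pos,
               int big_endian, size_t *data_off, size_t *data_len,
               size_t *next_pos)
{
    uint32_t next;

    if (pos > len || len - pos < PREFIX_SIZE)
        return -1;

    next = get_u32(buf + pos, big_endian);
    *data_off = pos + PREFIX_SIZE;

    if (next == 0) {
        *data_len = len - *data_off;        /* last field: runs to the end */
    } else {
        if (next < *data_off || next > len)
            return -1;                      /* offset points backwards or past EOF */
        *data_len = next - *data_off;
    }
    *next_pos = next;
    return 0;
}
```

A reader would start at position 0 and keep calling next_field() with
the returned next_pos until it comes back as zero.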

--

ASN.1 descriptions are clean for the user, but kinda heavyweight for internal kernel code to have to parse. Here's an example: the MIB-II schema for an IP address table for SNMP v2. The actual layout of the object is like this:

20 ipAddrTable
   1 ipAddrEntry                # ip address entry
      1 ipAdEntAddr             # ip address entry address (the actual
                                # ip address)
      2 ipAdEntIfIndex          # ip address entry interface index
                                # (interface number, like eth0)
      3 ipAdEntNetMask          # net mask
      4 ipAdEntBcastAddr        # broadcast address
      5 ipAdEntReasmMaxSize     # max size of packet that can be
                                # reassembled from fragments on this
                                # system
   2 ipAddrEntry
   ...

Finding the ASN.1 data type definitions for all of these fields is something of a paper chase through the RFCs, of course, but they are fairly simple once you find them, something like

ipAddrTable OBJECT-TYPE
    SYNTAX  SEQUENCE OF IpAddrEntry
    ACCESS  not-accessible  # can't retrieve table, only values in it (?)
    STATUS  mandatory
    DESCRIPTION "..."
    ::= {ip 20}             # the "20" indexes into a "naming tree" of
                            # object types

#object type declaration for MIB

ipAddrEntry OBJECT-TYPE
    SYNTAX  IpAddrEntry     # note the I case is different; object declaration
    ACCESS  not-accessible  # vs object definition
    STATUS  mandatory
    DESCRIPTION "..."
    INDEX   {ipAdEntAddr}
    ::= {ipAddrTable 1}

# the template for the above object type

IpAddrEntry ::= SEQUENCE {
    ipAdEntAddr          IpAddress,
    ipAdEntIfIndex       INTEGER,
    ipAdEntNetMask       IpAddress,
    ipAdEntBcastAddr     INTEGER,
    ipAdEntReasmMaxSize  INTEGER(0..65535)
}

And so on until all of the components are defined down to the most primitive data types. ASN.1 has counter and date data types, too.

These objects are all supposed to be in network byte order for SNMP, which is perhaps inappropriate for /proc: user software could convert from host byte order into whatever byte order it wants, without burdening the kernel with the conversions on little-endian hardware.
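
To give an idea of how little the user-space side of that conversion
involves, here is a sketch that writes a host-order 32-bit value out in
network (big-endian) order. The name put_be32 is made up; an SNMP agent
reading /proc would presumably do something similar for each field.

```c
#include <stdint.h>

/*
 * Store a host-order 32-bit value into out[0..3] in network
 * (big-endian) byte order. It uses shifts rather than reinterpreting
 * memory, so it behaves the same on little- and big-endian hosts.
 */
void put_be32(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}
```

On POSIX systems htonl() from <arpa/inet.h> does the same job for a
value held in a register.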

I'm all for organization over ad hoc file formats, but minimizing kernel bloat and /proc complexity is certainly an equally valid issue.

Regards, Clayton Weaver <mailto:cgweav@eskimo.com> (Seattle)

PS I'll see how it looks in PDB and netCDF, but those are "portable binary" APIs, not disk formats. I have a copy of the Clog web page but haven't been able to find it again online, so I'm wondering if it is legally encumbered.
