Re: [PATCH][RFC] fix BSD accounting (w/ long-term perspective ;-)

From: Ragnar Kjørstad
Date: Fri Mar 12 2004 - 13:01:44 EST


[ CCing Sam Watters and Linda Walsh at SGI for feedback about using
pid/ppid/pgid/sid/paggid/whatever to track jobs in accounting-logs ]

On Tue, Mar 09, 2004 at 12:03:27AM +0100, Tim Schmielau wrote:
> BSD accounting currently reports only the lower 16 bits of 32 bit uids,
> reports times in units of 1/HZ seconds while it announces a unit of
> 1/100 seconds in the header file (which hurts especially on i386 where
> HZ changed since 2.4) and cannot report times longer that 49 days.

Red Hat uses a patch that adds 32 bit uids and 32 bit gids at the end of
struct acct, reducing the size of the padding to 2 bytes. If a simular
extension is added to the official kernel it would be nice if they were
compatible.


There is also a few other problems with the current implementation that
we would like to see fixed if there is going to be a format-change
anyway:
- Architecture indepencence. It would be nice if the binary format was
independent of the architecture so files can be processed on other
hosts. This would involve defining padding and byte-ordering, as well
as either stop using 1/HZ units in the file or add AHZ to the record.

- Better accuracy. Specificly the records don't include an exact
termination-time for the process. One can estimate termination-time
based on creation time and elapsed time but since elapsed time is
truncated into comp_t it is not accurate for long-lived processes.

- More information. Accounting is typically used to keep track of
what resources different jobs have used, where the definition of
"job" may differ from site to site. Quite often "job" is a group
of related processes, so including pid/ppid would allow
post-processing to rebuild the process-hierarchy and deduce what
processes were part of the same job. Even better, if usespace
could notify the kernel what job a process belongs to, it would be
easier to track jobs across multiple nodes in a cluster and so on.
Maybe adding pid, ppid and jobid to the format and then use the sid
as the jobid for now? And then it can be replaced by a proper jobid
from e.g. PAGG later?

> My proposed solution: keep backwards compatibility with 2.4 by using
> a unit of 1/USER_HZ, announce that correctly in the header file,
> and simply admit that BSD accounting on i386 is broken in 2.6.0-2.6.4.

Sounds good.

> - store a version number in the last byte of struct acct, which allows
> for a smooth transition to a new binary format when 2.7 comes out.
> For 2.7, extend uid/gid fields to 32 bit, report times in terms
> of AHZ=100 on all platforms (thus allowing to report times up to 1988
> days), and remove the compatibility stuff from the kernel.

It may be useful to have the version-information in the beginning of the
struct to allow future versions to use a struct of a different size.
There should be some bits available in the flag-field, maybe that's
enough?


--
Ragnar Kjørstad
Software Engineer
Scali - http://www.scali.com
High Performance Clustering
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/