[patch 5/6] statistics infrastructure - documentation

From: Martin Peschke
Date: Wed Dec 14 2005 - 11:47:35 EST


... remainder of patch 5/6

[patch 5/6] statistics infrastructure

This is the documentation for developers.

Signed-off-by: Martin Peschke <mp3@xxxxxxxxxx>
---

00-INDEX | 2
statistics.txt | 629 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 631 insertions(+)

diff -Nurp e/Documentation/00-INDEX f/Documentation/00-INDEX
--- e/Documentation/00-INDEX 2005-12-14 12:51:13.000000000 +0100
+++ f/Documentation/00-INDEX 2005-12-14 14:22:54.000000000 +0100
@@ -264,6 +264,8 @@ stable_kernel_rules.txt
- rules and procedures for the -stable kernel releases.
stallion.txt
- info on using the Stallion multiport serial driver.
+statistics.txt
+ - info on statistics infrastructure available for drivers and others
svga.txt
- short guide on selecting video modes at boot via VGA BIOS.
sx.txt
diff -Nurp e/Documentation/statistics.txt f/Documentation/statistics.txt
--- e/Documentation/statistics.txt 1970-01-01 01:00:00.000000000 +0100
+++ f/Documentation/statistics.txt 2005-12-14 14:22:54.000000000 +0100
@@ -0,0 +1,629 @@
+
+ Statistics infrastructure
+
+0. Which problems is it meant to solve?
+1. Concept
+2. Advantages of a generic approach
+3. Disadvantages of a generic approach
+4. Features
+5. User interface
+6. Programming interface
+7. Possible future enhancements / known bugs
+8. Contact
+
+
+
+
+ 0. Which problems is it meant to solve?
+
+This common code layer implements statistics in a device driver independent
+and architecture independent way.
+
+It targets kernel programmers (mostly device driver programmers) who think about
+providing statistics for their driver and devices, and who look for an
+alternative to the reinvent-the-wheel approach.
+
+This layer tries to minimize the effort imposed on programmers implementing
+statistics. It is not meant to support the proliferation of unnecessary
+statistics, though.
+
+It also tries to solve the problem of different user interfaces for the various
+device drivers implementing statistics.
+
+
+
+
+ 1. Concept
+
+ Overview
+
+The following figure tries to depict how the statistics infrastructure
+fits into the global picture, and how it interacts with both exploiting
+kernel code and users.
+
+     USER        :                        KERNEL
+                 :
+               user       statistics        programming
+             interface  infrastructure       interface       exploiter
+                 :   +------------------+        :      +-----------------+
+                 :   | process data and |        :      | collect and     |
+     "data"      :   | provide output   |     (X, Y)    | report data     |
+   <=================| to user          |<==============| as (X, Y) pairs |
+      file       :   |        ^         |        :      |                 |
+                 :   |        ^         |        :      |                 |
+                 :   |        ^         |        :      |                 |
+                 :   |        ^         |        :      | create/discard  |
+  "definition"   :   | display settings |        :      | statistics,     |
+   <================>| and accept       |<==============| provide default |
+      file       :   | changed settings |        :      | settings        |
+                 :   +------------------+        :      +-----------------+
+                 :                               :
+
+The actual semantics of the data that feeds a statistic are unimportant when
+it comes to data processing. All that matters is how the user wants the data
+to be presented (counters, histograms, and so on). That's something that can
+be dealt with by a generic layer without intervention by the device driver
+that is the source of the data.
+
+It's like a cow giving milk. Every time some milk has been gathered
+it is rushed to the dairy. It's not the cow's business to worry about
+the various products made of milk and how these products are packaged
+and delivered. Nor is it the business of sheep or goat, for that
+matter. It's a question of customer demand observed by the dairy
+whether milk is sold as cheese, yogurt, cream,... or simply milk.
+
+
+ The role of the statistics infrastructure
+
+It is the statistics infrastructure's task to accept or drop, accumulate,
+compute and store, as well as display statistics data according to the
+current settings.
+
+
+ The role of exploiters
+
+It is the exploiter's (e.g. device driver's) responsibility to feed the
+statistics infrastructure with sampled data for the statistics maintained by the
+statistics infrastructure on behalf of the exploiter.
+
+It would be nice of any exploiter to provide a default configuration for each
+statistic that most likely works best for general purpose use.
+
+
+ The role of users
+
+It is the user's freedom to configure how accumulation, computation and
+display of data are done, according to their needs.
+
+
+ How data is reported
+
+Exploiters report data bits in the form of (X, Y) with X being a numerical
+value for the main characteristic of the statistic, like a request size or
+request latency, and with Y being a qualifier for that characteristic,
+for example occurrence of a particular X-value.
+
+
+ How data is processed
+
+(X, Y) pairs can be processed in different ways by the statistics
+infrastructure, according to the current settings applicable to a
+particular statistic.
+
+For example, Y-parts can be summed up using a simple counter, which,
+in case of a (request size, occurrence)-statistic, would yield the
+total of requests observed.
+
+Or, a simple counter can be the sum of all X*Y products, which,
+in case of a (request size, occurrence)-statistic, would yield the
+total of bytes transferred.
+
+Or, the X-value could be used, sticking to the example above, to distinguish
+request sizes and, therewith, to determine the occurrence of certain request
+sizes. This can be accomplished by the histogram functions of the statistics
+infrastructure.
+
+There are more ways of data processing implemented.
+All these algorithms are documented as part of the programming interface.
+Please note that the statistics infrastructure does not care about the
+actual semantics of (X, Y), and that it just adheres to abstract rules
+describing what to do with (X, Y) pairs for certain settings.
+It is up to the user to interpret processed data, to add semantics
+back to it, and to choose settings and, therewith, data processing modes
+according to their needs.
+
+
+ More design points
+
+Statistics can be grouped within "interfaces" (debugfs entries), in order to
+reflect collections of related statistics of an entity, to allow for coherence
+across statistics of an entity when being read by users, and in order not to
+"pollute" debugfs.
+
+The statistics infrastructure provides multiple "views" on the same data or
+statistic respectively (called "definition", or settings in this document),
+or to be precise, multiple ways to process that data at run-time.
+The way data processing is done impacts the output format.
+The current implementation only allows one "view" / definition
+to be applied to a statistic at any point in time.
+
+    statistics infrastructure
+      |
+      +----- statistic interface
+      |        |
+      |        +----- statistic
+      |        |      (comprising definition and data)
+      |        |
+      |        +----- statistic
+      |        |
+      |        :
+      |
+      |
+      +----- statistic interface
+      |        |
+      |        +----- statistic
+      |        |
+      :        :
+
+
+Statistics are off by default. Starting and stopping is independent from
+resetting a statistic's data.
+
+
+
+
+ 2. Advantages of a generic approach
+
+Apart from a cleaner design approach, other advantages include:
+
+ Reduced development and test effort
+
+Once the generic layer was almost finished (using a test module),
+implementation and test of an assortment of statistics for the zfcp
+driver took me about 2 days. I spent more time evaluating which
+statistics make sense and which don't than coding the device driver
+extension.
+
+This also makes it cheap to put together private patches for
+performance debugging purposes, which can be thrown away once the
+performance problem is understood.
+
+By not putting the statistics code into the device driver,
+I separated about 90 percent of it out into a reusable layer, which
+has seen some testing that the next exploiter can rely on.
+
+
+ Common user interface
+
+There is no point in coming up with another user interface and
+new algorithms for data processing for every device driver that
+is going to implement statistics.
+
+Once a user has learned to read and adjust statistics of component A
+they will have little or no problems applying this knowledge to
+statistics of component B.
+
+Besides, I can imagine a single script that would suffice to generate
+input for the spreadsheet program of your choice from statistics output
+of any component.
+
+
+ Lots of features
+
+The generic layer provides a superset of features of the
+"reinventing the wheel"-statistics I am aware of. This includes many
+ways data processing can be done. It's all there for everybody.
+
+It also means plenty of flexibility from a user's perspective, because
+there are enough knobs available that make it unnecessary for me to
+respond to requests like "Hey, could you rebuild the zfcp module
+for me, because I need a finer resolution for that latency histogram?"
+
+
+
+
+ 3. Disadvantages of a generic approach
+
+Well, disadvantages center on performance. Some people might feel strongly
+about it, while others don't have any issues with it.
+I am listing here what has come to my mind, anyway.
+
+While the code has been written to be as concise as possible,
+some programmers might feel the statistics infrastructure is
+overkill for their purposes. For example, if they think they can do
+with a simple atomic_t, they might frown upon statistic_add().
+
+The statistics infrastructure doesn't make use of inline assembly, which might
+be faster than the architecture independent C code. Maybe some programmers would
+rather opt for a bit of inline assembly for their own specific purposes.
+
+Because of the flexibility as to the various knobs available for adjusting
+statistics (ranges, resolution or precision, mode of data processing, ...)
+it is up to the statistics infrastructure to take some decisions at run time
+(if this setting, then that data processing), which, therefore, can't be
+resolved and optimized by the compiler. Flexibility and functionality are
+achieved at the expense of slightly increased path length.
+
+
+
+
+ 4. Features
+
+ Modes of data processing
+
+The simplest type of statistic is called "value" that provides either the total
+of Y-values, or the total of X*Y products using a simple counter.
+
+Another type that requires little memory is called "range" and comprises a
+set of four values: number, minimum, average, and maximum. It is a nice fill
+level indicator.
+
+Then, there are two types of histograms, one where Y-values are summed up for
+each X-value observed (called "list" because it utilizes an adaptable
+list_head list), and another one where Y-values are added up for ranges or
+intervals of X-values (called "array" because it utilizes a fixed-size array).
+"array"-type statistics can have either a linear or logarithmic X-scale.
+
+The statistic type "history" enhances other basic types ("value" and "range")
+by the dimension time. It allows statistic results to be generated for
+successive periods of time, and it provides a history of these results. For
+example, transfer rate over time, like bytes/second, can be shown this way for
+a (request size, occurrence)-like data feed.
+
+Finally, there is a type called "raw" which simply stores timestamped (X, Y)
+pairs in a ring-buffer.
+
+Details on each type and the associated algorithms can be found in the
+kernel-doc-style comments of the programming interface.
+
+The statistics infrastructure has been designed to allow for easy addition of
+more types.
+
+
+ Other things that can be adjusted
+
+Ranges of acceptable X-values of (X, Y) pairs can be adjusted to the user's
+needs. For example, a certain range of latencies could be of interest for a
+(request latency, occurrence)-like data source.
+
+Data can be discarded by users by resetting the statistic.
+
+Data gathering can be turned on and off for each statistic individually, or
+in an atomic fashion for all statistics attached to an interface.
+
+
+
+
+ 5. User interface
+
+ Locating statistics
+
+The statistics infrastructure's user interface is in the
+/sys/kernel/debug/statistics directory, assuming debugfs has been mounted at
+/sys/kernel/debug. The "statistics" directory holds interface subdirectories
+created by exploiters, for example:
+
+ drwxr-xr-x 2 root root 0 Jul 28 02:16 zfcp-0.0.50d4
+
+An interface subdirectory contains two files, a data and a definition file:
+
+ -r-------- 1 root root 0 Jul 28 02:16 data
+ -rw------- 1 root root 0 Jul 28 02:16 definition
+
+The data file holds the statistic data collected in ASCII format. Each
+line contains one entry of a statistic, tagged with the statistic name,
+for example:
+
+ request_sizes_scsi_write 0x1000 34
+
+You can influence the output in the data file by setting attributes in
+the definition file.
+
+
+ The "definition" file
+
+The statistics infrastructure processes reported data according to the
+settings in the definition file, particularly the type attribute. You
+can change some statistic attributes and thereby change how data is
+processed.
+
+The definition file contains the name, type, mode and other definitions
+for the different statistics for which data are collected. For example,
+an entry for a statistic in a definition file might look like this:
+
+ # cat definition
+ name=util_qdio_outb on=0 type=history range_min=0 range_max=127
+ entries_max=128 mode=range period=1000000 hits_out_of_range=0
+ data=[2417020.832398] started=[2417044.149675] stopped=[2417020.832398]
+ units=slots-occupied/incidents
+
+Some of the attributes shown are common to all statistics, others only apply
+to a specific statistic type or mode of data processing respectively. Many
+attributes can be changed by users, others are read-only. All timestamps
+are in the style of printk-timestamps.
+
+
+ Common statistic attributes
+
+  Attribute          Changeable  Comment
+
+  name               No          The device driver provides the name that
+                                 defines a statistic.
+
+  units              No          Units defines what the device driver reports
+                                 as (X, Y) pair.
+
+  type               Yes         The attribute determines the way sampled data
+                                 is processed and displayed. Valid values are:
+                                 value, range, array, list, history, raw
+
+  data               Yes         The age of sampled data, that is, the time
+                                 since last reset.
+
+  on                 Yes         The on or off state. Valid assignments are
+                                 0 (off) and 1 (on).
+
+  started            No          The last time the statistic was started.
+                                 Depends on the on attribute.
+
+  stopped            No          The last time the statistic was stopped.
+                                 Depends on the on attribute.
+
+  range_min          Yes         The minimum value of the range you are
+                                 interested in.
+
+  range_max          Yes         The maximum value of the range you are
+                                 interested in.
+
+  hits_out_of_range  No          Values smaller than range_min or larger than
+                                 range_max are counted here.
+
+
+ Attributes specific to particular statistic types
+
+  Attribute        Specific to   Changeable  Comment
+
+  entries_max      type=history  Yes         Maximum number of entries in a
+                   type=list                 ring buffer or list.
+                   type=raw
+
+  hits_missed      type=list     No          Number of (X, Y) pairs missed
+                                             due to a list-size limit
+                                             imposed by entries_max.
+
+  scale            type=array    Yes         Valid values are:
+                                             lin for linear scale,
+                                             log2 for logarithmic scale.
+
+  base_interval    type=array    Yes         Base interval used for X-scale.
+
+  mode=increments  type=history  Yes         Mode determines that only the
+                   type=value    Yes         total of Y-values is to be
+                                             calculated.
+
+  mode=products    type=history  Yes         Mode determines that the total
+                   type=value    Yes         of X*Y products is to be
+                                             calculated.
+
+  mode=range       type=history  Yes         Mode determines that a set of
+                                             (number, min, average, max) is
+                                             to be calculated for each
+                                             period shown in the history
+                                             ring-buffer.
+
+  period           type=history  Yes         Determines the time to elapse
+                                             before calculation is started
+                                             over.
+
+
+ Changing attributes through the "definition" file
+
+A statistic can be reconfigured by writing changed attributes to the
+definition-file:
+
+ echo name=foobar type=list entries_max=256 > definition
+
+You can write the entire line describing a statistic, including read-only
+attributes (which are ignored by the statistics infrastructure, as is any
+other junk). This simplifies two procedures: (copy, paste to command line,
+modify command line, echo attributes from command line into definition file)
+and (cat to file, modify file content, cat file back into definition file).
+
+In many cases, there is no need to write all attributes of a statistic back.
+Many attributes can be changed by themselves or when combined with other
+independent attributes:
+
+ echo name=foo range_min=3 > definition
+
+ echo name=foo range_max=17 > definition
+
+ echo name=bar on=1 > definition
+
+Only if you change the type of a statistic are you required to provide all
+type-specific attributes, because the type-specific attributes of the old type
+do not apply anymore:
+
+ echo name=foobar type=history mode=range period=1000000 entries_max=1024 \
+ > definition
+
+Naturally, changes of the type of a statistic or of type-specific attributes
+cause all data to be discarded and calculations to be started over.
+
+Some operations can be done in an atomic fashion for all statistics grouped
+within the scope of an interface. Simply omit the name= attribute:
+
+ echo on=1 > definition
+
+ echo on=0 > definition
+
+ echo data=reset > definition
+
+
+ Reading statistic output from the "data" file
+
+The "data" file contains the output of all statistics available for a
+particular interface. This is an ASCII file. Depending on the type of a
+statistic, the output for a statistic consists of a single line or a bunch
+of lines. Each line delivers one value or one result of a statistic and
+consists of several strings separated by spaces. The beginning of each line
+is tagged with the name of the statistic the line belongs to. The rest of
+a line is statistic-type specific. The content of a "data" file might look
+like this:
+
+ foo 0x1000 4
+ foo 0x2000 1
+ foo 0x5000 2
+ bar 961 1 42.000 128
+
+
+ Output formats of different statistic types
+
+  Statistic Type     Output Format                         Number of Lines
+
+  value              <name> <total>                        1
+
+  range              <name> <total> <min> <avg> <max>      1
+
+  list               <name> <Xn> <total for Xn>            <= entries_max
+                     ...
+
+  array              <name> "<="<Xn> <total for interval>  number of
+                     ...                                   intervals as
+                     <name> ">"<Xm> <total for interval>   determined by
+                                                           base_interval,
+                                                           scale,
+                                                           range_min,
+                                                           range_max
+
+  history            <name> <time-stamp> <total>           <= entries_max
+  (mode=increments,  ...
+  mode=products)
+
+  history            <name> <time-stamp> <total> <min> <avg> <max>
+  (mode=range)       ...                                   <= entries_max
+
+  raw                <name> <time-stamp> <serial> <X> <Y>  <= entries_max
+                     ...
+
+ Sample Output
+
+This is output from per-adapter statistics of the zfcp SCSI HBA driver.
+It shows data for 5 statistics:
+
+ occurrence_qdio_outb_full 57
+ util_qdio_outb [3097301.211992] 0 0 0.000 0
+ ...
+ util_qdio_outb [3097394.211992] 865 1 1.052 5
+ util_qdio_outb [3097395.211992] 737 1 4.558 125
+ util_qdio_outb [3097396.211992] 396 1 11.765 77
+ util_qdio_outb [3097397.211992] 270 1 12.863 128
+ util_qdio_outb [3097398.211992] 765 1 7.271 26
+ util_qdio_outb [3097399.211992] 577 1 4.036 27
+ util_qdio_inb 118706 1 1.017 31
+ occurrence_low_mem_scsi 0
+ occurrence_erp 0
+
+In the order of appearance there are:
+
+ a value-type statistic providing a simple counter,
+ a history-of-ranges-type statistic providing a fill level indicator
+ over time (one entry per second),
+ a range-type statistic providing a fill level indicator,
+ two more value-type statistics.
+
+Here is more data for the same device driver (data per LUN):
+
+ request_sizes_scsi_write 0x64000 13
+ ...
+ request_sizes_scsi_write 0x67000 9
+ request_sizes_scsi_write 0x73000 8
+ ...
+ request_sizes_scsi_write 0x7b000 5
+ request_sizes_scsi_read 0x1000 44163
+ request_sizes_scsi_read 0x2000 9281
+ request_sizes_scsi_read 0x3000 4860
+ request_sizes_scsi_read 0x8000 4682
+ ...
+ request_sizes_scsi_read 0x4000 2976
+ request_sizes_scsi_nodata 0
+ latencies_scsi_write <=0 0
+ latencies_scsi_write <=1 0
+ latencies_scsi_write <=2 0
+ latencies_scsi_write <=4 174
+ latencies_scsi_write <=8 872
+ latencies_scsi_write <=16 2555
+ latencies_scsi_write <=32 2483
+ ...
+ latencies_scsi_write <=1024 1872
+ latencies_scsi_write >1024 1637
+ latencies_scsi_read <=0 0
+ latencies_scsi_read <=1 0
+ latencies_scsi_read <=2 0
+ latencies_scsi_read <=4 57265
+ latencies_scsi_read <=8 13610
+ latencies_scsi_read <=16 1082
+ latencies_scsi_read <=32 319
+ latencies_scsi_read <=64 63
+ ...
+ latencies_scsi_read >1024 0
+ latencies_scsi_nodata <=0 0
+ ...
+ latencies_scsi_nodata >1024 0
+ pending_scsi_write 17121 1 28.502 32
+ pending_scsi_read 72348 3 3.002 5
+
+In the order of appearance there are:
+
+ two lists of request sizes and their occurrences (write, read),
+ a counter for commands without data transfer,
+ three array-type histograms with logarithmic scales for request latencies
+ in milliseconds (write, read, no data),
+ two range-type statistics indicating how many SCSI commands are pending
+ concurrently (write, read).
+
+
+
+
+ 6. Programming interface
+
+The programming interface can be retrieved from the kernel-doc-style comments
+available for all interface functions. A programming example can be found in
+drivers/s390/scsi.
+
+Anyway, here is an introduction:
+
+First, call statistic_create_interface() in order to create an interface
+where several statistics for the concerned entity can be attached next by
+calling statistic_create(). It is recommended that you call
+statistic_define_*() for each statistic, as well, in order to provide some
+half-decent default settings for the newly created statistic.
+
+Usually, exploiters won't enable data gathering - by calling statistic_start() -
+unless data should be collected by default. In many cases, it might be better
+to leave this to users, who can start statistics any time they wish.
+
+Add statistic_add*() or statistic_inc*() calls where appropriate for
+reporting statistics data. Data to be reported through these functions has the
+form of (X, Y) as explained above.
+
+In order to remove statistics, call statistic_remove().
+The function statistic_interface_remove() cleans up an entire interface with
+all statistics still attached.
+
+
+
+
+ 7. Possible future enhancements / known bugs
+
+The current implementation is a prototype. It is good enough for the current
+zfcp requirements, though. For general use more work might be required.
+
+There are several possible enhancements and optimizations documented
+at the head of lib/statistic.c. There I keep track of bugs as well.
+
+
+
+
+ 8. Contact
+
+See MAINTAINERS file.
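
Illustration of the data processing described in sections 1 and 4 of
statistics.txt above. This is a minimal user-space sketch, not the code in
lib/statistic.c. It shows how a single stream of (X, Y) pairs could feed a
"value" counter (mode=increments and mode=products), a "range" set (number,
minimum, average, maximum) and an "array" histogram with a log2 scale. All
identifiers (struct sketch, sketch_add(), log2_bucket(), ...) are hypothetical.
The bucket layout "<=0, <=1, <=2, <=4, ... <=1024, >1024" is copied from the
sample output; counting Y into the range "number" and computing the average
as sum(X*Y)/number are assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical names throughout; this is not the lib/statistic.c code. */
#define LOG2_BOUNDED 12	/* buckets "<=0", "<=1", "<=2", ... "<=1024" */

struct sketch {
	uint64_t increments;	/* value, mode=increments: sum of Y */
	uint64_t products;	/* value, mode=products: sum of X*Y */
	uint64_t num, min, max;	/* range: number, minimum, maximum */
	uint64_t hits[LOG2_BOUNDED + 1];	/* array, log2; last is ">1024" */
};

/* Map X to its log2 bucket: 0 -> "<=0", 1 -> "<=1", 3 -> "<=4", ... */
static unsigned int log2_bucket(uint64_t x)
{
	unsigned int idx = 0;
	uint64_t bound = 0;	/* upper bound of bucket idx */

	while (x > bound && idx < LOG2_BOUNDED) {
		bound = bound ? bound * 2 : 1;
		idx++;
	}
	return idx;
}

/* Feed one (X, Y) pair into all three kinds of processing at once. */
static void sketch_add(struct sketch *s, uint64_t x, uint64_t y)
{
	unsigned int idx = log2_bucket(x);

	assert(idx <= LOG2_BOUNDED);
	if (!s->num || x < s->min)
		s->min = x;
	if (!s->num || x > s->max)
		s->max = x;
	s->num += y;		/* assumption: Y counts occurrences */
	s->increments += y;
	s->products += x * y;
	s->hits[idx] += y;
}

/* Average for the range view; 0 if nothing has been reported yet. */
static double sketch_avg(const struct sketch *s)
{
	return s->num ? (double)s->products / (double)s->num : 0.0;
}

/* Print results in a layout loosely modelled on the "data" file. */
static void sketch_print(const struct sketch *s, const char *name)
{
	unsigned int i;

	printf("%s_increments %llu\n", name,
	       (unsigned long long)s->increments);
	printf("%s_products %llu\n", name, (unsigned long long)s->products);
	printf("%s_range %llu %llu %.3f %llu\n", name,
	       (unsigned long long)s->num, (unsigned long long)s->min,
	       sketch_avg(s), (unsigned long long)s->max);
	printf("%s_array <=0 %llu\n", name, (unsigned long long)s->hits[0]);
	for (i = 1; i < LOG2_BOUNDED; i++)
		printf("%s_array <=%llu %llu\n", name, 1ULL << (i - 1),
		       (unsigned long long)s->hits[i]);
	printf("%s_array >%llu %llu\n", name, 1ULL << (LOG2_BOUNDED - 2),
	       (unsigned long long)s->hits[LOG2_BOUNDED]);
}
```

To try it, zero-initialize a struct sketch, call sketch_add() once per
reported pair, for example sketch_add(&s, 4096, 1) for one 4096-byte request,
and finish with sketch_print(&s, "demo"). The same data feed then yields the
request count, the byte total and the size distribution, which is the point
made in "How data is processed": the feed stays the same and only the chosen
processing differs.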