The Performance Metrics Inference Engine (pmie) is a tool that provides automated monitoring of, and reasoning about, system performance within the PCP (Performance Co-Pilot) framework.
The major sections in this chapter are as follows:
“Introduction to pmie” provides an introduction to the concepts and design of pmie.
“Basic pmie Usage” describes the basic syntax and usage of pmie.
“Specification Language for pmie” discusses the complete pmie rule specification language.
“Some Real pmie Examples” provides an illustration by example, covering several common performance scenarios.
“Developing and Debugging pmie Rules” presents some tips and techniques for pmie rule development.
“Caveats and Notes on pmie” presents some important information on using pmie.
“Creating pmie Rules With pmrules” provides a brief description of how to use the pmrules GUI for creating pmie rules from parameterized templates.
Automated reasoning within Performance Co-Pilot is provided by the Performance Metrics Inference Engine, pmie, which is an applied artificial intelligence application.
The pmie tool accepts expressions describing adverse performance scenarios, and periodically evaluates these against streams of performance metric values from one or more sources. When an expression is found to be true, pmie is able to execute arbitrary actions to alert or notify the system administrator of the occurrence of an adverse performance scenario. These facilities are very general, and are designed to accommodate the automated execution of a mixture of generic and site-specific performance monitoring and control functions.
The stream of performance metrics to be evaluated may be from one or more hosts, or from one or more PCP archive logs. In the latter case, pmie may be used to retrospectively identify adverse performance conditions.
Using pmie, you can filter, interpret, and reason about the large volume of performance data made available by the Performance Metrics Collection Subsystem (PMCS) and delivered through the Performance Metrics Application Programming Interface (PMAPI).
Typical pmie uses would include the following:
Automated real-time monitoring of a host, a set of hosts, or client-server pairs of hosts to raise operational alarms when poor performance is detected in a production environment.
Nightly processing of archive logs to detect and report performance regressions, or quantify quality of service for service agreements or management reports, or produce advance warning of pending performance problems.
Strategic performance management, for example, detection of abnormal, but not chronic, system behavior to guide performance analysis, trend analysis, and capacity planning.
The pmie “expressions” are described in a language that supports a wide range of expressive power and operational flexibility, and includes the following operators and functions:
Generalized predicate-action pairs, where a predicate is a logical expression over the available performance metrics, and the action is arbitrary. Predefined actions include the following:
launch a visible alarm with xconfirm; see xconfirm(1)
post an entry to the system log /var/adm/SYSLOG; see syslog(3C)
post an entry to the PCP noticeboard file /var/adm/pcplog/NOTICES
execute a shell command or script, for example, to send e-mail, initiate a pager call, warn the “help” desk, and so on.
echo a message on standard output; most useful for scripts that generate reports from retrospective processing of PCP archive logs
Arithmetic and logical expressions in a C-like syntax.
Expression groups may have independent evaluation frequency, to support both short-term and long-term monitoring.
Canonical scale and rate conversion of performance metric values to provide sensible expression evaluation.
Aggregation functions of sum, avg, min, and max, that may be applied to collections of performance metrics values clustered over multiple hosts, or multiple instances, or multiple consecutive samples in time.
Universal and existential quantification, to handle expressions of the form “for every ...” and “at least one ...”.
Percentile aggregation to handle statistical outliers, such as “for at least 80% of the last 20 samples, ...”
Macro processing to expedite repeated use of common sub-expressions or specification components.
Transparent operation against either live-feeds of performance metric values from pmcd on one or more hosts, or against PCP archive logs of previously accumulated performance metric values.
The power of pmie may be harnessed to automate the most common of the deterministic system management functions that are responses to changes in system performance. For example, disable a batch stream if the DBMS transaction commit response time at the ninetieth percentile goes over two seconds, or stop accepting “news” and send e-mail to the sysadmin alias if free space in the news file system falls below five percent.
Moreover, the power of pmie can be directed towards the exceptional and sporadic performance problems. For example, if a “network packet storm” is looming, enable IP header tracing for ten seconds, and send e-mail to advise that data has been collected and is awaiting analysis. Or, if production batch throughput falls below 50 jobs per hour, activate a pager to the systems administrator on duty.
Obviously pmie customization is required to produce meaningful filtering and actions in each production environment. The pmrules tool provides a convenient customization method, allowing the user to generate parameterized pmie rules for some of the more common performance scenarios.
This section presents and explains some basic examples of pmie usage. The pmie tool accepts the common PCP command-line arguments, as described in Chapter 3, “Common Conventions and Arguments.” In addition, pmie accepts the following command-line arguments:
| -d | Enable interactive debug mode. |
| -v | Verbose mode: expression values are displayed. |
| -V | Verbose mode: annotated expression values are displayed. |
| -W | When-verbose mode: when a condition is true, the satisfying expression bindings are displayed. |
One of the most basic invocations of this tool is this form:
pmie filename |
In this form, the expressions to be evaluated are read from filename. In the absence of a given filename, expressions are read from standard input, usually your system keyboard.
Before you use pmie, familiarize yourself with some Performance Metrics Collection System (PMCS) basics. The concepts are covered in the section “Conceptual Foundations”, which you are strongly encouraged to read first; the discussion in this section serves only as a very brief review.
The PMCS makes available hundreds of performance metrics that you can use when formulating expressions for pmie to evaluate. If you want to find out which metrics are currently available on your system, use this command:
pminfo |
Use the pminfo command-line arguments to find out more about a particular metric. For example, to fetch new metric values from host moomba, use the -f flag:
pminfo -f -h moomba irix.disk.dev.total |
This produces the following response:
irix.disk.dev.total
    inst [131329 or "dks1d1"] value 970853
    inst [131330 or "dks1d2"] value 53581
    inst [131331 or "dks1d3"] value 5353
    inst [131332 or "dks1d4"] value 225
    inst [131333 or "dks1d5"] value 9674
    inst [131334 or "dks1d6"] value 14383
    inst [131335 or "dks1d7"] value 5578
|
This reveals that on the host moomba, the metric irix.disk.dev.total has seven instances, one for each disk on the system. The instance names are dks1d1, dks1d2 and so on up to dks1d7.
Use the following command to request help text (specified with the -T flag) to provide more information about performance metrics:
pminfo -T irix.network.interface.in.packets |
The metadata associated with a performance metric is used by pmie to determine how the value should be interpreted. You can examine the descriptor that encodes the metadata by using the -d flag for pminfo, as shown in this command:
pminfo -d -h somehost irix.mem.freemem irix.kernel.percpu.syscall |
In response, you see output similar to this:
irix.mem.freemem
Data Type: 32-bit unsigned int InDom: PM_INDOM_NULL 0xffffffff
Semantics: instant Units: Kbyte
irix.kernel.percpu.syscall
Data Type: 32-bit unsigned int InDom: 1.1 0x400001
Semantics: counter Units: count
|
The following example directs the inference engine to evaluate and print values (specified with the -v flag) for a single performance metric (the simplest possible expression), in this case irix.disk.dev.total, collected from the local pmcd:
pmie -v
iops = irix.disk.dev.total;
Ctrl+D
iops: ? ?
iops: 14.4 0
iops: 25.9 0.112
iops: 12.2 0
iops: 12.3 64.1
iops: 8.594 52.17
iops: 2.001 71.64
|
On this system there are two disk spindles, and hence two values of the expression iops per sample. Notice that the values for the first sample are unknown (represented by the question marks [?] in the first line of output), because rates can be computed only when at least two samples are available. The subsequent samples are produced every ten seconds by default. The second sample reports that during the preceding ten seconds there was an average of 14.4 transfers per second on one disk and no transfers on the other disk.
Rates are computed using time stamps delivered by the PMCS. Due to unavoidable inaccuracy in the actual sampling time (the sample interval is not exactly 10 seconds), you may see more decimal places in values than you expect. Notice, however, that these errors do not accumulate but cancel each other out over subsequent samples.
In the above example, the expression to be evaluated was entered on standard input (the keyboard), followed by the end-of-file character Ctrl+D. Usually it is more convenient to enter expressions into a file (for example, myrules) and ask pmie to read the file. Use this command syntax:
pmie -v myrules |
Please refer to the pmie(1) reference page for a complete description of pmie command line options.
This section illustrates more complex pmie expressions with a view to establishing the flavor of the specification language. The next section provides a complete description of the pmie specification language.
The arithmetic expression
(irix.disk.all.write / irix.disk.all.total) * 100; |
computes the percentage of write operations over the total number of disk transfers. The irix.disk.all metrics are singular, so this expression produces exactly one value per sample, independent of the number of disk devices.
Note: If there is no disk activity, irix.disk.all.total will be zero and pmie evaluates this expression to be “not a number.” When -v is used, any such values are displayed as question marks.
The following logical expression has the value true or false for each disk:
irix.disk.dev.total > 10 && irix.disk.dev.write > irix.disk.dev.read; |
The value is true if the number of writes exceeds the number of reads, and if there is some reasonably significant disk activity (more than 10 transfers per second).
The previous examples did not specify any action to be performed in the event that an expression evaluates to true. The default action is to do nothing, other than report the value of the expression if the -v option was used. The following example demonstrates a simple action:
some_inst irix.disk.dev.total > 60 ->
print "[%i] high disk i/o ";
|
This prints a message to the standard output whenever the total number of transfers for some disk (some_inst) exceeds 60 transfers per second. The %i (instance) in the message is replaced with the name(s) of the disk(s) that caused the logical expression to be true.
Using pmie to evaluate the above expressions every 3 seconds, you see output similar to the following:
pmie -v -t 3sec
pct_wrt = (irix.disk.all.write / irix.disk.all.total) * 100;
busy_wrt = irix.disk.dev.total > 10 &&
irix.disk.dev.write > irix.disk.dev.read;
busy = some_inst irix.disk.dev.total > 60 ->
print "[%i] high disk i/o ";
Ctrl+D
pct_wrt: ?
busy_wrt: ? ?
busy: ?
pct_wrt: 18.43
busy_wrt: false false
busy: false
Mon Aug 5 14:56:08 1996: [dks0d2] high disk i/o
pct_wrt: 10.83
busy_wrt: false false
busy: true
pct_wrt: 19.85
busy_wrt: true false
busy: false
pct_wrt: ?
busy_wrt: false false
busy: false
Mon Aug 5 14:56:17 1996: [dks0d1] high disk i/o [dks0d2] high disk i/o
pct_wrt: 14.8
busy_wrt: false false
busy: true
|
The first sample contains unknowns, since all expressions depend on computing rates. Also notice that the expression pct_wrt may have an undefined value whenever all disks are idle, as the denominator of the expression is zero. If one or more disks is busy, the expression busy is true, and the message from the print in the action part of the rule appears (before the -v values).
This section describes the complete syntax of the pmie specification language, as well as macro facilities and the issue of sampling and evaluation frequency. The reader with a preference for “learning by example” may choose to skip this section and go straight to the examples in “Some Real pmie Examples”.
Complex expressions are built up recursively from simple elements:
Performance metric values are obtained from pmcd for real-time or live sources, otherwise from PCP archive logs.
Metric values may be combined using arithmetic operators to produce arithmetic expressions.
Arithmetic expressions may be compared using relational operators to produce logical expressions.
Logical expressions may be combined using Boolean operators, including very powerful quantifiers.
Aggregation operators may be used to compute summary expressions, for either arithmetic or logical operands.
The final logical expression may be used to initiate a sequence of actions.
The pmie rule specification language supports a number of basic syntactic elements.
All pmie expressions are composed of the following lexical elements:
| identifier | Begins with an alphabetic (either upper or lowercase), followed by zero or more letters chosen from the alphabetics, the numeric digits, and the special characters period (.) and underscore (_); for example, x, irix.disk.dev.total and my_stuff. As a special case, an arbitrary sequence of letters enclosed by apostrophes (') is also interpreted as an identifier; for example, 'vms$slow_response'. | |
| keyword | The aggregate operators, units and the predefined actions are represented by keywords; for example, some_inst, print, and hour. | |
| numeric constant |
| |
| string constants |
|
Within quotes of any sort, the backslash (\) may be used as an escape character; for example, "A \"gentle\" reminder".
Comments may be embedded anywhere in the source, in either of these forms:
| /* text */ | C-style comment, optionally spanning multiple lines, with no nesting of comments. | |
| // text | C++-style comment from here to the end of the line. |
When fully specified, expressions in pmie tend to be verbose and repetitious. The use of macros can reduce repetition and improve readability and modularity. Any statement of the form
identifier = "string"; |
associates the macro name identifier with the given string constant.
Any subsequent occurrence of
$identifier |
is replaced by the string most recently associated with a macro definition for identifier. For example, given the macro definition
disk = "irix.disk.all"; |
you can then use the syntax
pct_wrt = ($disk.write / $disk.total) * 100; |
Note: Macro expansion is performed before syntactic parsing, so macros may only be assigned constant string values.
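The purely textual nature of macro expansion can be illustrated with a short Python sketch. This is not pmie's implementation; the function name `expand_macros` is hypothetical, and the sketch assumes that statements end at a semicolon, that strings contain no semicolons, and that a `$name` reference is matched against the macro names defined so far (longest name first), so that `$disk.write` expands the macro `disk` and keeps the `.write` suffix.

```python
import re


def expand_macros(source: str) -> str:
    """Textually expand $name references, one statement at a time."""
    macros = {}   # macro name -> string most recently assigned to it
    output = []
    for statement in source.split(";"):
        statement = statement.strip()
        if not statement:
            continue
        # A macro definition has the form: identifier = "string"
        definition = re.fullmatch(r'(\w+)\s*=\s*"([^"]*)"', statement)
        if definition:
            macros[definition.group(1)] = definition.group(2)
            continue
        # Substitute only names defined so far, longest name first
        # (an assumption of this sketch, not documented pmie behavior).
        for name in sorted(macros, key=len, reverse=True):
            statement = statement.replace("$" + name, macros[name])
        output.append(statement + ";")
    return "\n".join(output)


rules = 'disk = "irix.disk.all"; pct_wrt = ($disk.write / $disk.total) * 100;'
print(expand_macros(rules))
```

Because expansion happens on the text before any parsing, a later definition of the same macro name simply changes what subsequent `$name` references expand to.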
The inference engine converts all numeric values to canonical units (seconds for time, bytes for space, and events for count). To avoid surprises, you are encouraged to specify the units for numeric constants. If units are specified, they are checked for dimension compatibility against the metadata for the associated performance metrics.
The syntax for a units specification is a sequence of one or more of the following keywords separated by either white space or a slash (/, to denote “per”): byte, Kbyte, Mbyte, Gbyte, Tbyte, nsec, nanosecond, usec, microsecond, msec, millisecond, sec, second, min, minute, hour, count, Kcount, Mcount, Gcount, Tcount. Plural forms are also accepted.
The following are examples of units usage:
irix.disk.dev.blktotal > 1 Mbyte / second;
irix.mem.freemem < 500 Kbyte;
|
Note: If you do not specify the units for numeric constants, it is assumed that the constant is in the canonical units of seconds for time, bytes for space, and events for count, and the dimensionality of the constant is assumed to be correct. Thus in the expression irix.mem.freemem < 500, the 500 is interpreted as 500 bytes.
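The effect of omitting units can be seen in a small Python sketch of canonical conversion. It is an illustration only: the names `SPACE_SCALE`, `to_bytes`, and `freemem_below` are hypothetical, and the scale factors assume binary multiples (1 Kbyte = 1024 bytes), which is an assumption of this sketch.

```python
# Scale factors to the canonical space unit (bytes).
# ASSUMPTION: binary multiples; adjust if your site defines them differently.
SPACE_SCALE = {"byte": 1, "Kbyte": 1024, "Mbyte": 1024 ** 2}


def to_bytes(value, units="byte"):
    """Convert a space value in the given units to canonical bytes."""
    return value * SPACE_SCALE[units]


def freemem_below(metric_kbytes, constant, units=None):
    """Model 'irix.mem.freemem < constant [units]' for a metric reported in Kbyte."""
    metric = to_bytes(metric_kbytes, "Kbyte")   # the metric's metadata says Kbyte
    limit = to_bytes(constant, units or "byte")  # no units means canonical bytes
    return metric < limit


# With 400 Kbyte free, "< 500 Kbyte" holds, but a bare "< 500" compares against 500 bytes.
print(freemem_below(400, 500, "Kbyte"), freemem_below(400, 500))
```

The second comparison is the surprise the note warns about: the metric is scaled to bytes, while the bare constant is taken to be in bytes already.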
The identifier name delta is reserved to denote the interval of time between consecutive evaluations of one or more expressions. Set delta as follows:
delta = number [units]; |
If present, units must be one of the time units described in the preceding section. If absent, units are assumed to be seconds. For example,
delta = 5 min; |
has the effect that any subsequent expressions (up to the next expression that assigns a value to delta) are scheduled for evaluation at a fixed frequency, once every five minutes.
The default value for delta may be specified using the -t command-line option; otherwise, delta is initially set to 10 seconds.
A Performance Metrics Name Space (PMNS) provides a means of naming performance metrics, for example, irix.disk.dev.read. The Performance Metrics Collection System (PMCS) allows an application to retrieve one or more values for a performance metric from a designated source (a collector host running pmcd, or a PCP archive log). To specify a single value for some performance metric requires the metric name to be associated with all three of the following:
a particular host (or source of metric values)
a particular instance (for metrics with multiple values)
a sample time
The permissible values for hosts are the range of valid hostnames as provided by the Internet naming conventions.
The names for instances are provided by the Performance Metrics Domain Agents (PMDA) for the instance domain associated with the chosen performance metric.
The sample time specification is defined as the set of natural numbers 0, 1, 2, and so on. A number refers to one of a sequence of sampling events, stretching back from the current sample 0 to its predecessor 1, whose predecessor was 2, and so on. This scheme is illustrated by the time line shown in Figure 6-1.
Each sample point is assumed to be separated from its predecessor by a constant amount of real time, the delta. The most recent sample point is always zero. The value of delta may vary from one expression to the next, but is fixed for each expression; for more information on the sampling interval, see “Setting Evaluation Frequency”.
For pmie, a metric expression is the name of a metric, optionally qualified by a host, instance and sample time specification. Special characters introduce the qualifiers, namely colon (:) for hosts, hash or pound sign (#) for instances, and at (@) for sample times. For example, the expression
irix.disk.dev.read :moomba #dks0d1 @1 |
refers to the previous value (@1) of the counter for the disk read operations associated with the disk instance #dks0d1 on the host moomba. In fact, this expression defines a point in the three-dimensional parameter space of {host} x {instance} x {sample time} as shown in Figure 6-2.
A metric expression may also identify sets of values corresponding to one-, two-, or three-dimensional slices of this space, according to the following rules:
A metric expression consists of a PCP metric name, followed by optional host specifications, followed by optional instance specifications, and finally, optional sample time specifications.
A host specification consists of one or more host names, each prefixed by a colon (:). For example: :indy :far.away.domain.com :localhost
A missing host specification implies the default pmie source of metrics, as defined by a -h option on the command line, or the first named archive in a -a option on the command line, or pmcd on the local host.
An instance specification consists of one or more instance names, each prefixed by a hash or pound sign (#). For example: #ec0 #ec2
Recall that you can discover the instance names for a particular metric, using the pminfo command. See “pmie and the Performance Metrics Collection System”.
Within the pmie grammar, an instance name is an identifier, so if the instance name contains characters other than the alphanumerics, it is probably safest to enclose the instance name in single quotes; for example, #'/dev/root' #'/dev/usr'
A missing instance specification implies all instances for the associated performance metric from each associated pmie source of metrics.
A sample time specification consists of either a single time or a range of times. A single time is represented as an at (@) followed by a natural number. A range of times is encoded as an at (@) followed by a natural number, followed by two periods (..) followed by a second natural number. The ordering of the end points in a range is immaterial. For example, @0..9 specifies the last 10 sample times.
A missing sample time specification implies the most recent sample time.
Putting all of this together, the metric expression
irix.disk.dev.read :foo :bar @0..4 |
refers to a three-dimensional set of values, with two hosts in one dimension, five sample times in another, and the number of instances in the third dimension being determined by the number of configured disk spindles on the two hosts.
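The way a metric expression spans the {host} x {instance} x {sample time} space can be sketched in Python. This is a model of the rules above, not pmie code: the function names are hypothetical, and the disk instance names assigned to hosts foo and bar are invented for the illustration.

```python
import re


def parse_sample_times(spec):
    """Parse '@N' or '@N..M' into a list of sample indexes (0 = most recent)."""
    match = re.fullmatch(r"@(\d+)(?:\.\.(\d+))?", spec)
    if match is None:
        raise ValueError(f"bad sample time specification: {spec!r}")
    first = int(match.group(1))
    last = int(match.group(2)) if match.group(2) is not None else first
    low, high = sorted((first, last))   # the order of the end points is immaterial
    return list(range(low, high + 1))


def value_points(hosts, instances_by_host, sample_spec="@0"):
    """Enumerate every (host, instance, sample) point a metric expression names."""
    samples = parse_sample_times(sample_spec)
    return [(host, inst, t)
            for host in hosts
            for inst in instances_by_host[host]
            for t in samples]


# Hypothetical disk configuration for the two hosts in the example.
disks = {"foo": ["dks0d1", "dks0d2"], "bar": ["dks0d1"]}
points = value_points(["foo", "bar"], disks, "@0..4")
print(len(points))   # three disks in total, five sample times each
```

A missing sample time specification corresponds to the default `"@0"` argument, that is, the most recent sample only.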
Many of the metrics delivered by the PMCS are cumulative counters. Consider, for example, the following metric:
irix.disk.all.total |
A single value for this metric tells you only that a certain number of disk I/O operations have occurred since boot time, and that information may be invalid if the counter has exceeded its 32-bit range and “wrapped.” You need at least two values, sampled at known times, to compute the recent rate at which the I/O operations are being executed. The required syntax would be this:
(irix.disk.all.total @0 - irix.disk.all.total @1) / delta |
However, this syntax is verbose, and the accuracy of delta as a measure of the actual inter-sample delay is also an issue. pmie requests samples at intervals of approximately delta; the results exported from the PMCS are time stamped with the high-resolution system clock at the time the samples were exported. For these reasons, a built-in and implicit rate conversion using accurate time stamps is provided by pmie for performance metrics that have “counter” semantics. That is, the expression
irix.disk.all.total |
is unconditionally converted to a rate by pmie.
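The idea behind the implicit rate conversion can be sketched in Python as follows. This is a minimal model under stated assumptions, not pmie's code: the function name `counter_rate` is hypothetical, each sample is taken to be a (time stamp in seconds, counter value) pair, and counter wrap is not handled.

```python
def counter_rate(previous, current):
    """Rate of change between two (timestamp_seconds, counter_value) samples.

    Returns None when the rate is unknown, for example for the very first
    sample, when no predecessor exists yet.
    """
    if previous is None or current is None:
        return None
    (t_prev, v_prev), (t_curr, v_curr) = previous, current
    elapsed = t_curr - t_prev        # use the real time stamps, not the nominal delta
    if elapsed <= 0:
        return None
    return (v_curr - v_prev) / elapsed


# First sample: unknown. Later samples: rate per second over the actual interval.
print(counter_rate(None, (0.0, 1000)))
print(counter_rate((0.0, 1000), (10.0, 1144)))
```

Dividing by the measured interval rather than by delta is why small timing inaccuracies show up as extra decimal places but do not accumulate from one sample to the next.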
Within pmie simple arithmetic expressions are constructed from metric expressions (see “pmie Metric Expressions”) and numeric constants, using all of the arithmetic operators and precedence rules of the C programming language.
All pmie arithmetic is performed in double precision.
The section “pmie Intrinsic Operators” describes additional operators that may be used for aggregate operations to reduce the dimensionality of an arithmetic expression.
A number of logical expression types are supported, described in the following sections:
logical constants
relational expressions
boolean expressions
quantification operators
Like C, pmie interprets an arithmetic value of zero to be false, and all other arithmetic values are considered true.
Relational expressions are the simplest form of logical expression, in which values may be derived from arithmetic expressions using pmie relational operators. For example,
irix.disk.all.read > 50 count/sec |
is a relational expression that is true or false, depending on the aggregate total of disk read operations per second being greater than 50.
All of the relational logical operators and precedence rules of the C programming language are supported in pmie.
As described in “pmie Metric Expressions”, arithmetic expressions in pmie may assume set values. Hence the relational operators are also required to take as arguments constant, singleton, and set-valued expressions; the result has the same dimensionality as the operands. So, given the rule
hosts = ":gonzo";
intfs = "#ec0 #ec2";
all_intf = irix.network.interface.in.packets
$hosts $intfs @0..2 > 300 count/sec;
|
the execution of pmie may proceed as follows:
pmie -V uag.11
all_intf:
gonzo: [ec0] ? ? ?
gonzo: [ec2] ? ? ?
all_intf:
gonzo: [ec0] false ? ?
gonzo: [ec2] false ? ?
all_intf:
gonzo: [ec0] true false ?
gonzo: [ec2] false false ?
all_intf:
gonzo: [ec0] true true false
gonzo: [ec2] false false false
|
At each sample, the relational operator “>” produces six truth values for the cross-product of the instance and sample time dimensions.
The section “Quantification Operators” describes additional logical operators that may be used to reduce the dimensionality of a relational expression.
The regular Boolean operators from the C programming language are supported, namely conjunction (&&), disjunction (||) and negation (!).
As with the relational operators, the Boolean operators accommodate set-valued operands, and set-valued results.
Boolean and relational operators may accept set-valued operands and produce set-valued results. In many cases, rules that are appropriate for performance management require a set of truth values to be reduced along one or more of the dimensions of hosts, instances, and sample times described in the section “pmie Metric Expressions”. The pmie quantification operators perform this function.
Each quantification operator takes a one-, two-, or three-dimensioned set of truth values as an operand, and reduces it to a set of smaller dimension, by quantification along a single dimension. For example, if the expression in the previous example is simplified and prefixed by some_sample, to produce
intfs = "#ec0 #ec2";
all_intf = some_sample irix.network.interface.in.packets
$intfs @0..2 > 300 count/sec;
|
then the expression result is reduced from six values to two (one per interface instance), such that the result for a particular instance will be false unless the relational expression for the same interface instance is true for at least one of the preceding three sample times.
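Quantification along the sample time dimension can be modeled with a few lines of Python. The sketch below is illustrative rather than a description of pmie internals; the function names are hypothetical, unknown values are ignored for simplicity, and the input is the set of six truth values from the last sample of the earlier -V output.

```python
def some_sample(truth):
    """True per instance if the expression held for at least one sample time."""
    return {inst: any(samples) for inst, samples in truth.items()}


def all_sample(truth):
    """True per instance only if the expression held for every sample time."""
    return {inst: all(samples) for inst, samples in truth.items()}


def percent_sample(truth, n):
    """True per instance if the expression held for at least n% of the sample times."""
    return {inst: 100 * sum(samples) >= n * len(samples)
            for inst, samples in truth.items()}


# Truth values for samples @0, @1, @2 on each interface instance.
truth = {"ec0": [True, True, False], "ec2": [False, False, False]}
print(some_sample(truth))          # two values remain, one per instance
print(all_sample(truth))
print(percent_sample(truth, 50))
```

Each function removes the sample time dimension and leaves one truth value per instance, which is the reduction from six values to two described above.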
There are existential, universal, and percentile quantification operators in each of the host, instance, and sample time dimensions to produce the nine operators as follows:
| some_host | True if the expression is true for at least one host for the same instance and sample time. |
| all_host | True if the expression is true for every host for the same instance and sample time. |
| N%_host | True if the expression is true for at least N% of the hosts for the same instance and sample time. |
| some_inst | True if the expression is true for at least one instance for the same host and sample time. |
| all_inst | True if the expression is true for every instance for the same host and sample time. |
| N%_inst | True if the expression is true for at least N% of the instances for the same host and sample time. |
| some_sample | True if the expression is true for at least one sample time for the same host and instance. |
| all_sample | True if the expression is true for every sample time for the same host and instance. |
| N%_sample | True if the expression is true for at least N% of the sample times for the same host and instance. |
These operators may be nested. For example, the expression
Servers = ":moomba :babylon";
all_host (
20%_inst irix.disk.dev.read $Servers > 40 ||
20%_inst irix.disk.dev.write $Servers > 40
);
|
answers the question: “Are all hosts experiencing at least 20% of their disks busy either reading or writing?”
The following expression uses different syntax to encode the same semantics:
all_host (
20%_inst (
irix.disk.dev.read $Servers > 40 ||
irix.disk.dev.write $Servers > 40
)
);
|
Note: To avoid confusion over precedence and scope for the quantification operators, the use of explicit parentheses is encouraged.
Rule expressions for pmie have the following syntax:
lexpr -> actions ; |
The semantics are as follows:
If the logical expression lexpr evaluates true, then perform the actions that follow. Otherwise, do not perform the actions.
It is required that lexpr has a singular truth value; that is, aggregation and quantification operators have been applied to reduce multiple truth values to a single value.
When executed, an action completes with a success/failure status.
One or more actions may appear; consecutive actions are separated by operators that control the execution of subsequent actions, as follows:
action-1 & ... Always execute subsequent actions (serial execution).
action-1 | ... If action-1 fails, execute subsequent actions, otherwise skip the subsequent actions (alternation).
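The two sequencing operators can be modeled for a pair of actions with the Python sketch below. It is not pmie code: the helper names are hypothetical, an action is represented as a callable that returns True on success, and only the two-action case described above is covered (longer mixed chains are not modeled).

```python
def make_action(name, succeeds, log):
    """Build a stand-in action that records its execution and reports a status."""
    def action():
        log.append(name)
        return succeeds
    return action


def run_serial(first, second):
    """Model 'action-1 & action-2': the second action always runs."""
    first()
    second()


def run_alternation(first, second):
    """Model 'action-1 | action-2': the second action runs only if the first fails."""
    if not first():
        second()


log = []
run_alternation(make_action("alarm", False, log), make_action("print", True, log))
print(log)   # the fallback ran because the first action failed
```

Alternation is useful for fallbacks, for example trying a visible alarm first and printing a message only when the alarm cannot be posted.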
An action is composed of a keyword to identify the action method, an optional time specification, and one or more arguments.
A time specification uses the same syntax as a valid time interval that may be assigned to delta, as described in “Setting Evaluation Frequency”. If the action is executed and the time specification is present, pmie will suppress any subsequent execution of this action until the wall clock time has advanced by time.
The arguments are passed directly to the action method.
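The suppression behavior of an action's time specification can be sketched in Python. This is a model under assumptions, not pmie's implementation: the class name `HoldoffAction` is hypothetical, time is passed in explicitly as seconds so that the example is deterministic, and the action is allowed to run again once exactly the specified time has elapsed.

```python
class HoldoffAction:
    """Wrap an action so that repeat executions are suppressed for a while."""

    def __init__(self, action, holdoff_seconds=0.0):
        self.action = action
        self.holdoff = holdoff_seconds
        self._last_run = None          # wall clock time of the last execution

    def fire(self, now):
        """Run the action unless it ran less than `holdoff` seconds ago."""
        if self._last_run is not None and now - self._last_run < self.holdoff:
            return False               # suppressed
        self._last_run = now
        self.action()
        return True


executions = []
notify = HoldoffAction(lambda: executions.append("syslog"), holdoff_seconds=600)

# The rule is true at every evaluation, but the action runs at most every 10 minutes.
for now in (0, 300, 600, 900, 1200):
    notify.fire(now)
print(len(executions))
```

This is what keeps a persistently true rule from flooding the system log or a mailbox with identical messages.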
The following action methods are provided:
| shell | The single argument is passed to the shell for execution. This action is implemented using system in the background. The action does not wait for the system call to return, and succeeds unless the fork fails. | |
| alarm | A notifier containing a time stamp, a single argument as a message, and a Cancel button is posted on the current display screen (as identified by the DISPLAY environment variable). Each alarm action first checks if its notifier is already active. If there is an identical active notifier, a duplicate notifier is not posted. The action succeeds unless the fork fails. | |
| syslog | A message is written into the system log, using logger. An optional first argument is interpreted as a priority (see the -p option for logger); the message “tag” is pcp-pmie. The remaining argument is the message to be written to the system log. The action succeeds unless the fork fails. | |
| print | A message containing a time stamp in ctime format and the argument is displayed on standard output (stdout). This action always succeeds. |
Within the argument passed to an action method, the following expansions are supported to allow some of the context from the logical expression on the left to appear to be embedded in the argument:
| %h | The value of a host that makes the expression true. | |
| %i | The value of an instance that makes the expression true. | |
| %v | The value of a performance metric from the logical expression. |
Clearly, some ambiguity may occur about which host, instance, or performance metric is bound to a %-token. In most cases, the leftmost binding in the top-level subexpression is used, and this accommodates the vast majority of common rules. You may need to use pmie in the interactive debugging mode (specify the -d command-line option) in conjunction with the -W command-line option to discover which subexpressions contribute to the %-token bindings.
The following example illustrates some of the options when constructing rule expressions:
some_inst ( irix.disk.dev.total > 60 )
-> syslog 10 mins "[%i] busy, %v IOPS " &
shell 1 hour "echo \
'Disk %i is REALLY busy. Running at %v I/Os per second' \
| Mail -s 'pmie alarm' sysadm";
|
In this case, %v and %i are both associated with the instances for the metric irix.disk.dev.total that make the expression true. If more than one instance makes the expression true (that is, more than one disk is busy), then the argument is formed by concatenating the result from each %-token binding. For example, the text added to /var/adm/SYSLOG might be
Aug 6 08:12:44 5B:gonzo pcp-pmie[3371]:
[dks0d1] busy, 3.7 IOPS [dks0d2] busy, 0.3 IOPS
|
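The concatenation of %-token bindings can be reproduced with a short Python sketch. It is illustrative only: the function name `expand_action_argument` and the dictionary layout of a binding are assumptions of this sketch, and tokens without a corresponding binding are left untouched.

```python
def expand_action_argument(template, bindings):
    """Expand %h, %i and %v once per binding and concatenate the results."""
    tokens = (("%h", "host"), ("%i", "inst"), ("%v", "value"))
    parts = []
    for binding in bindings:
        text = template
        for token, key in tokens:
            if key in binding:
                text = text.replace(token, str(binding[key]))
        parts.append(text)
    return "".join(parts)


# Two disks make the expression true, so the argument is expanded twice.
busy = [{"inst": "dks0d1", "value": 3.7}, {"inst": "dks0d2", "value": 0.3}]
print(expand_action_argument("[%i] busy, %v IOPS ", busy))
```

The trailing space inside the quoted argument matters, because it is the only separator between the expansions for consecutive bindings.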
Note: When pmie is processing performance metrics from a PCP archive log, the actions will be processed in the expected manner; however, the action methods are modified to report a textual facsimile of the action on the standard output. For example, consider the following rule:
delta = 2 sec; // more often for demonstration purposes
percpu = "irix.kernel.percpu";

// Unusual usr-sys split when some CPU is more than 20% in usr mode
// and sys mode is at least 1.5 times usr mode
//
cpu_usr_sys = some_inst (
    $percpu.cpu.sys > $percpu.cpu.user * 1.5 &&
    $percpu.cpu.user > 0.2
) -> alarm "Unusual sys time: " "%i ";
When evaluated against an archive the following output is generated (the alarm action produces a message on standard output):
pmafm /tmp/f4 pmie cpu.head cpu.00
alarm Wed Aug 7 14:54:48 1996: Unusual sys time: cpu0
alarm Wed Aug 7 14:54:50 1996: Unusual sys time: cpu0
alarm Wed Aug 7 14:54:52 1996: Unusual sys time: cpu0
alarm Wed Aug 7 14:55:02 1996: Unusual sys time: cpu0
alarm Wed Aug 7 14:55:06 1996: Unusual sys time: cpu0
The following sections describe some further useful intrinsic operators for pmie:
arithmetic aggregation
the rate operator
transitional operators
For set-valued arithmetic expressions, the following operators reduce the dimensionality of the result by arithmetic aggregation along one of the host, instance, or sample time dimensions. For example, to aggregate in the host dimension, the following operators are provided:
| avg_host | Compute the average value across all hosts for the same instance and sample time. |
| sum_host | Compute the total value across all hosts for the same instance and sample time. |
| count_host | Compute the number of values across all hosts for the same instance and sample time. |
| min_host | Determine the minimum value across all hosts for the same instance and sample time. |
| max_host | Determine the maximum value across all hosts for the same instance and sample time. |
And there are ten further operators corresponding to the forms *_inst and *_sample.
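The effect of aggregating along one dimension can be modeled in Python. This is a minimal sketch under stated assumptions: the sample values, the `(host, instance)` keying, and the `aggregate` helper are hypothetical and only mimic the dimensionality reduction, with a single sample time assumed.

```python
from collections import defaultdict

# Hypothetical CPU utilization: one value per (host, instance) pair
# at a single sample time.
cpu_busy = {
    ("moomba", "cpu0"): 0.9,
    ("moomba", "cpu1"): 0.8,
    ("bitbucket", "cpu0"): 0.2,
    ("bitbucket", "cpu1"): 0.75,
}

def aggregate(values, dimension, reduce_fn):
    """Collapse one dimension ("host" or "inst") of a {(host, inst): value} map."""
    if dimension not in ("host", "inst"):
        raise ValueError("dimension must be 'host' or 'inst'")
    groups = defaultdict(list)
    for (host, inst), value in values.items():
        # Group by the dimension that is kept; the other one is reduced away.
        kept = inst if dimension == "host" else host
        groups[kept].append(value)
    return {kept: reduce_fn(vals) for kept, vals in groups.items()}

sum_inst = aggregate(cpu_busy, "inst", sum)   # one total per host
max_host = aggregate(cpu_busy, "host", max)   # one maximum per instance

# Counting only the values above a threshold gives "busy CPUs per host".
busy_only = {key: v for key, v in cpu_busy.items() if v > 0.7}
busy_per_host = aggregate(busy_only, "inst", len)
print(busy_per_host)
```

Aggregating in the instance dimension leaves one value per host, while aggregating in the host dimension leaves one value per instance.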
The following example illustrates the use of an aggregate operator in combination with an existential operator to answer the question “Does some host currently have two or more busy processors?”
// note '' to escape - in host name
poke = ":moomba :'mac-larry' :bitbucket";
some_host (
    count_inst ( irix.kernel.percpu.cpu.user $poke +
                 irix.kernel.percpu.cpu.sys $poke > 0.7 ) >= 2
)
    -> alarm "2 or more busy CPUs";
The rate operator computes the rate of change of an arithmetic expression. For example:
rate irix.mem.freemem
returns the rate of change for the irix.mem.freemem performance metric; that is, the rate at which free physical memory is being allocated or released.
The rate intrinsic operator is most useful for metrics with “instantaneous” value semantics. For metrics with “counter” semantics, pmie already performs an implicit rate calculation (see “pmie Rate Conversion”), so the rate operator would produce the second derivative with respect to time, which is less likely to be useful.
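As a rough illustration of what a rate of change means for an instantaneous metric, the following Python sketch divides the change in value by the elapsed time between two samples. The sample values and the `rate` helper are hypothetical; pmie's own calculation is not shown in this chapter.

```python
def rate(previous, current):
    """Rate of change between two (timestamp_seconds, value) samples."""
    (t0, v0), (t1, v1) = previous, current
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    return (v1 - v0) / (t1 - t0)

# Hypothetical free-memory samples (Kbytes) taken 10 seconds apart.
freemem_rate = rate((0.0, 52000.0), (10.0, 50000.0))
print(freemem_rate)  # negative: free memory is being allocated
```

A negative result indicates that free memory is shrinking; a positive result indicates that memory is being released.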
In some cases, an action needs to be triggered when an expression changes from true to false or vice versa. The following operators take a logical expression as an operand, and return a logical expression:
| rising | Has the value true when the operand transitions from false to true in consecutive samples. |
| falling | Has the value true when the operand transitions from true to false in consecutive samples. |
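The transitional operators can be modeled as edge detectors over a sequence of truth values. The Python sketch below assumes that both operators yield true only at the sample where the transition occurs; the function names mirror the operators, but the code is an illustration rather than pmie's implementation.

```python
def rising(samples):
    """True at each sample where the condition went from false to true."""
    return [(not prev) and cur for prev, cur in zip(samples, samples[1:])]

def falling(samples):
    """True at each sample where the condition went from true to false."""
    return [prev and (not cur) for prev, cur in zip(samples, samples[1:])]

# Hypothetical truth values of a logical expression over five samples.
condition = [False, True, True, False, True]
print(rising(condition))   # one result per pair of consecutive samples
print(falling(condition))
```

Each result list is one element shorter than the input because the first sample has no predecessor to compare against.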
The examples presented in this section are task-oriented and use the full power of the pmie specification language as described in “Specification Language for pmie”.
Source code for the pmie examples in this chapter, and many more, is provided in the PCP subsystem pcp.sw.demo, and when installed may be found in /var/pcp/demos/pmie. The following examples are given here:
monitoring CPU utilization
monitoring disk activity
// Some Common Performance Monitoring Scenarios
//
// The CPU Group
//
delta = 2 sec; // more often for demonstration purposes

// common prefixes
//
percpu = "irix.kernel.percpu";
all = "irix.kernel.all";

// Unusual usr-sys split when some CPU is more than 20% in usr mode
// and sys mode is at least 1.5 times usr mode
//
cpu_usr_sys =
    some_inst (
        $percpu.cpu.sys > $percpu.cpu.user * 1.5 &&
        $percpu.cpu.user > 0.2
    )
    -> alarm "Unusual sys time: " "%i ";

// Over all CPUs, syscall_rate > 1000 * no_of_cpus
//
cpu_syscall =
    $all.syscall > 1000 count/sec * hinv.ncpu
    -> print "high aggregate syscalls: %v";

// Sustained high syscall rate on a single CPU
//
delta = 30 sec;
percpu_syscall =
    some_inst (
        $percpu.syscall > 2000 count/sec
    )
    -> syslog "Sustained syscalls per second? " "[%i] %v ";

// the 1 minute load average exceeds 5 * number of CPUs on any host
hosts = ":gonzo :moomba"; // change as required
delta = 1 minute; // no need to evaluate more often than this
high_load =
    some_host (
        $all.load $hosts #'1 minute' > 5 * hinv.ncpu
    )
    -> alarm "High Load Average? " "%h: %v ";
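To see how the cpu_usr_sys rule selects the instances reported through %i, the following Python sketch evaluates the same predicate over hypothetical per-CPU samples. The `some_inst` helper and the sample data are assumptions for illustration and model only the existential quantifier, not pmie itself.

```python
def some_inst(values_by_instance, predicate):
    """Return (truth, matching instance names) for an existential test."""
    matches = [inst for inst, v in values_by_instance.items() if predicate(v)]
    return bool(matches), matches

# Hypothetical per-CPU utilization as fractions of one CPU: (user, sys).
percpu = {
    "cpu0": (0.25, 0.50),
    "cpu1": (0.60, 0.10),
    "cpu2": (0.05, 0.30),
}

def unusual_split(sample):
    """More than 20% in usr mode and sys mode at least 1.5 times usr mode."""
    user, sys_time = sample
    return sys_time > user * 1.5 and user > 0.2

fired, instances = some_inst(percpu, unusual_split)
if fired:
    # Mirrors the alarm argument "Unusual sys time: " "%i ".
    print("Unusual sys time: " + "".join(name + " " for name in instances))
```

Only cpu0 satisfies both conditions here; cpu2 has a high sys-to-usr ratio but is below the 20% usr threshold.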
// Some Common Performance Monitoring Scenarios
//
// The Disk Group
//
delta = 15 sec; // often enough for disks?

// common prefixes
//
disk = "irix.disk";

// Any disk performing more than 40 I/Os per second, sustained over
// at least 30 seconds is probably busy
//
delta = 30 seconds;
disk_busy =
    some_inst (
        $disk.dev.total > 40 count/sec
    )
    -> shell "Mail -s 'Heavy sustained disk traffic' sysadm";

// Try and catch bursts of activity ... more than 60 I/Os per second
// for at least 25% of 8 consecutive 3 second samples
//
delta = 3 sec;
disk_burst =
    some_inst (
        25%_sample (
            $disk.dev.total @0..7 > 60 count/sec
        )
    )
    -> alarm "Disk Burst? " "%i ";

// any SCSI disk controller performing more than 3 Mbytes per
// second is busy
// Note: the obscure 512 is to convert blocks/sec to byte/sec,
// and pmie handles the rest of the scale conversion
//
some_inst $disk.ctl.blktotal * 512 > 3 Mbyte/sec
    -> alarm "Busy Disk Controller: " "%i ";
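The disk_burst rule combines a sample range (@0..7) with a percentage quantifier. The Python sketch below models that test for a single disk, assuming the quantifier is true when at least the stated percentage of samples satisfy the condition, as the rule's comment describes. The `percent_sample` helper and the I/O rates are hypothetical.

```python
def percent_sample(samples, percent, predicate):
    """True when at least `percent` percent of the samples satisfy the predicate."""
    if not samples:
        return False
    hits = sum(1 for s in samples if predicate(s))
    # Integer comparison avoids floating-point rounding at the threshold.
    return hits * 100 >= percent * len(samples)

# Hypothetical I/O rates for one disk over 8 consecutive 3-second samples.
io_rates = [12, 75, 8, 64, 20, 15, 9, 11]
burst = percent_sample(io_rates, 25, lambda r: r > 60)
print(burst)
```

Two of the eight samples exceed 60 I/Os per second, which is exactly 25%, so the burst condition holds for this disk.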
Given the -d command-line option, pmie executes in interactive mode, and the user is presented with a menu of options like this:
pmie debugger commands

    f [file-name]   - load expressions from given file or stdin
    l [expr-name]   - list named expression or all expressions
    r [interval]    - run for given or default interval
    S time-spec     - set start time for run
    T time-spec     - set default interval for run command
    v [expr-name]   - print subexpression for %h, %i and %v bindings
    h or ?          - print this menu of commands
    q               - quit

pmie?
If both the -d option and a filename are present, the expressions in the given file are loaded before entering interactive mode. Interactive mode is useful for debugging new rules.
The following sections provide important information for users of pmie.
Performance metrics that are cumulative counters may occasionally overflow their range and wrap around to 0. When this happens, an unknown value (printed as ?) is returned as the value of the metric for one sample (recall that the value returned is normally a rate). You can have PCP interpolate a value based on expected rate of change by setting the PCP_COUNTER_WRAP environment variable.
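The effect of a counter wrap on a rate calculation can be sketched in Python. This is a model under stated assumptions, not PCP's implementation: `None` stands in for the unknown value (?), and the optional `wrap_modulus` parameter is a hypothetical stand-in for a correction that assumes the counter wrapped exactly once at a known width (here 32 bits). The correction PCP actually applies when PCP_COUNTER_WRAP is set may differ.

```python
def counter_rate(prev, cur, interval, wrap_modulus=None):
    """Rate from two counter readings; None models pmie's unknown value (?)."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    delta = cur - prev
    if delta < 0:
        if wrap_modulus is None:
            return None        # counter went backwards: report unknown
        delta += wrap_modulus  # assumption: exactly one wrap occurred
    return delta / interval

normal = counter_rate(100, 160, 30.0)
wrapped_unknown = counter_rate(4294967290, 10, 2.0)
wrapped_adjusted = counter_rate(4294967290, 10, 2.0, wrap_modulus=2**32)
print(normal, wrapped_unknown, wrapped_adjusted)
```

Without a correction, the sample that spans the wrap yields no usable rate; with the assumed correction, the rate is recovered from the adjusted difference.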
The sample interval (delta) should always be long enough, particularly in the case of rates, to ensure that a meaningful value is computed. The appropriate interval varies according to the metric and your needs. A reasonable minimum lies somewhere between ten seconds and several minutes. Although the PMCS supports sampling rates as fast as hundreds of times per second, using small sample intervals creates unnecessary load on the monitored system.
When you specify a metric instance name (#identifier) in a pmie expression, it is compared against the instance name supplied by the PMCS as follows:
If the given instance name and the PMCS name are the same, they are considered to match.
Otherwise, the first two whitespace-separated tokens are extracted from the PMCS name. If the given instance name is the same as either of these tokens, they are considered to match.
For some metrics, notably the per process (proc.xxx.xxx) metrics, the first token in the PMCS instance name is impossible to determine at the time you are writing pmie expressions. The above policy circumvents this problem.
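The matching policy above can be expressed as a short Python sketch. The function name `instance_matches` and the sample instance names are hypothetical; the logic follows the two rules as stated.

```python
def instance_matches(given, pmcs_name):
    """Apply the two-step instance name matching policy described above."""
    # Rule 1: an exact match always succeeds.
    if given == pmcs_name:
        return True
    # Rule 2: compare against the first two whitespace-separated tokens.
    first_two = pmcs_name.split()[:2]
    return given in first_two

# Hypothetical per-process instance name: process ID followed by the command.
pmcs_instance = "01234 sendmail -bd"
print(instance_matches("sendmail", pmcs_instance))
print(instance_matches("-bd", pmcs_instance))
```

Because the second token is also checked, an expression can name a process by its command even though the leading token cannot be known when the rule is written.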
The parser used in pmie is currently not robust in the face of syntax errors. It is suggested that you check any problem expressions individually in interactive mode:
pmie -v -d
pmie? f
expression
Ctrl+D
If the expression was parsed, its internal representation is shown:
pmie? l
The expression is evaluated twice and its value printed:
pmie? r 10sec
Then quit:
pmie? q
It is not always possible to detect semantic errors at parse time. This happens when a performance metric descriptor is not available from the named host when the expression is parsed. A warning is issued, and the expression is put on a wait list. The wait list is checked periodically (about every five minutes) to see if the metric descriptor has become available. If an error is detected once the descriptor is available, a message is printed to the standard error stream (stderr) and the offending expression is put aside.
The GUI tool pmrules may be used to generate pmie rules from templates that are shipped with PCP. These templates are parameterized versions of rules describing common performance scenarios suited for pmie monitoring.
Start pmrules, and choose Import... from the Template menu.
Click the Choose File... button in the “Import template(s) from file” dialog.
Sample templates are installed in the directory /var/pcp/config/pmrules.
Double-click the pcp directory in the pmrules directory browser window.
An “Import template(s) from file” dialog appears, as shown in Figure 6-3.
Select the desired templates, click OK, and return to the pmrules main window, which appears similar to the one shown in Figure 6-4.
Double-click the desired template, and the pmrules Edit template dialog displays, similar to the one shown in Figure 6-5.
At this point you can customize the template by assigning values to the threshold, delta, and holdoff Parameters text boxes, then either selecting one of the predefined Actions, or specifying your own custom user action.
When you are finished customizing the template, click OK and return to the main pmrules window.
Choose Save As from the File menu, and provide a new name for your private copy of the pmrules template file.
Two files are saved. The first one takes the given filename and is your private copy of the pmrules template file. The second file takes the given filename with the suffix .pmie appended and contains the pmie rules; this second file should be given as an argument to pmie.
You can also create new templates for other performance problems. These can then be included in the template collection available to pmrules, and then used to customize instances of the pmie rules for particular hosts.
See the pmrules(1) reference page for a complete description of the capabilities of the pmrules tool.