Chapter 3. Common Conventions and Arguments

This chapter deals with the user interface components that are common to most of the graphical tools and text-based utilities that make up the monitor portion of Performance Co-Pilot.

Many of the utilities provided with Performance Co-Pilot (PCP) conform to a common set of naming and syntactic conventions for command-line arguments and options. This section outlines these conventions and their meaning. The options may generally be assumed to be honored by all utilities supporting the corresponding functionality.

In all cases, the reference pages for each utility fully describe the supported command arguments and options.

Command-line options are also relevant when starting PCP applications from the desktop using the Alt double-click method. This technique launches the pmrun program to collect additional arguments to pass along when starting a PCP application.

PerfTools Icon Catalog

The conventions and arguments described in this chapter are common to all tools and utilities in the PerfTools icon catalog group, shown in Figure 3-1.

Figure 3-1. PerfTools Icon Catalog Group


Alternate Metric Source Options

The default source of performance metrics is pmcd on the local host. This section describes how to obtain metrics from sources other than the default.

Fetching Metrics From Another Host

The option -h host directs any PCP utility (such as pmchart or dkvis) to make a connection with the pmcd instance running on host. Once established, this connection serves as the principal real-time source of performance metrics and metadata.

Fetching Metrics From an Archive Log

The option -a archive directs the utility to treat the PCP archive logs with base name archive as the principal source of performance metrics and metadata.

PCP archive logs are created with pmlogger. Most PCP utilities operate with equal facility for performance information coming from either a real-time feed via pmcd on some host, or for historical data from a PCP archive log. For more information on archive logs and their use, see Chapter 7, “Archive Logging.”

The base name (archive) of the PCP archive log used with the -a option implies the existence of the files created automatically by pmlogger, as listed in Table 3-1.

Table 3-1. Physical Filenames for Components of a PCP Archive Log

Filename          Contents

archive.index     Temporal index for rapid access to archive contents

archive.meta      Metadata descriptions for performance metrics and instance domains appearing in the archive

archive.N         Volumes of performance metrics values, for N = 0,1,2,...
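The naming scheme in Table 3-1 can be sketched as a small shell function. This is an illustration only: pcp_archive_files is a hypothetical helper, not a PCP command, and the number of data volumes is an assumption supplied by the caller.

```shell
# Hypothetical helper (not part of PCP): print the physical filenames
# implied by an archive base name, following Table 3-1.
#   $1  archive base name
#   $2  assumed number of data volumes (defaults to 1)
pcp_archive_files() (
    base=${1-}
    nvols=${2:-1}
    echo "$base.index"      # temporal index
    echo "$base.meta"       # metadata descriptions
    n=0
    while [ "$n" -lt "$nvols" ]; do
        echo "$base.$n"     # data volume N
        n=$((n + 1))
    done
)
```

The output of such a helper could be fed to a loop that tests each name with [ -f ] before an archive is handed to a tool with the -a option.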

Some tools are able to concurrently process multiple PCP archive logs (for example, for retrospective analysis of performance across multiple hosts), and accept either multiple -a options or a comma-separated list of archive names following the -a option.


Note: The -h and -a options are mutually exclusive in all cases.
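A wrapper script that accepts either a host or an archive might enforce this rule before invoking a tool. The following is a minimal sketch; pcp_source_opt is a hypothetical function name, and the error text is invented for the example.

```shell
# Hypothetical helper (not part of PCP): emit the metric source option
# for a PCP tool, refusing to combine -h and -a.
#   $1  host name, or empty
#   $2  archive base name, or empty
pcp_source_opt() (
    host=${1-}
    archive=${2-}
    if [ -n "$host" ] && [ -n "$archive" ]; then
        echo "error: -h and -a are mutually exclusive" >&2
        exit 1
    fi
    if [ -n "$host" ]; then
        echo "-h $host"
    elif [ -n "$archive" ]; then
        echo "-a $archive"
    fi
    # With neither argument nothing is printed, so the tool falls back
    # to its default source (pmcd on the local host).
)
```

The printed option could then be spliced into a command line, for example pmchart $(pcp_source_opt remotehost "").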


General PCP Tool Options

The following sections provide information relevant to most of the PCP tools. It is presented here in a single place for convenience.

Common Directories and File Locations

The following files and directories are used by the PCP tools as repositories for option and configuration files and for binaries:

/etc/pmcd.conf 

Configuration file for PMCD.

/usr/etc/pmcd 

The PMCD binary.

/etc/config/ 

The pmcd.options file contains command-line options for pmcd.
The pmlogger.options file contains command-line options for pmlogger launched from /etc/init.d/pcp.

/etc/init.d/pcp 

The PMCD startup script.

/usr/sbin 

Contains PCP Graphical Tools such as dkvis, nfsvis, and pmview.

/usr/pcp 

Shareable PCP-specific files and repository directories.

/var/pcp 

Non-shareable (that is, per-host) PCP-specific files and repository directories. There are some symbolic links from the /usr/pcp directory hierarchy pointing into the /var/pcp directory hierarchy.

/usr/pcp/bin 

Contains PCP tools that are typically not executed directly by the end user such as pmbrand, pmnscomp, and pmlogger.

/usr/pcp/doc  

PCP WebMeter documentation shipped in HTML format, suitable for display with a World Wide Web browser.

/usr/pcp/lib 

Contains miscellaneous PCP libraries and executables.

/var/pcp/pmdas 

Contains Performance Metric Domain Agents, one directory per PMDA.

/usr/pcp/pmdas 

An alternate repository for some PMDAs. Certain entries here are symbolic links into /var/pcp/pmdas.

/var/pcp/config 

Contains configuration files for PCP tools, typically with one directory per tool.

/usr/pcp/demos 

Contains demonstration files and programs and the PCP Tutorial.

/var/adm/pcplog 

By default contains diagnostic and trace log files generated by pmcd and PMDAs. Also, the PCP archive logs are managed in one directory per logged host below here.

/var/pcp/pmns 

Contains files and scripts for the Performance Metrics Name Space.

Alternate Performance Metric Name Spaces

The Performance Metrics Name Space (PMNS) defines a mapping from a collection of external names for performance metrics (convenient to the user) into corresponding internal identifiers (convenient for the underlying implementation).

The distributed PMNS used in PCP 2.0 avoids most requirements for an alternate PMNS, because clients' PMNS operations are supported at the PMCD or by means of PMNS data in a PCP archive log. The distributed PMNS is the default, but alternates may be specified using the -n namespace argument to the PCP tools. When a PMNS is maintained on a host, it is likely to reside in the /var/pcp/pmns directory.

Refer to the pmns(4) and pmnscomp(1) reference pages for details of PMNS structure and creation.

Time Duration and Control

The periodic nature of sampling performance metrics and refreshing the displays of the PCP tools makes specification and control of the temporal domain a common operation. In the following sections, the services and conventions for specifying time positions and intervals are described.

Performance Monitor Reporting Frequency and Duration

Many of the performance monitoring utilities have periodic reporting patterns. The -t interval and -s samples options are used to control the sampling (reporting) interval, usually expressed as a real number of seconds (interval), and the number of samples to be reported, respectively. In the absence of the -s flag, the default behavior is for the performance monitoring utilities to run until they are explicitly stopped.

The interval argument may also be expressed in terms of minutes, hours, or days, as described in the PCPIntro(1) reference page.
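To make the unit suffixes concrete, the sketch below converts an interval such as 30min into seconds. It handles only whole numbers and a small, assumed subset of unit spellings; the authoritative syntax is the one described in PCPIntro(1), and interval_to_seconds is a hypothetical name.

```shell
# Hypothetical helper (not part of PCP): convert an interval such as
# "45", "30min" or "1hour" into a whole number of seconds.
# Only integer values and an assumed subset of unit names are handled.
interval_to_seconds() (
    spec=${1-}
    num=${spec%%[!0-9]*}        # leading digits
    unit=${spec#"$num"}         # whatever follows the digits
    if [ -z "$num" ]; then
        echo "bad interval: $spec" >&2
        exit 1
    fi
    case $unit in
        ''|s|sec|secs|second|seconds) mult=1 ;;      # default unit is seconds
        m|min|mins|minute|minutes)    mult=60 ;;
        h|hour|hours)                 mult=3600 ;;
        d|day|days)                   mult=86400 ;;
        *) echo "unknown unit: $unit" >&2; exit 1 ;;
    esac
    echo $((num * mult))
)
```

For instance, interval_to_seconds 30min prints 1800, so a tool started with -t 30min would report once every 1800 seconds.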

Time Window Options

The following options may be used with most PCP tools (typically when the source of the performance metrics is a PCP archive log) to tailor the beginning and end points of a display, the sample origin, and the sample time alignment to your convenience.

The -S, -T, -O and -A command-line options are used by PCP applications to define a time window of interest.

-S duration  

The start option may be used to request that the display start at the nominated time. By default, the first sample of performance data is retrieved immediately in real-time mode, or coincides with the first sample of data in a PCP archive log in archive mode. For archive mode, the -S option may be used to specify a later time for the start of sampling. By default, if duration is an integer, the units are assumed to be seconds.

To specify an offset from the beginning of a PCP archive (in archive mode), simply specify the offset as the duration. For example:

-S 30min 

retrieves the first sample of data at exactly 30 minutes from the beginning of a PCP archive.

To specify an offset from the end of a PCP archive, prefix the duration with a minus sign. In this case, the first sample time precedes the end of archived data by the given duration. For example:

-S -1hour 

retrieves the first sample exactly one hour preceding the last sample in a PCP archive.

To specify the calendar date and time (local time in the reporting time zone) for the first sample, use the ctime syntax preceded by an “at” sign (@). For example,

-S '@ Mon Mar 4 13:07:47 1996' 

specifies the date and time to be used. Note that this format corresponds to the output format of the date command for easy “cut and paste.” However, be sure to enclose the string in quotes so it is preserved as a single argument for the PCP tool.

For more complete information on the date and time syntax, see the PCPIntro(1) reference page.

-T duration  

The terminate option may be used to request that the display stop at the time designated by duration. By default, the PCP tools keep sampling performance data indefinitely (in real-time mode) or until the end of a PCP archive (in archive mode). The -T option may be used to specify an earlier time to terminate sampling.

The interpretation of the duration argument in a -T option is the same as for the -S option, except that an unsigned time interval is interpreted as an offset from the start of the time window, as defined by the default (now for real time, else the start of the archive) or by a -S option. For example, these options define a time window that spans 45 minutes, after an initial offset (or delay) of 1 hour:

-S 1hour -T 45mins

-O duration 

By default, samples are fetched from the start time (see the description of the -S option) to the terminate time (see the description of the -T option). The offset -O option allows the specification of a time between the start time and the terminate time where the tool should position its initial sample time. This option is useful when initial attention is focused at some point within a larger time window of interest, or when one PCP tool wishes to launch another PCP tool with a common current point of time within a shared time window.

The duration argument accepted by -O conforms to the same syntax and semantics as the duration argument for -T. For example, this option specifies that the initial position should be the end of the time window:

-O -0 

This is most useful with pmchart(1) to display the tail-end of the history up to the end of the time window.

-A alignment 

By default, performance data samples do not necessarily happen at any natural unit of measured time. The -A switch may be used to force the initial sample to be on the specified alignment. For example, these three options specify alignment on seconds, half hours, and whole hours:

-A 1sec 
-A 30min 
-A 1hour

The -A option advances the time to achieve the desired alignment as soon as possible after the start of the time window, whether this is the default window, or one specified with some combination of -S and -O command-line options.
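The effect of -A can be pictured as rounding the window start up to the next multiple of the alignment. The sketch below assumes times are held as non-negative whole seconds (for example, seconds since local midnight); align_start is a hypothetical name and the real tools may apply further rules.

```shell
# Hypothetical helper (not part of PCP): advance a start time to the
# first instant at or after it that is a multiple of the alignment.
#   $1  start time in whole seconds (assumed non-negative)
#   $2  alignment in whole seconds (for example 1800 for -A 30min)
align_start() (
    start=$1
    align=$2
    echo $(( (start + align - 1) / align * align ))
)
```

With a window starting at 01:02:05 (3725 seconds) and an alignment of 1800 seconds, the first sample would fall at 01:30:00 (5400 seconds). A start that is already aligned is left unchanged.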

Obviously the time window may be overspecified by using multiple options from the set -t, -s, -S, -T, -A, and -O. Similarly, the time window may shrink to nothing by injudicious choice of options.

In all cases, the parsing of these options applies heuristics guided by the principle of “least surprise”; the time window is always well-defined (with the end never earlier than the start), but may shrink to nothing in the extreme.
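The way -S and -T combine for an archive can be sketched as follows. This is a simplified model under stated assumptions, not the heuristics the tools actually use: all values are whole seconds, offsets are measured from the start of the archive, calendar (@) times and the -O and -A options are ignored, and time_window is a hypothetical name.

```shell
# Hypothetical helper (not part of PCP): resolve -S and -T against an
# archive of known length and print "start end" as offsets in seconds
# from the beginning of the archive.
#   $1  archive length in seconds
#   $2  -S value: unsigned = offset from start, negative = offset from end
#   $3  -T value: unsigned = offset from window start, negative = offset
#       from end of archive, empty = run to end of archive
time_window() (
    len=$1
    s=${2:-0}
    t=${3-}
    case $s in
        -*) start=$((len + $s)) ;;
        *)  start=$s ;;
    esac
    case $t in
        '') end=$len ;;
        -*) end=$((len + $t)) ;;
        *)  end=$((start + $t)) ;;
    esac
    # Keep the window inside the archive and never let it end before it
    # starts; in the extreme it shrinks to nothing.
    if [ "$start" -lt 0 ]; then start=0; fi
    if [ "$end" -gt "$len" ]; then end=$len; fi
    if [ "$end" -lt "$start" ]; then end=$start; fi
    echo "$start $end"
)
```

For a three-hour archive, time_window 10800 3600 2700 models -S 1hour -T 45mins and prints 3600 6300, a 45-minute window that begins one hour in.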

Time Zone Options

All utilities that report time of day use the local time zone by default.

-z  

This option forces times to be reported in the time zone of the host that provided the metric values (the PCP collector host). When used in conjunction with -a and multiple archives, the convention is to use the time zone from the first named archive.

-Z timezone  

This option may be used to set the TZ variable to a time zone string, as defined in environ(5), for example, -Z UTC for universal time.
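The effect of a TZ setting on reported times can be seen with any program that honors the variable. The short sketch below uses the standard date command as a stand-in for a PCP tool; it does not invoke PCP itself.

```shell
# Compare the zone name reported by default with the one reported when
# TZ is set to UTC for a single command (analogous to -Z UTC).
local_zone=$(date +%Z)
utc_zone=$(TZ=UTC date +%Z)
echo "default zone: $local_zone"
echo "with TZ=UTC:  $utc_zone"
```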

PCP Live Time Control

The pmtime dialog is invoked through the PCP tools when you select the Show Time Control option from the Options menu of most PCP tools. The dialog may also be exposed by selecting the “time control state” button at the bottom left-hand corner of the pmchart display or the top left-hand corner of a 3D performance scene displayed with the pmview or oview tools. For more information on the “time control state” button, see the reference pages for pmview(1), pmchart(1), oview(1), or pmtime(1).

If the PCP tool is displaying performance metrics from a real-time source, the pmtime dialog looks similar to that shown in Figure 3-2.

This dialog can be used to set the sample interval and units; the latter may be in milliseconds, seconds, minutes, hours, days, or weeks.

To change the units, select the measurement of time you want from the option menu (labelled Seconds in Figure 3-2).

Figure 3-2. pmtime Live Time Control Dialog


To change the interval, enter the new value in the Interval text box, and press Enter. All PCP tools attached to the pmtime control dialog are notified of the new interval, and will update their displays immediately to reflect the new sampling rate.

Creating a PCP Archive

The ability to start and stop recording of performance activity is available from the pmchart, pmview, and oview windows. See “Creating a PCP Archive From a pmchart Session” for information about the pmchart interface.

Alternatively, use pmlogger directly, as described in Chapter 7, “Archive Logging.”

PCP Archive Time Control

The ability to provide retrospective performance analysis in the PCP framework is provided by making the monitor tools able to deal interchangeably with real-time sources of performance metrics and PCP archive logs. For more information on archive logging, see Chapter 7.

When a PCP tool is displaying performance metrics from a PCP archive log, and the pmtime dialog is exposed, it looks similar to that shown in Figure 3-3.

Figure 3-3. pmtime Archive Time Control Dialog


As with the live pmtime dialog, the user may change the update interval; however, a number of other controls are available:

  • The VCR Controls option menu may be used to change the mode of time advance between Normal, Step, and Fast. In Normal mode, the time advances with the elapsed time per sample being equal to the current Interval (divided by Speed). In Step mode, each selection of one of the direction buttons advances the time by the current Interval. In Fast mode, the time advances by the Interval without any added delay.

  • The Speed text box and associated thumb-wheel may be used to make the rate of time advance in Normal mode either slower (Speed < 1) or faster (Speed > 1) than realtime.

  • The Position text box shows the current time within the PCP archive log. The Position may be changed either by advancing the time using the VCR Controls buttons (Play, Rewind, FastFwd, or Stop), or by modifying the Position text box (and pressing Enter), or by moving the slider below the Position text box.

  • The VCR motion buttons allow time to be advanced forward or backward, or stopped.

The menus of pmtime Archive Time Control provide the following additional features:

File Menu

Hide option 

Hide the dialog; the PCP tools provide their own menu options or time control icon that may be used to re-expose the pmtime dialog.

Options Menu

Timezone option 


Select an alternative time zone for all displayed dates and times; all PCP tools attached to the pmtime control are notified of the new time zone. Because the UTC time zone is universal, it is useful when several archives or live sources of data are being displayed in multiple instances of the tools, and comparisons between performance metrics are required to be temporally correlated. Whenever a new source of metrics is opened, whether an archive or live, the time zone at that source of metrics is added to the list in the option menu. The default time zone is that of the local host where the tool is being run.

Show Bounds... option 


Expose the Archive Time Bounds dialog, which shows the current time window that defines the earliest and latest time for which performance may be displayed from the current archives.

Figure 3-4. pmtime Archive Time Bounds Dialog


Detail option 

For output fields, selectively include or exclude the year in the date or milliseconds in time. The year is shown by default; milliseconds are not.

PCP Environment Variables

The following environment variables are recognized by PCP (these definitions are also available on the PCPIntro(1) reference page):

PCP_COUNTER_WRAP  


Many of the performance metrics exported from PCP agents expect that counters increase monotonically. Under some circumstances, one value of a metric may be smaller than the previously fetched value. This can happen when a counter of finite precision overflows, when the PCP agent has been reset or restarted, or when the PCP agent exports values from an underlying instrumentation that is subject to asynchronous discontinuity. If set, the environment variable PCP_COUNTER_WRAP indicates that all such cases of a decreasing counter should be treated as a counter overflow, and hence the values are assumed to have wrapped once in the interval between consecutive samples. Counter wrapping was the default in versions before the PCP 1.3 release.

PCP_LICENSE_NOWARNING 


Many of the PMAPI client programs require that a valid software license be present on the host on which the client is running (the license is node-locked). In the case that such a valid license is present, but is due to expire within the next 30 days, a message or popup notifier appears informing the user of this condition. These warnings can be disabled by setting this variable in the environment.

PCP_LOGDIR 

Many PCP utilities create diagnostic and trace log files, and the default locations are below the directory /var/adm/pcplog. Setting the variable PCP_LOGDIR overrides the default directory. If PCP_LOGDIR is unset, the variable PCPLOGDIR is treated as an alias and used if set, to provide backwards compatibility with earlier PCP releases.

PCP_STDERR 

Specifies whether pmprintf() error messages are sent to standard error, an xconfirm dialog box, or to a named file; see pmprintf(3). Messages go to standard error if PCP_STDERR is unset or set without a value. If this variable is set to DISPLAY, then messages go to an xconfirm dialog box; see xconfirm(1). Otherwise the value of PCP_STDERR is assumed to be the name of an output file.

PCP_TRACE_HOST  


The pmdatrace library routines use this variable when connecting to the trace PMDA to determine on which host it is running; see pmdatrace(3).

PCP_TRACE_PORT  


This variable is used by both the trace PMDA and client programs using the pmdatrace library to obtain the Internet port through which the client programs and the PMDA communicate; see pmdatrace(3).

PCP_TRACE_TIMEOUT  


When pmdatrace client programs are connecting to the trace PMDA, this variable can be set to specify how long the clients should wait before cancelling their attempt to connect with the PMDA; see pmdatrace(3).

PMCD_CONNECT_TIMEOUT 


When attempting to connect to a remote pmcd on a system that is booting or at the other end of a slow network link, some PMAPI routines could potentially block for a long time until the remote system responds. These routines abort and return an error if the connection has not been established after some specified interval has elapsed. The default interval is 5 seconds. This may be modified by setting this variable in the environment to a larger number of seconds for the desired time out. This is most useful in cases where the remote host is at the end of a slow network, requiring longer latencies to establish the connection correctly.

PMCD_PORT 

This is the TCP/IP port used by pmcd to create the socket for incoming connections and requests. The default is port number 4321, which you may override by setting this variable to a different port number. If a non-default port is in effect when pmcd is started, then every monitoring application connecting to that pmcd must also have this variable set in its environment before attempting a connection.

PMCD_RECONNECT_TIMEOUT 


When a monitor or client application loses its connection to a pmcd, the connection may be reestablished by calling the pmReconnectContext() PMAPI function. However, attempts to reconnect are controlled by a back-off strategy to avoid flooding the network with reconnection requests. By default, the back-off delays are 5, 10, 20, 40, and 80 seconds for consecutive reconnection requests from a client (the last delay is repeated for any further attempts after the last delay in the list). Setting this environment variable to a comma-separated list of positive integers redefines the back-off delays. For example, setting the delays to “1,2” will back off for 1 second, then back off every 2 seconds thereafter.

PMCD_REQUEST_TIMEOUT 


For monitor or client applications connected to pmcd, there is a possibility of the application “hanging” on a request for performance metrics or metadata or help text. These delays may become severe if the system running pmcd crashes or the network connection is lost or the network link is very slow. By setting this environment variable to a real number of seconds, requests to pmcd time out after the specified number of seconds. The default behavior is to wait 10 seconds for a response from every pmcd for all applications.

PMDA_LOCAL_PROC  


If set, then a context established with type PM_CONTEXT_LOCAL has access to the proc PMDA and can retrieve performance metrics about individual processes.

PMDA_LOCAL_SAMPLE  


If set, then a context established with type PM_CONTEXT_LOCAL has access to the sample PMDA if this optional PMDA has been installed locally.

PMDA_PATH 

This environment variable may be used to modify the search path used by pmcd and pmNewContext() (for PM_CONTEXT_LOCAL contexts) when searching for a daemon or DSO PMDA. The syntax follows the syntax for shell PATH: a colon-separated list of directories. The default search path is /var/pcp/lib:/usr/pcp/lib.

PM_LAUNCH_PATH  


A launching tool searches for its script in the directory specified by this variable, rather than /var/pcp/config/pmlaunch; see pmlaunch(5).

PMLOGGER_PORT 


This environment variable may be used to change the base TCP/IP port number used by pmlogger to create the socket to which pmlc instances try to connect. The default base port number is 4330. If used, this variable should be set in the environment before pmlogger is executed. If pmlc and pmlogger are on different hosts, then obviously PMLOGGER_PORT must be set to the same value in both places.

PMNS_DEFAULT  


If set, this value is interpreted as the full pathname to be used as the default PMNS for pmLoadNameSpace(). Otherwise, the default PMNS is located at /var/pcp/pmns/root for base PCP installations.
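Two of the rules above lend themselves to a short illustration: the PCP_LOGDIR fallback to the PCPLOGDIR alias, and the PMCD_RECONNECT_TIMEOUT back-off list. The functions below are hypothetical shell models of the described behavior, written for this example; they are not how the PCP libraries are implemented.

```shell
# Hypothetical model: the log directory is PCP_LOGDIR if set, else the
# older PCPLOGDIR alias if set, else the documented default.
pcp_logdir() (
    echo "${PCP_LOGDIR-${PCPLOGDIR-/var/adm/pcplog}}"
)

# Hypothetical model: delay in seconds before the Nth consecutive
# reconnection attempt (N counts from 1). The last delay in the list is
# repeated for any further attempts.
reconnect_delay() (
    attempt=$1
    list=${PMCD_RECONNECT_TIMEOUT:-5,10,20,40,80}
    IFS=,
    set -- $list                 # split the comma-separated list
    if [ "$attempt" -gt "$#" ]; then
        attempt=$#
    fi
    shift $((attempt - 1))
    echo "$1"
)
```

With PMCD_RECONNECT_TIMEOUT set to 1,2 the model yields 1 for the first attempt and 2 for every attempt after that, matching the example given for that variable.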

Running PCP Tools Through a Firewall

In some production environments, the PCP monitoring hosts are on one side of a TCP/IP firewall, and the PCP collector hosts may be on the other side.

If the firewall service is being provided by a product that supports the sockd (SOCKS) protocols for packet forwarding through the firewall, then the PCP tool pmsocks may be used; see pmsocks(1). Otherwise it is necessary to arrange for packet forwarding to be enabled for those TCP/IP ports used by PCP, namely 4321 (or the value of environment variable PMCD_PORT) for connections to pmcd and a finite range of consecutive port numbers starting at 4330 (or the value of environment variable PMLOGGER_PORT) to allow pmlc connections to pmlogger instances.
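When preparing a forwarding rule set, it can help to list the ports involved. The sketch below prints them from the environment, falling back to the defaults named above. The size of the pmlogger range is not stated here, so it is taken as an argument and is purely an assumption; pcp_firewall_ports is a hypothetical name.

```shell
# Hypothetical helper (not part of PCP): list the TCP/IP ports that
# would need to be forwarded through a firewall.
#   $1  assumed number of consecutive pmlogger ports (defaults to 1)
pcp_firewall_ports() (
    nloggers=${1:-1}
    pmcd_port=${PMCD_PORT:-4321}
    base=${PMLOGGER_PORT:-4330}
    echo "pmcd $pmcd_port"
    echo "pmlogger $base-$((base + nloggers - 1))"
)
```

With neither variable set, pcp_firewall_ports 3 reports port 4321 for pmcd and ports 4330 through 4332 for pmlogger.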

The pmsocks Command

The pmsocks command and its related files and scripts allow Performance Co-Pilot (PCP) clients running on hosts located on the internal side of a TCP/IP sockd firewall system to monitor remote hosts on the other side of the firewall system. The basic syntax is as follows, where tool is an arbitrary PCP application, typically a monitoring tool:

pmsocks tool args 

The pmsocks script prepares the necessary environment variables and then executes the PCP tool specified in tool across the firewall. For example, this command runs dkvis with metrics fetched from remotehost on the other side of the firewall:

pmsocks dkvis -h remotehost 

The configuration file is /etc/pcp_socks.conf, and this is the file where the network-specific information must be set to correspond with your network. Complete information on this customization can be found in the pmsocks(1) reference page.

Transient Problems With Performance Metric Values

Sometimes the values for a performance metric as reported by a PCP tool appear to be incorrect. This is typically caused by transient conditions such as metric wrap-around or time skew, described below. These conditions result from design decisions that are biased in favor of light-weight protocols and minimal resource demands for PCP components.

In all cases, these events are expected to occur infrequently, and should not persist beyond a few samples.

Performance Metric Wrap-Around

Performance metrics are usually expressed as numbers with finite precision. For metrics that are cumulative counters of events or resource consumption, the value of the metric may occasionally overflow the specified range and wrap around to zero.

Because the value of these counter metrics is computed from the rate of change with respect to the previous sample, this may result in a transient condition where the rate of change is an “unknown” value. If the PCP_COUNTER_WRAP environment variable is set, this condition is treated as an overflow, and speculative rate calculations are made. In either case, the correct rate calculation for the metric returns with the next sample.
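The two behaviors can be sketched as a rate calculation over a pair of samples. The model below assumes a 32-bit unsigned counter and integer arithmetic, both chosen only for the example; counter_rate is a hypothetical name, and real tools perform this calculation internally.

```shell
# Hypothetical model: rate of change per second between two samples of
# a counter metric.
#   $1  previous value   $2  current value   $3  interval in seconds
# A 32-bit unsigned counter is assumed for the wrap case.
counter_rate() (
    prev=$1
    curr=$2
    interval=$3
    if [ "$curr" -ge "$prev" ]; then
        echo $(( (curr - prev) / interval ))
    elif [ -n "${PCP_COUNTER_WRAP+set}" ]; then
        # Treat the decrease as exactly one wrap past 2^32.
        echo $(( (curr + 4294967296 - prev) / interval ))
    else
        # Without PCP_COUNTER_WRAP the rate for this sample is unknown.
        echo unknown
    fi
)
```

A counter that moves from 4294967000 to 704 over 10 seconds is reported as unknown by default, and as 100 per second when PCP_COUNTER_WRAP is set.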

Time Dilation and Time Skew

If a PMDA is tardy in returning results, or the PCP monitoring tool is connected to pmcd via a slow or congested network, an error might be introduced in rate calculations due to a difference between the time the metric was sampled and the time pmcd sends the result to the monitoring tool.

In practice, these errors are usually so small as to be insignificant, and the errors are self-correcting (not cumulative) over consecutive samples.

A related problem may occur when the system time is not synchronized between multiple hosts, and the timestamps for the results returned from pmcd reflect the skew in the system times. In this case, it is recommended that either timeslave or timed be used to keep the system clocks on the collector systems synchronized; see timed(1M).