To use dmaudit effectively, you must understand how DMF keeps track of copies of a file's data stored on alternate media, how that data is restored to disk when the user accesses the file, and what happens to those copies when the file is modified or removed.
This chapter discusses the following:
The bit file identifier (BFID) is an object that links a migrated file to copies of its data on alternate media (such as tape). The daemon assigns a unique BFID to each file that it migrates.
A BFID consists of an opaque 16-byte value. (Opaque in this context means that the content and format have no fixed definition. The value can be interpreted differently by different processes.)
The BFID represents a unique ID that the daemon inserts into a migrated file's inode and into database entries that point to copies of the file's data. No two migrated files on the same machine should have the same BFID.
For dmaudit purposes, a migrated file is one whose inode contains a BFID. A file that does not have a BFID is a nonmigrated file (often referred to in DMF documentation as a regular file).
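As a rough illustration of this definition, the following C sketch models a BFID as an opaque 16-byte value and treats a file as migrated when its inode carries one. The `sketch_bfid_t` and `sketch_inode_t` types, and the convention that an all-zero value means "no BFID", are assumptions made for the example; they are not taken from the DMF sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BFID_LEN 16  /* a BFID is an opaque 16-byte value */

/* Hypothetical stand-in for the real BFID type. */
typedef struct {
    unsigned char bytes[SKETCH_BFID_LEN];
} sketch_bfid_t;

/* Hypothetical, reduced view of the DMF fields kept in an inode. */
typedef struct {
    sketch_bfid_t bfid;
} sketch_inode_t;

/* Assumption: an all-zero BFID means that no BFID has been assigned. */
bool sketch_bfid_is_set(const sketch_bfid_t *bfid)
{
    static const unsigned char zero[SKETCH_BFID_LEN];

    assert(bfid != NULL);
    return memcmp(bfid->bytes, zero, SKETCH_BFID_LEN) != 0;
}

/* A migrated file is one whose inode contains a BFID. */
bool sketch_is_migrated(const sketch_inode_t *ino)
{
    assert(ino != NULL);
    return sketch_bfid_is_set(&ino->bfid);
}

/* BFIDs are opaque, so two of them are compared byte for byte. */
bool sketch_bfid_equal(const sketch_bfid_t *a, const sketch_bfid_t *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a->bytes, b->bytes, SKETCH_BFID_LEN) == 0;
}
```

Because the value is opaque, the sketch never interprets individual bytes; it only tests for presence and equality.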
A BFID set is the collection of all database entries, special files, and migrated files associated with a particular BFID. The components that make up a BFID set define its state. During the normal course of migrating and moving files, a BFID set can move through five valid states. A BFID set whose components do not match any of these states is considered to be in error.
To understand these BFID-set states, you must understand how DMF uses each of the possible components of a BFID set. “Migration Life-Cycle” describes the possible states and how files move from one state to the next. The remaining sections describe how the DMF daemon handles migration errors, remigration, and modification of files. “DMF BFID State Summary” summarizes the information.
Data migration uses fields within a file's inode to store the BFID, the state of the file, and other information. dmaudit retrieves and reports this information when discrepancies are discovered.
The BFID is assigned by the daemon. When a file's inode contains a BFID, it indicates that the file is under DMF control. This field is used as a key into the daemon database to check for validity of the file against the database.
The state field shows the current migration state of a file. Table 2-1 lists supported states.
Although the daemon is usually responsible for changing the state of a file, an xfsrestore command may also have to do so when restoring a filesystem from tape. The kernel changes a file's state when the file is modified or removed.
Four files make up the DMF daemon database. They reside in the daemon's home directory (HOME_DIR/daemon_name; HOME_DIR is specified in the DMF configuration file; daemon_name is the name of the daemon object in the configuration file). The files are as follows:
The daemon inserts a new entry into its database each time a copy of a file is created. Each database entry contains the same BFID that was stored in the file being migrated.
There are three types of entries in the daemon database, each of which serves a different purpose:
An incomplete database entry. This entry is created when a user file is currently under migration. It represents a database entry with an empty path field in the dbrec structure (shown following this list).
A complete database entry. This entry is created when the MSP or volume group has completed the file's migration and has entered the file's MSP or volume group key into the path field.
A soft-deleted database entry. This entry is created when the delflag field of the entry is nonzero, indicating the time that the deletion occurred. Although the entry physically exists in the database, no corresponding user file should exist for the entry. When soft-deleted entries are physically deleted from the database they are said to be hard-deleted.
The following dbrec structure is used by the daemon to store the contents of one database entry in the dbrec.dat file:
struct dbrec {
    bf_id_t bfid;        /* off line file bfid */
    dev_t   origdv;      /* original device no. */
    ino_t   origino;     /* original inode number */
    off64_t origsz;      /* original file size (in blocks) */
    time_t  otime;       /* original entry time */
    time_t  utime;       /* last update time */
    time_t  ctime;       /* last check time */
    time_t  delflag;     /* delete time */
    int     userid;      /* user id, when archived */
    int     pathlen;     /* length of complete path */
    char    ofilenm[15]; /* original file name (last element) */
    char    proc[8];     /* process name */
    char    path[34];    /* MSP key string */
};
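The three entry types described above can be told apart from the delflag and path fields alone. The following is a minimal sketch of that test; `struct sketch_dbrec` is a reduced, hypothetical copy of the two fields involved, and the enum and function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Reduced, hypothetical copy of the two dbrec fields this sketch needs. */
struct sketch_dbrec {
    time_t delflag;   /* 0 while valid, deletion time once soft-deleted */
    char   path[34];  /* MSP or volume group key; empty until the copy completes */
};

enum sketch_entry_type {
    SKETCH_ENTRY_INCOMPLETE,
    SKETCH_ENTRY_COMPLETE,
    SKETCH_ENTRY_SOFT_DELETED
};

enum sketch_entry_type sketch_classify_entry(const struct sketch_dbrec *rec)
{
    assert(rec != NULL);

    /* Checked first: a soft-deleted entry may be complete or incomplete. */
    if (rec->delflag != 0)
        return SKETCH_ENTRY_SOFT_DELETED;

    /* An empty path field means the migration has not finished. */
    if (rec->path[0] == '\0')
        return SKETCH_ENTRY_INCOMPLETE;

    return SKETCH_ENTRY_COMPLETE;
}
```

The delflag test comes first so that an entry soft-deleted in the middle of a migration is not reported as merely incomplete.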
Each of the fields in a database entry is explained in Table 2-2. The fields that dmaudit uses in its analysis are marked with an X; other fields are not used by dmaudit in its analysis because they are not always the most current information and their data does not help dmaudit solve inconsistencies. However, you may be able to use them as hints to determine why an error occurred. For example, although it is permissible for a file's ownership to change at any time, in practice that seldom happens, so the user ID of the person who owned the file when it originally migrated is often still the most current information.
Field | Used by dmaudit | Description |
|---|---|---|
bfid | X | |
origdv | | Specifies the device number of the filesystem in which the file resided at the time it was migrated. If you change your disk configuration, the device numbers of your filesystems may change, causing origdv to become out-of-date. |
origino | | Specifies the inode number of the migrated file at the time it was migrated. This can also become out-of-date if you dump and restore the filesystem that contains the migrated file. |
origsz | X | Specifies the size of the migrated file in bytes. Because the size of a file cannot change while it is migrated, this field is always current. |
otime | | Specifies the date and time when this database entry was first created. This is sometimes called the entry origination time. Usually it corresponds to the time the file first migrated. |
utime | | Specifies the date and time when any field in this database entry was last changed. |
ctime | | Specifies the origination time when the database entry was created. Although this can be reset by the administrator using the dmdadm command, in normal practice it never changes. |
delflag | X | Specifies whether a database entry is valid or invalid (soft-deleted). This value is initialized to 0 when the database entry is created, and it remains set to 0 as long as the database entry is still valid. When the database entry becomes invalid (for example, if the migrated file was removed), it is soft-deleted by storing the current date and time into delflag in time-stamp format. The database entry cannot be removed immediately because the migrated file might later be restored. You hard-delete the soft-deleted database entries by using the dmhdelete command only when there is no further chance of the file being restored. |
userid | | Specifies the user ID (UID) of the owner of the file at the time the file was migrated. This can become out-of-date if someone uses the chown command to change the ownership of the file after it has migrated. |
pathlen | | Specifies the length of the path field described below. This field is used to regulate the creation of pathseg records to contain the overflow characters of the path that will not fit into the path field of this record. |
ofilenm | | Specifies a null-terminated string composed of up to the first 14 characters of the base name of the file if the pathname was part of the original migrate request. If the pathname was not known at the time the file was migrated, this field contains the string /NONAME. |
proc | X | Specifies the name of an MSP or volume group expressed as a null-terminated character string. This field tells the daemon which MSP or volume group to contact to retrieve a copy of the migrated file. |
path | X | Specifies either a pathname or a key. For MSP and volume group dmfdaemon database entries, it is the key that the MSP or volume group uses to retrieve a copy of the file. For the volume group, the path field is a key into an LS CAT (catalog) database that contains information about the tape that should be mounted to retrieve the copy. It is a null-terminated string. |
The path field of the dbrec base record is a fixed-length field, making the dbrec record a fixed-length record. The path field is supplied by the MSP/LS and can be of any length. The daemon determines the fixed length of the path field of the dbrec record. If the value that the daemon must store in the path field to accommodate the MSP or volume group is longer than the fixed length, the daemon will allocate a sufficient number of pathseg records to hold the overflow path characters.
The pathseg records are keyed from the dbrec bfid and proc fields that uniquely define a dbrec record. When a daemon database record is accessed, any pathseg record path segment extensions are concatenated with the dbrec path field to accurately reconstruct the path value that the MSP or volume group originally supplied to the daemon.
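The concatenation step can be pictured with the following C sketch. The layout of a pathseg record is not shown in this chapter, so `struct sketch_pathseg`, its 8-byte segment size, and the rule that the base field and each segment are filled completely before the next one is used are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_SEG_LEN 8  /* hypothetical segment size, kept small for the example */

/* Hypothetical overflow record holding one slice of a long path value. */
struct sketch_pathseg {
    char text[SKETCH_SEG_LEN];
};

/*
 * Rebuild the full path value from the fixed base field plus its overflow
 * segments. pathlen is the total length of the value, as in dbrec.pathlen.
 * Returns 0 on success, or -1 if out is too small or the segments run short.
 */
int sketch_rebuild_path(const char *base, size_t base_cap,
                        const struct sketch_pathseg *segs, size_t nsegs,
                        size_t pathlen, char *out, size_t out_cap)
{
    size_t copied, i;

    assert(base != NULL && out != NULL);
    if (pathlen + 1 > out_cap)
        return -1;

    /* The first part of the value lives in the fixed-length base field. */
    copied = pathlen < base_cap ? pathlen : base_cap;
    memcpy(out, base, copied);

    /* Append overflow segments in order until pathlen bytes are restored. */
    for (i = 0; i < nsegs && copied < pathlen; i++) {
        size_t n = pathlen - copied;

        if (n > SKETCH_SEG_LEN)
            n = SKETCH_SEG_LEN;
        memcpy(out + copied, segs[i].text, n);
        copied += n;
    }

    if (copied < pathlen)
        return -1;  /* not enough segments for the recorded length */

    out[copied] = '\0';
    return 0;
}
```

A value that fits in the base field needs no segments at all, which is the common case the default field length is meant to cover.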
The path field length of the dbrec structure as it is supplied to DMF has a value of 34, which will accommodate the largest dmatls BFID without requiring any overflow pathseg records. If you are running only dmatls, you should not need to adjust the path field of the dbrec structure.
The procedure for adjusting the path field of the dbrec structure is described in DMF Administrator's Guide for SGI InfiniteStorage.
To understand all the errors and actions reported by dmaudit, you must understand the valid BFID set states. Figure 2-1 shows the five states a BFID set can be in during its lifetime; the arrows connecting the states show which actions cause a BFID set to progress from one state to another.
Recognizing invalid states is the essence of what dmaudit does. It collects all information available for a BFID set and attempts to determine which BFID set state best matches the available data. When this has been determined, dmaudit then reports as inconsistencies any deviations from the ideal BFID-set state and determines what actions are necessary to return the BFID set to a correct state.
The following sections describe file states and state transitions.
A regular file contains no DMF state information. The file's inode contains no BFID and there are no MSP or volume group dmfdaemon database entries corresponding to the file.
When a migration request is issued on a regular file (either through a dmput command or through automated space management), the DMF daemon sets the file state to MIGRATING, assigns a BFID to the file, and creates an incomplete MSP or volume group dmfdaemon database entry containing the BFID and other information relating to the file.
When all copies of the file's datablocks have been written to the associated media of their respective MSPs or volume groups, the file's state is set to DUALSTATE. This indicates that the file is online and has one or more complete backup copies. In this state, the file's inode contains a BFID. In addition, each copy of the file has a corresponding complete entry in the dmfdaemon database.
Similarly, when an offline file is recalled (either explicitly by using a dmget command or automatically by a normal filesystem access to the file data such as a read system call), the file blocks are retrieved from the backup media and placed back into the filesystem and the file's state is changed to DUALSTATE.
If the file's data blocks were released from the filesystem (either due to an automated space management policy or because the owner executed a dmput -r command), the file's state is set to OFFLINE. The file's inode contains a valid BFID, the file's underlying data blocks are released, and a complete MSP or volume group dmfdaemon database entry exists for the file.
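The transitions described so far can be summarized as a small state machine. This is a sketch only: the event names are invented for the example, and the use of an intermediate UNMIGRATING state during a recall is an assumption based on the "Unmigrating" file state listed in Table 2-3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* File states named in this chapter. */
enum sketch_file_state {
    SKETCH_REGULAR,
    SKETCH_MIGRATING,
    SKETCH_DUALSTATE,
    SKETCH_OFFLINE,
    SKETCH_UNMIGRATING
};

/* Hypothetical event names for the actions described in the text. */
enum sketch_event {
    SKETCH_EV_MIGRATE_REQUEST,  /* dmput or automated space management */
    SKETCH_EV_COPIES_WRITTEN,   /* every MSP or volume group finished its copy */
    SKETCH_EV_BLOCKS_RELEASED,  /* dmput -r or automated space management */
    SKETCH_EV_RECALL_REQUEST,   /* dmget or a read of the file data */
    SKETCH_EV_RECALL_DONE       /* data blocks are back in the filesystem */
};

/*
 * Store the follow-on state in *next and return true when the text
 * describes a transition for (cur, ev); otherwise leave *next untouched
 * and return false.
 */
bool sketch_next_state(enum sketch_file_state cur, enum sketch_event ev,
                       enum sketch_file_state *next)
{
    assert(next != NULL);

    switch (cur) {
    case SKETCH_REGULAR:
        if (ev == SKETCH_EV_MIGRATE_REQUEST) { *next = SKETCH_MIGRATING; return true; }
        break;
    case SKETCH_MIGRATING:
        if (ev == SKETCH_EV_COPIES_WRITTEN) { *next = SKETCH_DUALSTATE; return true; }
        break;
    case SKETCH_DUALSTATE:
        if (ev == SKETCH_EV_BLOCKS_RELEASED) { *next = SKETCH_OFFLINE; return true; }
        break;
    case SKETCH_OFFLINE:
        if (ev == SKETCH_EV_RECALL_REQUEST) { *next = SKETCH_UNMIGRATING; return true; }
        break;
    case SKETCH_UNMIGRATING:
        if (ev == SKETCH_EV_RECALL_DONE) { *next = SKETCH_DUALSTATE; return true; }
        break;
    }
    return false;
}
```

Error handling, remigration, and file modification are left out here; they are covered in the sections that follow.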
Sometimes an MSP or volume group may encounter problems that prevent it from completing its copy of a file. For example, the LS could encounter tape errors that prevent it from proceeding. The failing LS then returns an error code to show that it was unsuccessful.
When the daemon receives the error reply, it immediately soft-deletes the MSP's or volume group's dmfdaemon database entry (or entries, if more than one MSP or volume group has an entry). The daemon soft-deletes an entry by storing the current date and time in its delflag field.
The daemon must then void the BFID (remove the BFID from the user file inode). This is done because the user file could not be migrated exactly as requested.
The daemon then sets the file's state to REGULAR.
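The three steps of this error path can be sketched as follows. The `sketch_file` and `sketch_entry` structures are hypothetical reductions of the inode and database entry, an all-zero BFID is assumed to mean "no BFID", and the current time is passed in as a parameter only so that the example stays deterministic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

#define SKETCH_BFID_LEN 16

enum sketch_file_state { SKETCH_REGULAR, SKETCH_MIGRATING };

/* Hypothetical view of the DMF fields in the user file's inode. */
struct sketch_file {
    unsigned char bfid[SKETCH_BFID_LEN];  /* all zero is assumed to mean "no BFID" */
    enum sketch_file_state state;
};

/* Hypothetical reduction of a daemon database entry. */
struct sketch_entry {
    unsigned char bfid[SKETCH_BFID_LEN];
    time_t delflag;  /* 0 while the entry is valid */
};

/* Returns the number of entries that were soft-deleted. */
size_t sketch_abort_migration(struct sketch_file *file,
                              struct sketch_entry *entries, size_t n,
                              time_t now)
{
    size_t i, deleted = 0;

    assert(file != NULL && (entries != NULL || n == 0));

    /* 1. Soft-delete every still-valid entry that carries the file's BFID. */
    for (i = 0; i < n; i++) {
        if (entries[i].delflag == 0 &&
            memcmp(entries[i].bfid, file->bfid, SKETCH_BFID_LEN) == 0) {
            entries[i].delflag = now;
            deleted++;
        }
    }

    /* 2. Void the BFID by removing it from the inode. */
    memset(file->bfid, 0, SKETCH_BFID_LEN);

    /* 3. Return the file to the regular state. */
    file->state = SKETCH_REGULAR;

    return deleted;
}
```

Entries that belong to other BFIDs are left alone, so a failure on one file does not disturb the copies of any other file.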
The process of migrating a dual-state file is sometimes known as remigrating the file. There is only one minor difference between remigrating a file and migrating a file for the first time: the amount of time the migration takes.
When a dual-state file is remigrated, the daemon checks to see whether a copy of the file already exists on each of the requested MSPs and volume groups. If so, the daemon does not have to instruct the MSP or volume group to make a new copy of the file, because the old copy is still valid.
Unless the DMF configuration has changed, a dual-state file always migrates to the same MSPs and volume groups to which it originally migrated. This means that the BFID set only spends a few milliseconds in the incompletely migrated state before advancing to the fully migrated state.
If the dual-state file is being migrated to a new MSP or volume group, a considerable amount of time is spent in the incompletely migrated state while waiting for the new MSP or volume group to make a copy of the file.
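The check that makes remigration fast can be sketched as below: for each requested MSP or volume group, look for a valid, complete entry, and ask for a new copy only where none exists. The structure is a hypothetical reduction of a database entry, and the function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical reduction of a daemon database entry for one BFID. */
struct sketch_entry {
    char   proc[8];   /* MSP or volume group name, null-terminated */
    time_t delflag;   /* 0 while the entry is valid */
    char   path[34];  /* empty until the copy is complete */
};

/* True when a valid, complete entry already exists for the named target. */
static bool sketch_has_valid_copy(const struct sketch_entry *entries, size_t n,
                                  const char *name)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (entries[i].delflag == 0 && entries[i].path[0] != '\0' &&
            strcmp(entries[i].proc, name) == 0)
            return true;
    }
    return false;
}

/*
 * Fill todo[] with the requested targets that still need a copy and return
 * how many there are. todo must have room for nreq pointers.
 */
size_t sketch_copies_needed(const struct sketch_entry *entries, size_t n,
                            const char *const *requested, size_t nreq,
                            const char **todo)
{
    size_t i, count = 0;

    assert(requested != NULL && todo != NULL);
    for (i = 0; i < nreq; i++) {
        if (!sketch_has_valid_copy(entries, n, requested[i]))
            todo[count++] = requested[i];
    }
    return count;
}
```

A return value of 0 corresponds to the fast case, in which no MSP or volume group has to be contacted; any other value corresponds to the slow case, in which the BFID set waits for new copies.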
When a user either modifies or removes a file that contains a BFID, the kernel notifies the daemon by generating an event for the file that the daemon has registered to receive. If the user modified the file, a write event is generated and the daemon will remove the BFID from the inode and change the file's state to REGULAR, because the copies of the file are no longer current. If the user removed the file, a destroy event is generated and the daemon will not need to do anything to the file (because it is no longer present).
In either case, the daemon will soft-delete all remaining database entries for that BFID to indicate that those copies are no longer valid.
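A compact way to picture the two events is the handler sketched below. `struct sketch_bfid_set` is a hypothetical in-memory view of one BFID set, the event names are invented, and the current time is supplied by the caller to keep the example deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

#define SKETCH_MAX_ENTRIES 4  /* arbitrary limit for the example */

enum sketch_event { SKETCH_EV_WRITE, SKETCH_EV_DESTROY };

/* Hypothetical in-memory view of one BFID set. */
struct sketch_bfid_set {
    bool   file_present;                 /* false once the file has been removed */
    bool   file_has_bfid;                /* a file without a BFID is a regular file */
    time_t delflag[SKETCH_MAX_ENTRIES];  /* delflag of each database entry */
    size_t nentries;
};

void sketch_handle_event(struct sketch_bfid_set *set, enum sketch_event ev,
                         time_t now)
{
    size_t i;

    assert(set != NULL && set->nentries <= SKETCH_MAX_ENTRIES);

    switch (ev) {
    case SKETCH_EV_WRITE:
        /* The copies are no longer current: remove the BFID from the inode. */
        set->file_has_bfid = false;
        break;
    case SKETCH_EV_DESTROY:
        /* The file is gone, so only record that fact; there is no inode to update. */
        set->file_present = false;
        set->file_has_bfid = false;
        break;
    }

    /* In either case, soft-delete every entry that is still valid. */
    for (i = 0; i < set->nentries; i++) {
        if (set->delflag[i] == 0)
            set->delflag[i] = now;
    }
}
```

Entries that were already soft-deleted keep their original deletion time in this sketch.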
Table 2-3 shows a summary of the information presented in previous sections.
Table 2-3. State Change Map for DMF
DMF BFID-set state | User file state | MSP or volume group entries | Entries soft-deleted? |
|---|---|---|---|
Incompletely migrated | Migrating | At least one incomplete[a] | No |
Fully migrated | Dual state | All complete | No |
Freed | Offline | All complete | No |
Incompletely unmigrated | Unmigrating | All complete | No |
Voided | [b] | Either complete or incomplete | Yes |

[a] In the incompletely migrated state, complete MSP or volume group entries may also exist.

[b] The file state is regular or the file has been removed.
dmaudit uses the information in this table when it examines all the information available for each BFID set. Any BFID set that exactly matches one of the entries in the appropriate table is considered error-free.
If dmaudit finds a BFID set that does not fit exactly into one of the entries in the table, it is reported as having errors and dmaudit lists the actions necessary to remove the inconsistencies.
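The matching rule can be sketched as a lookup against the rows of Table 2-3. This is one reading of the table, not dmaudit's actual algorithm: the type and function names are invented, "all complete" is assumed to require at least one complete entry, and a regular file with no entries at all is assumed never to be passed in because it does not form a BFID set.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_file_state {
    SKETCH_REGULAR, SKETCH_MIGRATING, SKETCH_DUALSTATE,
    SKETCH_OFFLINE, SKETCH_UNMIGRATING, SKETCH_REMOVED
};

enum sketch_set_state {
    SKETCH_SET_INCOMPLETELY_MIGRATED,
    SKETCH_SET_FULLY_MIGRATED,
    SKETCH_SET_FREED,
    SKETCH_SET_INCOMPLETELY_UNMIGRATED,
    SKETCH_SET_VOIDED,
    SKETCH_SET_INCONSISTENT  /* matches no row of the table */
};

/* Hypothetical tally of the database entries found for one BFID. */
struct sketch_entry_counts {
    size_t incomplete;    /* valid entries with an empty path */
    size_t complete;      /* valid entries with a key in path */
    size_t soft_deleted;  /* entries with a nonzero delflag */
};

enum sketch_set_state sketch_match_table(enum sketch_file_state fs,
                                         const struct sketch_entry_counts *c)
{
    size_t valid;
    int all_complete;

    assert(c != NULL);
    valid = c->incomplete + c->complete;
    all_complete = (c->incomplete == 0 && c->complete > 0);

    /* Voided: the file is regular or removed, and every entry is soft-deleted. */
    if (fs == SKETCH_REGULAR || fs == SKETCH_REMOVED)
        return (valid == 0 && c->soft_deleted > 0)
            ? SKETCH_SET_VOIDED : SKETCH_SET_INCONSISTENT;

    /* Every other row requires that no entry is soft-deleted. */
    if (c->soft_deleted != 0)
        return SKETCH_SET_INCONSISTENT;

    switch (fs) {
    case SKETCH_MIGRATING:
        /* At least one incomplete entry; complete entries may also exist. */
        return c->incomplete > 0
            ? SKETCH_SET_INCOMPLETELY_MIGRATED : SKETCH_SET_INCONSISTENT;
    case SKETCH_DUALSTATE:
        return all_complete ? SKETCH_SET_FULLY_MIGRATED : SKETCH_SET_INCONSISTENT;
    case SKETCH_OFFLINE:
        return all_complete ? SKETCH_SET_FREED : SKETCH_SET_INCONSISTENT;
    case SKETCH_UNMIGRATING:
        return all_complete
            ? SKETCH_SET_INCOMPLETELY_UNMIGRATED : SKETCH_SET_INCONSISTENT;
    default:
        return SKETCH_SET_INCONSISTENT;
    }
}
```

For example, an offline file whose only entry has been soft-deleted matches no row, so the sketch reports it as inconsistent.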