Chapter 23. Introduction to the Compression Library

The Compression Library, libcl.so, provides a flexible, extensible, and algorithm-independent software interface for compressing and decompressing audio, video, and image data. Developers may also choose to incorporate the licensable built-in interface to third-party audio compression software from Aware, Inc., which is described in Appendix B, “Aware Scalable Audio Compression Software.”

Using the Compression Library (CL) involves three concepts, each of which is discussed in a separate chapter in this part of this guide.


Overview of the Compression Library

Compression is the process of shrinking the size of the data without changing its content significantly. Compact data can be stored more efficiently and can be transmitted faster than raw data. For example, certain compression methods can allow you to store 10 to 20 times as many compressed images in the space required to store a single uncompressed image. Compression extends the capabilities of digital media delivery and storage systems because it encodes data more efficiently.

Compression Library Applications

Compression Library applications are far-reaching. The primary goal of the CL is to improve the data delivery and storage capabilities of applications that use digital media.

The Compression Library can be used with the Audio File Library and with data used by the IRIS MediaMosaic™ tools Movie Player and Movie Maker. Other applications include:

  • Information delivery and storage, including multimedia presentations, publications, interactive training, archiving, and annotation. For example, you can use MoviePlayer as the playback mechanism for an information delivery application. Showcase™ can be used as the base medium from which to launch separate executables of the MoviePlayer to play back prerecorded movies.

  • Telecommunications (video/voice mail, phone, and teleconferencing)

    Compression allows faster transmission of data. This is especially useful when the data rate is limited by the transmission medium. Cost savings can also be realized when transmitting data over a medium where you are billed on the basis of either access time or number of bytes transferred.

  • Animation previewing

    Images can be compressed frame-by-frame, as they are rendered, for previewing 2D and 3D graphics animations in live action before recording to video tape. Previewing saves time for animators because they don't have to render and record a full-data animation to tape every time they want to check the motion sequence.

  • Movie (audio and video) editing

    Movie editing can be done entirely in the digital domain using a tool such as MovieMaker, instead of editing a tape recording. Compression lets you store more data and decompress it as you open files for editing.

Figure 23-1 shows a few of the applications that are possible in a server-client environment.

Figure 23-1. Server-Client Compression Applications


Compression Library Features

The Compression Library features:

  • algorithm independence

  • hardware independence

  • support of industry standard algorithms

  • support of Silicon Graphics proprietary algorithms

  • binary compatibility across Silicon Graphics platforms

Because the CL is algorithm-independent, you need to know only the basic application programming interface (API) to use any of the supplied algorithms. You can query the library for the available algorithms, and you can add your own algorithms to the library. A pass-through capability allows you to pass data through the routines without using an algorithm.

The libcl API provides facilities for working with audio, still images, and sequential frames of data (movies), as well as a buffering mechanism for random access of compressed data.

The buffering facility allows independent buffering of compressed data and decompressed frames, with synchronous or asynchronous access, either external or internal to the library. Separate processes can be used for supplying data, compressing/decompressing, and retrieving data.

The API also uses a set of global state parameters, similar to those found in the Audio Library, libaudio, to establish and manipulate compression attributes.

Compression Library Basics

This section introduces compression technology and compression standards. It provides useful background information that you should know before using the Compression Library.

Lossy versus Lossless Compression Methods

Compressed data isn't always a perfect representation of the original data. Information can be lost in the compression process. A lossless compression method retains all of the information present in the original data. A lossy compression method does not preserve 100% of the information in the original data. Some methods incur more loss than others, so the amount of loss that can be tolerated by your application might affect your decision about which compression method to use.


Note: In general, video compression algorithms are designed to work on camera-generated images. Computer-generated images often contain text and line drawings that compression algorithms can't compress as well as smooth-shaded computer images, which approximate camera video.


Compression Standards

Standards provide a common ground for developers to share technology. Standards for the audio and video industries are constantly being developed and changed in response to new technology. The Compression Library supports these standards through the use of algorithms and parameters.

Compression Library Algorithms

Algorithms are provided within libcl for audio and video standards and for Silicon Graphics proprietary algorithms that have significant benefits. You can query the library for the available algorithms, and you can add your own algorithms to the library. Algorithms are grouped according to the type of data they operate on: still images, motion video, or audio.

Still Image Algorithms

Although any algorithm can be used for still images, the JPEG (Joint Photographic Experts Group)-baseline algorithm, which is referred to simply as JPEG for the remainder of this guide, is the best for most applications.

JPEG is a compression standard for compressing full-color or grayscale digital images. JPEG is most useful for still images; it is usable, but slow when performed in software, for video. You can use the Cosmo Compress option, a hardware JPEG accelerator, in conjunction with the Compression Library for compressing video to and decompressing video from memory or for compressing to and decompressing from a special video connection to Galileo Video, IndyVideo, or Indigo2 video.

JPEG is a lossy algorithm, meaning that the compressed image is not a perfect representation of the original image, but you may not be able to detect the differences with the naked eye.

The amount of compression and the quality of the resulting image are independent of the image data. The quality depends on the compression ratio. The Compression Library lets you select the compression ratio that best suits your application needs.

JPEG is typically used to compress each still frame during the writing or editing process, with the intention of either applying another type of compression to the final version of the movie or leaving it uncompressed. JPEG works better on high-resolution, continuous-tone images, such as photographs, than on crisp-edged, high-contrast images, such as line drawings.

Movie Algorithms

For the best quality in a final movie, all image manipulation and storage should be with uncompressed images until the final movie is produced, at which time the images can be compressed. Repeatedly compressing, decompressing, and then recompressing images reduces the image quality.

The Compression Library supports the following algorithms for motion video compression/decompression:

CL_MPEG_VIDEO 


Moving Picture Experts Group is a standard that is designed for extreme compression of motion video while maintaining high image quality. It is a lossy algorithm that is capable of producing higher compression ratios than both JPEG and MVC1.

MPEG-1 is designed to give the best possible quality for a data rate of 1.2 million bits per second (Mbps) for audio as well as video data. Other data rates are possible.

The quality depends on the sophistication of the encoder. Quality (subjectively evaluated) between VHS and S-VHS can be achieved for images whose frame size is 352 × 240 with the 1.2 Mbps data rate, which is possible to obtain from a CD-ROM in real time.

MPEG is an asymmetric coding technique: compression requires considerably more processing power than decompression, because MPEG examines the sequence of frames and compresses it in an optimized way, including compressing the difference between frames using motion estimation.

The compressed data stream is designed so that the video can be played forward or backward. This makes MPEG well suited for video publishing, where a video is compressed once and decompressed many times for playback.

CL_MVC1 

Motion Video Compressor 1 is a Silicon Graphics proprietary algorithm that is a good general-purpose compression scheme. It is a color-cell compression technique that works well for video, but can cause fuzzy edges in high-contrast animation. MVC1 is a fairly lossy algorithm that does not produce compression ratios as high as JPEG, but it is well suited to movies.

CL_MVC2 

Motion Video Compressor 2 provides results similar to MVC1 in terms of image quality. MVC2 compresses the data more than MVC1, but takes longer to perform the compression. Playback is faster for MVC2, because there is less data to read in, and decompression is faster than for MVC1.

CL_RLE 

8-bit Run Length Encode is a lossless algorithm for compressing 8-bit RGB. It is the only algorithm currently available to directly compress 8-bit RGB data (CL_RGB332). Although this algorithm is lossless, it doesn't save as much space as the other compression algorithms; typically less than 2:1 compression is achieved. The libcl implementation of RLE does not use a standard RLE method. Run-length encoding compresses images by storing a color and its run length (the number of pixels of that color) every time the color changes: a pixel value that is repeated for several pixels in a row is replaced with a single pixel at the first occurrence of that value, followed by a repeat count representing the number of subsequent pixels of the same value. It is a good technique for animations that contain large areas of identical color.
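The run-length idea described above can be sketched in a few lines of C. This is a generic illustration only: the source states that the libcl implementation does not use a standard RLE method, so the (value, count) pair layout and the function names rle_encode and rle_decode are assumptions made for this example, not the CL_RLE stream format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Generic run-length encoder (illustrative; NOT the libcl CL_RLE format).
 * Each run is written as two bytes: the pixel value, then the total
 * length of the run (1..255). Longer runs are split into several pairs.
 * Returns the number of bytes written, or 0 if dst is too small
 * (an empty input also yields 0).
 */
size_t rle_encode(const uint8_t *src, size_t n, uint8_t *dst, size_t cap)
{
    size_t out = 0;
    size_t i = 0;

    assert(dst != NULL);
    while (i < n) {
        uint8_t value = src[i];
        size_t run = 1;

        /* Count how many following pixels repeat the same value. */
        while (i + run < n && src[i + run] == value && run < 255)
            run++;
        if (out + 2 > cap)
            return 0;
        dst[out++] = value;
        dst[out++] = (uint8_t)run;
        i += run;
    }
    return out;
}

/*
 * Inverse of rle_encode. Returns the number of pixels written,
 * or 0 if the input is malformed or dst is too small.
 */
size_t rle_decode(const uint8_t *src, size_t n, uint8_t *dst, size_t cap)
{
    size_t out = 0;

    assert(dst != NULL);
    if (n % 2 != 0)
        return 0;
    for (size_t i = 0; i < n; i += 2) {
        uint8_t value = src[i];
        size_t run = src[i + 1];

        if (run == 0 || out + run > cap)
            return 0;
        for (size_t k = 0; k < run; k++)
            dst[out++] = value;
    }
    return out;
}
```

With this pair layout, the seven pixels 7 7 7 7 9 9 3 encode to six bytes, while an image with no repeated neighbors doubles in size, which is one reason run-length schemes suit images with large flat areas.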

CL_RLE24 

24-bit Run Length Encode is a lossless algorithm for compressing 24-bit RGB.

CL_RTR1 

Real Time Record is a Silicon Graphics proprietary algorithm designed for recording directly from a camera or VTR to disk or digital audio tape (DAT) by compressing on the fly. The quality achieved is dependent upon the processor performance and video hardware that is available.

Audio Algorithms

The Compression Library supports two audio algorithms that are based on international standards:

CCITT/TSB G.711 μ-law 


compresses 16-bit audio to 8-bit audio using a geometric function that takes advantage of the fact that human hearing is more sensitive to differences at lower volume levels. It is designed for rapid compression and decompression at a 2:1 compression ratio.

CCITT/TSB G.711 A-law 


compresses 16-bit audio to 8-bit audio using a different geometric function that takes advantage of the fact that human hearing is more sensitive to differences at lower volume levels. It is designed for rapid compression and decompression at a 2:1 compression ratio.

In America, μ-law compression is generally used. In Europe, A-law is more prevalent.
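To make the 16-bit to 8-bit mapping concrete, the following is a minimal sketch of μ-law companding as it is commonly implemented in software. It is not taken from libcl; the function names linear_to_ulaw and ulaw_to_linear, and the bias and clip constants, follow a widely used G.711 formulation and are assumptions for this example.

```c
#include <assert.h>
#include <stdint.h>

#define ULAW_BIAS 0x84   /* 132: added before the segment search */
#define ULAW_CLIP 32635  /* largest magnitude that still fits after the bias */

/* Compress one 16-bit linear sample to an 8-bit mu-law code. */
uint8_t linear_to_ulaw(int16_t pcm)
{
    int sample = pcm;                      /* widen so -32768 can be negated */
    int sign = (sample < 0) ? 0x80 : 0x00;

    if (sample < 0)
        sample = -sample;
    if (sample > ULAW_CLIP)
        sample = ULAW_CLIP;
    sample += ULAW_BIAS;

    /* Segment number: position of the highest set bit among bits 7..14. */
    int exponent = 7;
    for (int mask = 0x4000; (sample & mask) == 0 && exponent > 0; mask >>= 1)
        exponent--;

    /* Four bits just below the leading bit give the step within the segment. */
    int mantissa = (sample >> (exponent + 3)) & 0x0F;

    /* The code is transmitted with all bits inverted. */
    return (uint8_t)~(sign | (exponent << 4) | mantissa);
}

/* Expand an 8-bit mu-law code back to a 16-bit linear sample. */
int16_t ulaw_to_linear(uint8_t code)
{
    int u = (uint8_t)~code;
    int exponent = (u >> 4) & 0x07;
    int mantissa = u & 0x0F;
    int magnitude = (((mantissa << 3) + ULAW_BIAS) << exponent) - ULAW_BIAS;

    assert(magnitude >= 0 && magnitude <= 32767);
    return (int16_t)((u & 0x80) ? -magnitude : magnitude);
}
```

Small samples fall in low segments with fine steps and loud samples fall in high segments with coarse steps, which is how the scheme spends its eight bits where hearing is most sensitive. A round trip is therefore approximate: encoding 1000 and decoding the result gives 988.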

Compression Library Data Formats

This section provides a brief introduction to digital media formats. It describes the fundamental nature of digital data and introduces some basic terminology that you should know before using the Compression Library.

Many different formats exist for audio, image, and video data. The Compression Library supports the most common formats, but it doesn't restrict you to using one of these formats. In fact, you can define your own unique format to suit your application needs. For example, you can define a file format that contains interleaved frames of audio and video for a movie application, or you can define a file format that contains multiple tracks of audio data for an audio-mixing application.

The following sections describe some of the data formats you are likely to encounter when developing applications that use the Compression Library.

Audio Data Formats

Audio data occurs in a stream, which can be divided into units called blocks. Audio data can be monaural (mono), which has one channel embedded in the audio stream, or stereo, which has two channels embedded in the audio stream. In a stereo audio stream, the left and right channels are interleaved. The Compression Library provides support for both mono and stereo audio. Parameters are used to distinguish between the two data types.

Depending on the original source of the audio, it may have other distinguishing characteristics such as the resolution. See Part II, “Digital Audio and MIDI Programming,” for more information about audio data formats.

Image Data Formats

Image data is contained in a frame. You need to supply the height and width of an image frame when using the libcl routines that compress/decompress image and video data. The ordering of pixels within the frame depends upon the source of the data. Top-to-bottom is the default data orientation for Compression Library routines. You can use the CL_ORIENTATION parameter to specify how pixels are ordered.

The Compression Library works with data that is contained in frames. A frame is defined as a sample at one instant of time so that:

1 audio sample: 

mono 8 bit = 1 byte
mono 16 bit = 2 bytes
stereo 8 bit = 2 bytes
stereo 16 bit = 4 bytes

1 video frame: 

width * height * components * bitsPerComponent/8 = n bytes
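The frame sizes above translate directly into buffer-size arithmetic. The helpers below are a small sketch of that calculation; the names audio_sample_bytes and video_frame_bytes are made up for this example and are not libcl routines.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes in one audio frame: one sample per channel at a single instant. */
size_t audio_sample_bytes(unsigned channels, unsigned bits_per_sample)
{
    assert(bits_per_sample % 8 == 0);   /* whole bytes only in this sketch */
    return (size_t)channels * (bits_per_sample / 8);
}

/*
 * Bytes in one uncompressed video frame:
 * width * height * components * bitsPerComponent / 8.
 * The integer division truncates if the total bit count is not a
 * multiple of 8.
 */
size_t video_frame_bytes(size_t width, size_t height,
                         unsigned components, unsigned bits_per_component)
{
    return width * height * components * bits_per_component / 8;
}
```

For example, a 352 × 240 frame with four 8-bit components occupies 337,920 bytes before compression.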

Video Data Formats

Video data is a stream of sequential frames of image data. Some video formats have special frames called keyframes that contain information for a block of frames that is treated as a single unit. There are a variety of video formats. The Compression Library supports a set of formats for all algorithms.

Video data can be either color or black-and-white. If you are working with video data, you should be familiar with such terms as component video, composite video, chrominance, luminance, and RGBA data.

Implicit color space conversion occurs whenever the specified original format does not match the specified internal format, that is, the format that is compressed directly. Conversion from the original format to the internal format occurs on compression, and conversion from the internal format to the original format occurs on decompression. A different original format can be used on decompression than was used on compression.


Note: The parameter CL_BEST_FIT can be used when compressing to automatically choose the best internal format for a given original format.

The Compression Library supports these video formats:

CL_RGBA 

R, G, B, and A data are 8-bit components packed into the 32-bit word as:

0xAABBGGRR

where:

AA contains the 1-byte alpha value.
BB contains the 1-byte blue value.
GG contains the 1-byte green value.
RR contains the 1-byte red value.

RGBA component values range from 0 to 0xFF (255). For this format, compressionFormat.components should be set to 4.
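As an illustration of the 0xAABBGGRR layout, the following sketch packs and unpacks a pixel with shifts and masks. The helper names are hypothetical and not part of libcl. The layout describes the value of the 32-bit word; the order of the bytes in memory depends on the byte order of the machine.

```c
#include <assert.h>
#include <stdint.h>

/* Pack four 8-bit components into one word laid out as 0xAABBGGRR. */
uint32_t pack_rgba(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)g << 8)  |  (uint32_t)r;
}

/* Extract individual components from a 0xAABBGGRR word. */
uint8_t rgba_red(uint32_t pixel)   { return (uint8_t)(pixel & 0xFF); }
uint8_t rgba_green(uint32_t pixel) { return (uint8_t)((pixel >> 8) & 0xFF); }
uint8_t rgba_blue(uint32_t pixel)  { return (uint8_t)((pixel >> 16) & 0xFF); }
uint8_t rgba_alpha(uint32_t pixel) { return (uint8_t)((pixel >> 24) & 0xFF); }
```

The same helpers apply to CL_RGBX words, where the top byte is simply ignored.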

CL_RGBX 

R, G, B, and X (don't care) data are packed into the 32-bit word as for CL_RGBA. Note that with this format, only the R, G, and B values are compressed.

CL_RGB 

R, G, and B data are packed into a 24-bit word. Note that with this format, the RGB triplets may cross the 32-bit word boundaries.

CL_RGB332 

R, G, and B data are packed into an 8-bit byte as:

0xrrrbbggg

where:

rrr is three bits of red.
bb is two bits of blue.
ggg is three bits of green.

CL_GRAYSCALE 


Four 8-bit luminance bytes are packed in a 32-bit word.

CL_Y 

Equivalent to CL_GRAYSCALE.

CL_YUV 

Three 8-bit components, Y, U, and V, are packed into 24 bits as:

0xUUYYVV

where:

UU contains the chroma-blue value.
YY contains the luminance value.
VV contains the chroma-red value.

CL_YCbCr 

A synonym for YUV[4]. Y is for luminance; Cb (chroma-blue) and Cr (chroma-red) are for chroma.

CL_YUV422 

Two luminance components are packed into a 32-bit word with one U-V pair. In other words, the chroma components are sampled with half of the horizontal rate of the luma, which is known as 4:2:2 sampling. Two pixels are represented by this 32-bit word as (Y1, U1, V1) and (Y2, U1, V1). The order of the components is:

0xU1Y1V1Y2

where:

U1 contains the chroma-blue value.
Y1 contains the first luminance value.
V1 contains the chroma-red value.
Y2 contains the second luminance value.
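The sharing of one U-V pair between two pixels can be sketched as follows. The type yuv_pixel and the functions pack_yuv422 and unpack_yuv422 are hypothetical names for this example, written against the 0xU1Y1V1Y2 word layout given above.

```c
#include <assert.h>
#include <stdint.h>

/* One fully specified pixel after the shared chroma has been expanded. */
typedef struct {
    uint8_t y;
    uint8_t u;
    uint8_t v;
} yuv_pixel;

/* Pack two horizontally adjacent pixels into one 0xU1Y1V1Y2 word. */
uint32_t pack_yuv422(uint8_t y1, uint8_t y2, uint8_t u1, uint8_t v1)
{
    return ((uint32_t)u1 << 24) | ((uint32_t)y1 << 16) |
           ((uint32_t)v1 << 8)  |  (uint32_t)y2;
}

/* Expand one word to the two pixels (Y1, U1, V1) and (Y2, U1, V1). */
void unpack_yuv422(uint32_t word, yuv_pixel out[2])
{
    uint8_t u1 = (uint8_t)((word >> 24) & 0xFF);
    uint8_t y1 = (uint8_t)((word >> 16) & 0xFF);
    uint8_t v1 = (uint8_t)((word >> 8) & 0xFF);
    uint8_t y2 = (uint8_t)(word & 0xFF);

    assert(out != 0);
    out[0].y = y1; out[0].u = u1; out[0].v = v1;
    out[1].y = y2; out[1].u = u1; out[1].v = v1;   /* chroma is shared */
}
```

Two pixels occupy 32 bits instead of the 48 bits that two CL_YUV triplets would need, which is the saving that 4:2:2 sampling provides.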

CL_YUV422DC 


(duplicate chroma) The chroma is subsampled by 2 vertically in addition to horizontally, and is packed the same as CL_YUV422, except that U and V are duplicated on the odd lines. CL_IMAGE_WIDTH must be even when using this format. This format is convenient for storing 4:1:1 sampled data, which is analogous to 4:2:2 sampling with the addition of half-sampling of the chroma vertically. Sometimes 4:1:1 is used to indicate full vertical and one-quarter horizontal sampling.

Table 23-1 shows, for each algorithm that is currently implemented in libcl, the formats that are supported directly, that is, formats that do not require color conversion.

Table 23-1. Video Formats Not Requiring Color Conversion

Algorithm       Format

UNCOMPRESSED    Any format
JPEG            CL_YUV and CL_GRAYSCALE
MVC1            CL_RGBX and CL_GRAYSCALE
MPEG            CL_YUV422DC
RLE             CL_RGB332
RTR1            CL_YUV, CL_YUV422, CL_YUV422DC, and CL_GRAYSCALE


Movie Data Formats

The Compression Library supports the movie formats used by the Movie Maker and Movie Player tools.

Header Formats

Sometimes data is prefaced by a header that contains information about the data. The CL provides routines for extracting header information, which can also contain CL state parameters.

A typical header begins with a start code and a size:

Header Start Code
Header size (in bytes)

followed by parameter-value pairs such as those listed in Table 23-2.

Table 23-2. Parameters Contained in Header Data

Parameter               Information Supplied

CL_ALGORITHM_ID         Algorithm scheme
CL_ALGORITHM_VERSION    Version of the algorithm
CL_INTERNAL_FORMAT      Format of images immediately before compression
CL_NUMBER_OF_FRAMES     Number of frames in the sequence
CL_FRAME_RATE           Frame rate
CL_IMAGE_WIDTH          Width (image and video data only)
CL_IMAGE_HEIGHT         Height (image and video data only)

Other parameters are possible; see Chapter 25, “Using Compression Library Algorithms and Parameters,” for a complete list of available parameters.
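The general shape of such a header, a start code and a size followed by parameter-value pairs, can be illustrated with a small lookup routine. Everything concrete in this sketch is an assumption made for illustration: the 32-bit field width, the size field counting the whole header, the DEMO_ constants, and the function header_find are hypothetical and do not describe the actual CL header layout. In practice, use the header routines that the CL provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* HYPOTHETICAL values for illustration; not the real CL constants. */
enum {
    DEMO_START_CODE   = 0x434C4844,
    DEMO_IMAGE_WIDTH  = 1,
    DEMO_IMAGE_HEIGHT = 2
};

/*
 * Assumed layout, all fields 32-bit:
 *   [start code][header size in bytes][param][value][param][value]...
 * The size is assumed to include the start code and size fields.
 * Returns 1 and stores the value if param is found, otherwise 0.
 */
int header_find(const int32_t *hdr, size_t words, int32_t param, int32_t *value)
{
    assert(hdr != NULL && value != NULL);
    if (words < 2 || hdr[0] != DEMO_START_CODE)
        return 0;
    if (hdr[1] < 8)
        return 0;                        /* too small to hold the two fixed fields */

    size_t size_words = (size_t)hdr[1] / sizeof(int32_t);
    if (size_words > words)
        return 0;                        /* header claims more data than supplied */

    /* Walk the parameter-value pairs that follow the two fixed fields. */
    for (size_t i = 2; i + 1 < size_words; i += 2) {
        if (hdr[i] == param) {
            *value = hdr[i + 1];
            return 1;
        }
    }
    return 0;
}
```

Storing pairs rather than fixed fields lets a reader skip parameters it does not recognize, which fits a library where algorithms and their parameters can be added.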



[4] The video specification of YUV and YCbCr dictates a scale factor for each component when converting between these formats. For convenience, the CL defines them as equal.