, , , , , , , , , , , , , , ,

Image compression evolved three main approaches.

In the early days, RLE and Lempel-Ziv type methods were applied to pixels arranged in horizontal scanlines. This is because most raw image data is stored as scanlines and these compression methods are simple and operate on simple streams of pixels arranged as such. However, they could not fully exploit data redundancy that occurs in two dimensions, such as large flat areas of color. Instead, such regions would be compressed as multiple “slices”.

JPEG used a DCT (discrete cosine transform) algorithm based on a tiled, or block approach. The image is divided up into a set of nonoverlapping blocks, each usually measuring 8 x 8 pixels. Each block is then broken down into its component wave frequencies using a Fourier transform. A description of the frequencies is smaller than the raw pixel data. The process is fast, operates independantly of other blocks, and needs no large codebooks. It also offers a wide, smooth range of user-selectable quality/efficiency tradeoffs. The method is lossy, but this is not a problem at the final media publishing stage. At the end of the compression, the coefficients of each block’s DCT formula are squeezed further using Huffman (entropy) encoding.

Textures for video games needed to be decoded faster and use smaller blocks, so special methods were made just for them. Efficiency is not great, but the algorithms tend to be very simple and can be executed rapidly by video card hardware.

Fractal image compression is also block-based, but seeks to exploit blocks having a similar appearance, and then encoding their relationship instead of the pixels. There is an assumption that an image will have enough blocks that resemble each other. To help blocks match, they may be flipped and/or rotated, and this spatial transform is also stored in the compressed file. I will go into more detail about fractal compression in a later post.