Arnaud Jacquin, a student working with Michael Barnsley, added a block scheme to Barnsley’s original fractal image compressor. This let the compression be fully automated, improving the encoding time (although encoding would still take a long time compared to other methods). The scheme is generally known as the partitioned iterated fractal system, or PIFS for short. It is described in US patent 5065447, which has expired.
Side note: Dimension patents 8351509 and 8639053 (and hence the Iterated VDK and SoftVideo/Trudef) are not based on this scheme (they do not even reference the 5065447 patent). We will explore the reasons and consequences later.
What seems magical about PIFS is how it decodes the original image from data that contains no pixel information at all. As we shall see, the algorithm is very simple and the magic is easily explained.
First, we divide the original image into a set of equal-sized, nonoverlapping blocks. We will call these range blocks. For our purposes we will use a block size of 4 x 4 pixels. Thus our 512 x 512 Lena image will measure 128 x 128 range blocks.
Next, for each range block, we search the image for a block which is twice as large (8 x 8) but looks similar to the range block. We call this sought-after block the domain block. As we search, it does not matter if a domain block overlaps other domain blocks we are considering.
The choice of block size for both domain and range is arbitrary, but normally we look for a domain block larger than the range block. The difference in size is crucial to the fractal interpolation effect during decoding, which is what makes the so-called “resolution independant” effect possible.
As we search, we can flip and/or rotate the domain block to improve our chances of finding a match. We keep track of how well the block compared, and when the amount of error drops below a predefined threshold, we accept the match and stop searching. We could search all the domain blocks and take the very best one, but that would also be much slower. In our example, keep in mind that we have 16,384 range blocks, each of which could be compared with over 250,000 domain blocks (and this is just for a 512 x 512 image).
Once we have our domain block, its location, brightness difference, and spatial transform are placed into an item called a fractal code, and this code is written to the compressed file. Then we repeat the search for the next range block, until all the range blocks are associated with a domain block. When we are finished compressing, every range block will be described by a fractal code.
To decode the compressed file, we need to process each fractal code to reconstruct the pixels of the original range blocks. Since there is one code for each range block, we inherently know where each range block is.
We start decoding by setting up a pixel frame buffer and clearing it to flat gray. The actual color could be anything — even a complex image if we wanted — but 50% gray is a good overall choice. As we decode, we read from this buffer and write to a second buffer. When all the codes have been processed, the contents of the second buffer are copied to the first buffer, and we go through all the fractal codes again. This iterating of the codes is where the “iterated” part of the scheme comes from.
Each fractal code tells us where the domain block is and how much to brighten or darken the pixels from the first buffer as we copy them to the second buffer. The code also tells us if the pixels should be flipped and/or rotated. By using a simple converging formula to compute pixel brightness, we guarantee that no matter how many iterations we apply, the range blocks in the second buffer (and therefore the final image) will stabilize to the correct color.
On the first iteration, the second buffer will look very pixellated, because all we will do is shift the brightness of flat gray pixels. The pictures here use 8 x 8 blocks so that you can see the effect more easily.
However, we copy the second buffer back to the first, and now the second iteration looks like this:
The pixellation is reduced, and after several more iterations, the original image appears:
To zoom into the image, all we need to do is to increase the resolution of the frame buffers, scale up the pixel size of both the range and the domain blocks, and iterate a few more times. Since the iteration process produces smaller pixels at each step, we can always have sharp pixels regardless of the decoded image resolution. All we have to do is keep iterating.
Is this resolution independance? Yes and no. Technically yes; we are obtaining sharp pixels. Then again, so does every other upscaling technique (they just look blurrier).
Fractal interpolation is precisely that: the extra pixels are computed by a brightness adjustment derived from the original image, so no new real information can ever be introduced. Zooming onto a picture of skin does not reveal fine hairs and pores; faces in a distant crowd do not expand into eyelashes and teeth, numbers on licence plates do not become more legible. Features become cleaner, but not more detailed. Below is a comparison of an 8x fractal zoom (using 3 x 3 blocks) and an 8x bicubic resampling:
How about a 4x zoom versus bicubic?
What if we blur the fractal zoom to not look so blocky? The result, interestingly, is pretty close to the bicubic resampling:
As we can see, extra detail pixels get added during a fractal zoom, but they do not add any useful information. Blocks are applied in terms of color matching without any thought as to how to optically increase detail. Large curved edges such as the bottom of the lady’s nose fare pretty well, but her eyes gain “details” that do not resemble what one would see by actually looking closer.
There are some other limitations of the algorithm. First, the edges of range blocks are noticeable (although this can be lessened by filtering them).
Second, image quality worsens as larger block sizes are used, because the number of domain blocks shrinks geometrically. This greatly hampers the use of large blocks to achieve higher compression. Despite what may seem like a large number of domain blocks, it is a tiny fraction of the possible pixel arrangements in a range block, so a lot of block matches never approach high quality.