
It has been suggested that Dimension and TMM are trying to improve the old Iterated SoftVideo technology. This is reasonable given that it was considered obsolete shortly after being introduced.

Since neither company will divulge its research, we can only speculate about what they may be attempting. TMM’s recent enlistment of help from Raytheon, while positive, is also disappointing in that it suggests TMM could not succeed on its own. Clearly, forging ahead is a difficult undertaking.

Dimension appears to be taking a different approach: dropping compression to focus on realtime upscaling. Presumably they hope to find a solution by narrowing the problem domain.

It is worth noting what others have tried. Iterated had millions of dollars and a bright team of engineers, yet they switched to DCT and wavelet techniques. Google has billions of dollars and even more bright engineers, and their VP8 and VP9 codecs are DCT-based. The HEVC consortium has similar resources, and it stayed with DCT as well. Finally, there are countless fractal imaging research papers written by very clever people at places like the University of Waterloo, and in all this time not a single one has led to a commercial product.

I cannot hope to best such efforts, so I only offer some basic ideas.

SoftVideo gutted the PIFS algorithm to achieve realtime decoding, but at the expense of compression ratio and quality. Could restoring PIFS be an option?

PIFS unfortunately does not offer enough compression for typical imagery, and takes too long to decode. As a result, it has been used only for non-realtime upscaling, in products like Genuine Fractals and Perfect Resize. Part of the problem is that PIFS requires serious filtering to look good, which increases the decoding time.

Gaining better compression is hard. Using larger blocks in the quadtree only works for images that have correspondingly large areas of similar color. DCT systems can use larger blocks because they can expand the waveform tables to cover the extra patterning possibilities, but in fractal systems larger blocks make finding block matches exponentially harder.

Using nonsquare (rectangular) blocks might help, as blocks can be better fitted to the imagery. The block shape must then also be encoded in the file, but depending on how blocks are allowed to split, the extra data need not be excessive, so there is some potential here. The decoder must also account for the variable block shape, although this should not be a problem.
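To give a rough sense of how little shape data might be needed, here is a minimal sketch of one possible split scheme. It is only an illustration, not a description of any existing product: I assume each block is either a leaf or is cut exactly in half, horizontally or vertically, so a split costs two bits (a split flag plus a direction flag) and a leaf costs one. The names encode and decode, and the tuple layout of the tree, are hypothetical.

```python
def encode(node):
    """Serialize a split tree to a list of bits.

    A node is None (a leaf block) or a tuple (axis, first, second),
    where axis is 'h' (top/bottom halves) or 'v' (left/right halves).
    This layout is an assumption made for the sketch.
    """
    if node is None:
        return [0]                      # leaf: a single 0 bit
    axis, first, second = node
    direction = 0 if axis == 'h' else 1
    return [1, direction] + encode(first) + encode(second)


def decode(bits, x, y, w, h):
    """Rebuild the leaf rectangles (x, y, w, h) from the bit list."""
    stream = iter(bits)

    def walk(x, y, w, h):
        if next(stream) == 0:           # leaf flag
            return [(x, y, w, h)]
        if next(stream) == 0:           # horizontal cut: top, then bottom
            top = walk(x, y, w, h // 2)
            return top + walk(x, y + h // 2, w, h - h // 2)
        left = walk(x, y, w // 2, h)    # vertical cut: left, then right
        return left + walk(x + w // 2, y, w - w // 2, h)

    return walk(x, y, w, h)


# A 16x16 block cut into a tall left half and two square right quarters.
tree = ('v', None, ('h', None, None))
bits = encode(tree)
blocks = decode(bits, 0, 0, 16, 16)
```

Under these assumptions the three rectangles in the example cost seven bits of shape data. Allowing off-center cuts would fit the imagery more closely but would add bits per split, so the tradeoff would need to be measured.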

If we limit upscaling to blocks that lie on region edges larger than the block itself, then a tiny search of the immediate area can work. This is essentially the Dimension approach. An optimization would be to examine each block for sufficient contrast and to skip the block search if the contrast is too low. This will cause the upscaling quality to vary, but hopefully the absence of upscaling in low-contrast blocks will not be noticeable. The assumption here is that the time spent determining contrast is significantly less than the time spent searching.

On a GPU or FPGA, however, the savings may be moot, because the output frame buffer cannot be released for display until all the shader units have finished executing. Even if only one shader unit needs to do a block search, all the others will be waiting for it to finish. On the other hand, dividing the work into two shader passes might work: one to analyze for contrast and develop a mask, and the other to do block searches for the blocks indicated by the mask.
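The two-pass idea can be sketched on the CPU as follows. This is only a minimal illustration under my own assumptions: the frame is a list of rows of luma values, contrast is measured as the maximum minus the minimum value in a block, and the threshold is arbitrary. The function names are hypothetical, and the block search itself is left out; the second function only produces the list of blocks that the second pass would visit.

```python
def block_contrast(frame, x, y, size):
    """Return max - min luma inside the size x size block at (x, y)."""
    values = [frame[y + j][x + i] for j in range(size) for i in range(size)]
    return max(values) - min(values)


def contrast_mask(frame, size, threshold):
    """Pass 1: flag each block whose contrast reaches the threshold."""
    height, width = len(frame), len(frame[0])
    return [[block_contrast(frame, bx, by, size) >= threshold
             for bx in range(0, width - size + 1, size)]
            for by in range(0, height - size + 1, size)]


def blocks_to_search(mask):
    """Pass 2 input: (row, col) indices of the flagged blocks."""
    return [(r, c)
            for r, row in enumerate(mask)
            for c, flagged in enumerate(row) if flagged]


# A 4x4 frame with one strong vertical edge in the upper-right block.
frame = [
    [10, 10, 10, 200],
    [10, 10, 10, 200],
    [10, 10, 12, 12],
    [10, 10, 12, 12],
]
mask = contrast_mask(frame, size=2, threshold=32)
work = blocks_to_search(mask)
```

Here only one of the four blocks would be searched. Whether the saving carries over to a shader implementation depends on how evenly the flagged blocks can be distributed across the shader units, which this sketch does not address.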

The only downside is that many small high-contrast edges would be left unscaled. A review of the quality will need to wait until I can prototype the necessary software.