A Review of Fractal Image Compression and Related Algorithms

I wrote this review paper on fractal image compression, denoising, and enlargement back in 2008 while I was at Caltech, and thought I’d lost it until discovering a copy in my backups. Fractal compression is a fascinating area, and I went on to prototype a GPGPU-accelerated image enlarger as my capstone graphics project. It actually predates OpenCL, and I hope to come back to it someday for an overhaul, using this paper for reference.

A Review of Fractal Image Compression and Related Algorithms

Jeremy Ehrhardt – 2008-03-18

Introduction

Fractal image compression is a curious diversion from the main road of raster image compression techniques. Fractal compressors do not store pixel values. Rather, they store a fractal, the attractor of which is the image that was encoded. This representation of an image has a number of useful properties.

The property that drove initial development of the field is that images with fractal properties can potentially be encoded in very few bits if a close match for their generating fractal can be found. Fractal-like images are frequently found in nature, so fractal compression promised very high compression ratios for natural scenes such as clouds, landscapes, forests, etc. Another useful property is the scale independence of fractals. Fractal-encoded images may be decoded at larger or smaller sizes than the original without major distortion.

This review covers major developments unique to fractal image compression. Many published algorithms incorporate general data compression techniques such as vector quantization and entropy coding (as many non-fractal compressors do), but the details are best covered by general data compression literature, and have thus been omitted. There is also some recent work hybridizing fractal and wavelet compression, which is best left to a review of wavelet-based methods, and has also been omitted.

The first problem in the field was showing that a fractal representation could be found for any image. The Collage Theorem established this, and it was followed by the development of an algorithm to generate fractal representations for arbitrary images. That algorithm was slow, so subsequent work focused on improving the speed of the search for fractal representations and on improving the quality of the representations found. The partitioning of the image to be encoded was a common target for improvement, and several of the important schemes are discussed below.

Unfortunately, fractal compression was found to have some problems. Early research did not establish that natural images were actually self-similar to such a degree that fractal compression was the best fit. Additionally, optimal encoding was found to be NP-hard. These difficulties are the reason that fractal codecs are virtually nonexistent in the current graphics universe, having been superseded by other codecs with more general applications and higher speeds.

However, the useful properties of fractal encodings have found some use outside of the performance-sensitive area of compression. Two fractal image-processing algorithms are discussed in the last section; one for image enlargement, and one for noise removal. First, we will discuss the development of fractal image compression, from the initial mathematical development of the field in the late 1980s and the first implementation in the early 1990s, up through the various improvements over the following decades.

The Collage Theorem

Barnsley’s 1985 Collage Theorem [BARNSLEY] p. 89 made fractal image compression possible. It states that, given an iterated function system (IFS, a list of repeatedly applied contractive transforms) and a subset of a complete metric space, if the union of the images of that subset under the IFS’s member transforms is “close” to the subset, then the attractor of the IFS is also “close” to the subset (for a precise definition of “close”). The Collage Theorem thus establishes a guideline for finding fractal approximations to images: work out transforms that take parts of an image to itself. The attractor of those transforms will then be close to the original image.
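
In its standard quantitative form (paraphrased here, not quoted from Barnsley), the theorem reads: if {w_1, …, w_N} is an IFS with contractivity factor s < 1 and attractor A, then for any nonempty compact set T, in LaTeX notation,

    h(T, A) \le \frac{1}{1 - s}\, h\!\Big(T,\ \bigcup_{i=1}^{N} w_i(T)\Big)

where h is the Hausdorff metric. A small “collage error” on the right-hand side therefore guarantees that the attractor is close to T.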

Barnsley presents a number of examples of simple fractals for which the reader is encouraged to figure out suitable IFS representations by hand. Barnsley does not, however, elaborate on how to find such an IFS representation for an arbitrary image.

Beginnings of fractal image compression

The first algorithm capable of generating an IFS fractal representation for an arbitrary raster image was proposed by Barnsley’s graduate student Jacquin [JACQUIN1]. Jacquin’s fractal coding scheme is based on non-overlapping square block partitions of the image being encoded. It defines a “distortion” measure based on the L2 distance between the pixels of two image blocks, as well as a number of contractive linear transformations that act on image blocks. The transformations used include geometric transforms from isometries of the square blocks, as well as pixel value transforms that change the overall luminance of the block being transformed.

Figure 1: Transformations used by Jacquin’s original algorithm [JACQUIN1, Table 1].

Jacquin’s algorithm divides the image to be encoded into “domain” blocks of a given size and classifies them by appearance: shade, midrange, simple edge, or mixed edge. Classifying the blocks in this way reduces the pool of blocks that must be searched in the next step. The algorithm then divides the image into “range” blocks, each containing a quarter as many pixels as a domain block, and iterates through them. For each range block, it searches the domain blocks of the same class, looking for a transformation (or composition of transformations) that maps a domain block to the range block with minimum distortion. (Domain blocks are downsampled to the same number of pixels as range blocks for the distortion calculation.)

Figure 2: The fractal encoding process [JACQUIN2, Fig. 2]. Rᵢ is a range block, D is the domain block pool, and T is the transform pool.

This process produces a list of self-similarity-producing transforms, from regions of the image to other regions of the image, that constitute an IFS whose attractor approximates the original image. If no transformation can be found for some range block with a measured distortion below a given threshold, the range block is partitioned into four equally sized square children, and the search is repeated with the children as range blocks.
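
The core search can be sketched as follows. This is a minimal illustration, not Jacquin’s implementation: it assumes a grayscale image whose dimensions are multiples of 2B, omits the block classification step and the contractivity constraint on the luminance scaling, and brute-forces a non-overlapping domain pool. All names are illustrative.

    import numpy as np

    def downsample2(block):
        # Average 2x2 pixel groups so a 2Bx2B domain block matches a BxB range block.
        return block.reshape(block.shape[0] // 2, 2, block.shape[1] // 2, 2).mean(axis=(1, 3))

    def isometries(block):
        # The eight isometries of a square block: four rotations, each with and without a flip.
        for k in range(4):
            r = np.rot90(block, k)
            yield r
            yield np.fliplr(r)

    def encode(img, B=8):
        # Returns a list of (range corner, domain corner, isometry index, scale, offset).
        H, W = img.shape
        # Domain pool: non-overlapping 2Bx2B blocks, pre-downsampled to BxB.
        domains = [(dy, dx, downsample2(img[dy:dy + 2 * B, dx:dx + 2 * B].astype(float)))
                   for dy in range(0, H - 2 * B + 1, 2 * B)
                   for dx in range(0, W - 2 * B + 1, 2 * B)]
        transforms = []
        for ry in range(0, H, B):
            for rx in range(0, W, B):
                r = img[ry:ry + B, rx:rx + B].astype(float).ravel()
                best = None
                for dy, dx, dom in domains:
                    for k, iso in enumerate(isometries(dom)):
                        d = iso.ravel()
                        # Least-squares fit of the luminance transform r ~ s*d + o.
                        A = np.vstack([d, np.ones_like(d)]).T
                        (s, o), *_ = np.linalg.lstsq(A, r, rcond=None)
                        err = np.sum((s * d + o - r) ** 2)
                        if best is None or err < best[0]:
                            best = (err, (ry, rx), (dy, dx), k, s, o)
                transforms.append(best[1:])
        return transforms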

Reconstructing an image from its IFS representation is a much simpler procedure than the encoding algorithm above. The list of transforms is applied some number of times to an arbitrary starting image. Because the original image is (approximately) the fixed point of the IFS, each iteration brings the working image closer to the original encoded image. The image may be reconstructed at higher or lower quality by increasing or decreasing the number of iterations.
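
Decoding, paired with the encoder sketch above (again illustrative rather than Jacquin’s implementation), is just repeated application of the stored transforms to an arbitrary starting image. Running decode(encode(img), img.shape) for a handful of iterations converges toward img, provided the fitted luminance scalings are contractive.

    import numpy as np

    def decode(transforms, shape, B=8, iterations=10):
        # Start from an arbitrary image; each pass applies every stored transform once.
        img = np.full(shape, 128.0)
        for _ in range(iterations):
            out = np.empty_like(img)
            for (ry, rx), (dy, dx), k, s, o in transforms:
                dom = img[dy:dy + 2 * B, dx:dx + 2 * B]
                dom = dom.reshape(B, 2, B, 2).mean(axis=(1, 3))  # downsample to BxB
                dom = np.rot90(dom, k // 2)                      # reapply the stored isometry
                if k % 2:
                    dom = np.fliplr(dom)
                out[ry:ry + B, rx:rx + B] = s * dom + o          # reapply the luminance transform
            img = out
        return img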

Figure 3: Reconstruction of a fractally encoded image: starting image (a), 1 iteration (b), 2 iterations (c), and 10 iterations (d). [FISHER, Fig. 1.10]

Jacquin [JACQUIN1] demonstrated encoding and decoding on the standard Lena test image. The paper does not report coding speed or memory usage statistics for the test image, so it is impossible to tell whether its fractal coding scheme was competitive with other image compression schemes of the time. Jacquin published a clarified and expanded version of this encoding procedure two years later [JACQUIN2]. Almost all subsequent fractal image compression algorithms are extended versions of it.

Improvements in partitioning

The simple two-level image partitioning scheme used by Jacquin has been a frequent target for replacement in later work. The motivation for changing the partitioning scheme is to reduce the error in the mappings between domain and range blocks, and thus improve the fidelity of the image encoding. Fisher describes two partitioning schemes that improve on Jacquin’s.

Figure 4: Fixed partitioning [WOHLBERG1, Fig. 2a].

Figure 5: Quadtree partitioning [WOHLBERG1, Fig. 2b].

Figure 6: HV partitioning [WOHLBERG1, Fig. 2c].

Figure 7: Delaunay partitioning [WOHLBERG1, Fig. 3c].

Fisher describes quadtree partitioning [FISHER, Ch. 3] as “a ‘typical’ fractal scheme and a good first step for those wishing to implement their own.” A quadtree is a data structure that subdivides 2D space: each node represents a square area, and nodes may be subdivided into four equally sized square child nodes. Quadtree partitioning is an “adaptive” scheme that produces different partitions for different images, based on their content. Fisher’s quadtree algorithm first subdivides the image to be encoded several times, producing a balanced quadtree several layers deep. The leaf nodes of this initial quadtree are range blocks as in Jacquin [JACQUIN2]. Domain blocks are chosen from nodes that are one level closer to the root than the range blocks, and thus four times the area. All domain blocks are compared with each range block, and if no transformation can be found that maps a domain block to the range block under consideration with less than some threshold level of distortion, the range block is subdivided. It may be recursively subdivided several times until a low-distortion mapping from a domain block is found, or a maximum tree depth is reached.
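
The control flow of the recursive subdivision can be sketched as follows. Here best_match is a hypothetical stand-in, supplied by the caller, for the domain-pool search described above; it is assumed to return a (distortion, transform) pair for the given block. This is an outline of the recursion only, not Fisher’s code.

    def encode_quadtree(img, y, x, size, best_match, threshold, max_depth, depth=0):
        # Try to cover this range block with a single domain-to-range mapping.
        err, transform = best_match(img, y, x, size)
        if err <= threshold or depth >= max_depth:
            return [transform]
        # Otherwise split the block into four children and recurse.
        half = size // 2
        mappings = []
        for cy, cx in ((y, x), (y, x + half), (y + half, x), (y + half, x + half)):
            mappings += encode_quadtree(img, cy, cx, half, best_match, threshold, max_depth, depth + 1)
        return mappings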

Jacquin’s original partitioning scheme is nearly a special case of quadtree partitioning with a minimum depth of n and a maximum depth of n+1, where n is the number of subdivisions needed to reach the desired range block size; the levels of the tree closer to the root are simply never represented in the scheme’s data structures.

HV partitioning [FISHER, Ch. 6] is similar to quadtree partitioning in that it involves recursive partitioning of rectangular areas. When a quadtree range block is split, it is always divided into four equal areas. HV partitioning has an additional adaptive element: when an HV-tree range block is split, the orientation (horizontal or vertical) and position of the dividing line are chosen to maximize the difference between average pixel values on either side of the line, with a bias applied to prefer splits that do not create very thin sub-rectangles. The goal of HV partitioning is to create a partition that better corresponds to the structure of the image than a quadtree partition, and thus to use fewer range blocks when encoding the image, improving encoding time and quality.
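
The splitting rule can be sketched as follows; this is one plausible reading of the rule just described, with an edge-distance bias that only approximates Fisher’s exact weighting.

    import numpy as np

    def choose_split(block):
        # Pick the row or column where the difference of average pixel values on
        # either side is largest, damped near the edges to penalize thin strips.
        H, W = block.shape
        best = (-np.inf, None, None)                 # (score, orientation, position)
        for pos in range(1, H):                      # horizontal split after row `pos`
            bias = min(pos, H - pos) / H
            score = bias * abs(block[:pos].mean() - block[pos:].mean())
            if score > best[0]:
                best = (score, 'h', pos)
        for pos in range(1, W):                      # vertical split after column `pos`
            bias = min(pos, W - pos) / W
            score = bias * abs(block[:, :pos].mean() - block[:, pos:].mean())
            if score > best[0]:
                best = (score, 'v', pos)
        return best[1], best[2]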

A drawback of HV partitioning is the greater variety of possible transformations between range and domain blocks, since both are rectangles of essentially arbitrary dimensions, rather than squares with power-of-two dimensions. This increases both search times and the number of bits required to represent a mapping, although the generally lower number of mappings required offsets the latter.

Fisher notes a weakness in both quadtree and HV partitioning: reconstructed images suffer from visible blockiness at all but the highest reconstruction qualities. Fisher’s quadtree partitioning [FISHER, Ch. 3] deals with this by applying a heuristic deblocking filter during image reconstruction. The filter averages pixels on either side of a range block boundary, with pixels on the inside weighted by a factor proportional to the range block’s depth in the quadtree. Fisher describes a similar procedure for blocks in an HV tree [FISHER, Ch. 6]. Blocky artifacts are by no means unique to these two schemes or to fractal image compression in general; block DCT codecs such as JPEG and the MPEG family are also susceptible to blockiness. The JPEG standard does not include a deblocking filter, but some MPEG-family codecs, such as H.264, do.
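
As a rough illustration of the idea (not Fisher’s filter, which weights pixels by the range block’s depth in the quadtree), seam pixels on a fixed block grid can be blended with an illustrative weight w:

    import numpy as np

    def deblock(img, B=8, w=0.25):
        # Blend the two pixel columns/rows on either side of every block seam.
        out = img.astype(float).copy()
        for x in range(B, out.shape[1], B):                 # vertical seams
            left, right = out[:, x - 1].copy(), out[:, x].copy()
            out[:, x - 1] = (1 - w) * left + w * right
            out[:, x] = (1 - w) * right + w * left
        for y in range(B, out.shape[0], B):                 # horizontal seams
            top, bottom = out[y - 1, :].copy(), out[y, :].copy()
            out[y - 1, :] = (1 - w) * top + w * bottom
            out[y, :] = (1 - w) * bottom + w * top
        return out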

One scheme [DAVOINE] not based on rectangular blocks uses a mesh partition made up of triangles, generated by repeated adaptive Delaunay triangulation and triangle splitting. The goal of this scheme is to cover the image in triangles that are as large as possible while keeping the variance of the pixel brightness values inside each triangle below some threshold parameter (triangles meeting this condition are “homogeneous”). Thus, regions with internal brightness boundaries are split up. Delaunay triangulation was chosen to minimize the number of thin triangles, and thus reduce numerical problems when the internal pixels of a triangle are read from the image raster.

Initially, the image is covered with a regular grid of vertices. In the splitting phase, the Delaunay triangulation of the vertices is calculated; then, for every triangle that is not homogeneous, a new vertex is added at the barycenter of that triangle. Splitting is repeated until convergence, or until some iteration limit is reached. In the merge phase, the algorithm removes vertices for which all surrounding triangles have similar pixel value mean and variance. Then a final triangulation is performed. This procedure is used to generate both the domain and range sets of triangles, with domain triangles permitted a greater variance.
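
A sketch of the split phase only, assuming SciPy’s Delaunay triangulation is available; the homogeneity test is the variance threshold described above, while the merge phase and the final range/domain generation are omitted, and all names are illustrative.

    import numpy as np
    from scipy.spatial import Delaunay

    def split_phase(img, vertices, var_threshold, max_iters=10):
        H, W = img.shape
        ys, xs = np.mgrid[0:H, 0:W]
        pixels = np.column_stack([xs.ravel(), ys.ravel()])       # (x, y) per pixel
        for _ in range(max_iters):
            tri = Delaunay(vertices)
            labels = tri.find_simplex(pixels)                    # triangle index per pixel
            new_vertices = []
            for t in range(len(tri.simplices)):
                vals = img.ravel()[labels == t]
                if vals.size and vals.var() > var_threshold:     # not homogeneous: split it
                    new_vertices.append(vertices[tri.simplices[t]].mean(axis=0))
            if not new_vertices:
                break                                            # converged
            vertices = np.vstack([vertices, new_vertices])
        return vertices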

Adaptive triangular partitions generally use fewer blocks than quadtree or HV partitions. An additional advantage of triangular blocks is that inter-block seams do not always line up with pixel boundaries, so there is less visible blockiness in the decoded image. The major disadvantage is that both encoding and decoding require transformations between arbitrarily shaped triangular range and domain blocks; this involves substantially more interpolation than the rectangle-based schemes.

Difficulties of fractal image compression

Fractal image compression rests on the idea that the image being compressed can be modeled as a fractal, so useful fractal compression requires that the image actually have fractal characteristics. Clarke & Linnett [CLARKE] pointed out that while fractal image compression is frequently proposed as a compression scheme for images of nature (such as plants, landscapes, and clouds), it is not necessarily the case that those images have the affine self-similarities required for effective compression by existing fractal compression schemes, or indeed that they have any fractal characteristics at all.

Figure 8: Fractal fern [FERN1].

Figure 9: Real fern [FERN2].

As an example, they observed that there are significant differences between a simple computer-generated fractal fern and a photo of a real fern. A fractal fern is highly idealized and may be represented very efficiently by a fractal model, but a natural fern does not have its perfect regularity, and the self-similarity breaks down at some scales. In particular, the fern leaves resemble the whole fern, but are different structures nonetheless, and can only be approximated by small copies of the whole fern.

Wohlberg & de Jager [WOHLBERG2] examined statistical properties of natural images with regard to fractal image encoding, specifically with regard to the deterministic self-similar fractal representations used by all of the above algorithms. Their conclusions seem to confirm the assertion of Clarke & Linnett: “The form of self-affinity considered here therefore does not appear to represent a more accurate characterization of image statistics than vastly simpler models such as multiresolution autoregressive models.”

It has been proved [MATTHIAS] that finding an optimal fractal encoding for an arbitrary image is NP-hard. Furthermore, it has been proved that algorithms derived from Jacquin’s original Collage Theorem-based algorithm are not guaranteed to approximate optimal encodings. So, at least with currently known encoding methods, there is an unavoidable tradeoff between fast (polynomial-time) encoding and obtaining the highest-quality representation that will fit in a given amount of space. This is a huge disadvantage relative to other image compression methods: for example, JPEG’s block DCT coding processes one fixed-size block of pixels at a time, so its encoding time is a linear function of the number of samples in the original image.

Fractal image processing

Authors focused on image compression have treated the iterative reconstruction process for fractal image encodings as a way to control the quality of a reconstructed image. Polidori & Dugelay [POLIDORI] examined the reconstruction process as a procedure for image enlargement.

The most commonly used algorithms for this purpose are variations on polynomial interpolation of samples (the pixels of the original image). The assumption underlying polynomial interpolation is that the samples are from a smooth continuous function. Polidori & Dugelay made a different assumption: that the function that the image was sampled from is fractal in nature, instead of smooth. Then a fractal encoding of that image would be an approximation of that fractal. Since fractals are scale-independent, the image could then be reconstructed at any size from the encoding.

Polidori & Dugelay first examined fractal image encoding and decoding schemes identical to those developed for image compression, based on nonoverlapping partitions of the image. They found that such schemes tended to produce blocky reconstructed images with undesirable artifacts when used to reconstruct images at a larger size than the original image. They then examined the idea of using overlapped domain blocks, which retain redundant information not desirable for compression applications, but useful for image enlargement. They proposed and tested several methods of recombining the overlapped blocks, with some methods yielding enlargements with visual quality comparable to those obtained through classical interpolation.
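
The scale independence itself is easy to illustrate with the non-overlapping encoder and decoder sketched earlier (this is the basic scheme Polidori & Dugelay found to produce blocky enlargements, not their overlapped-block variant): decoding on a canvas k times larger, with every block coordinate and size scaled by k, yields a k-times-larger reconstruction.

    import numpy as np

    def decode_zoom(transforms, shape, B=8, k=2, iterations=10):
        # Decode the same transform list onto a canvas k times larger.
        H, W = shape[0] * k, shape[1] * k
        Bk = B * k
        img = np.full((H, W), 128.0)
        for _ in range(iterations):
            out = np.empty_like(img)
            for (ry, rx), (dy, dx), iso_i, s, o in transforms:
                dom = img[dy * k:dy * k + 2 * Bk, dx * k:dx * k + 2 * Bk]
                dom = dom.reshape(Bk, 2, Bk, 2).mean(axis=(1, 3))   # downsample 2x2
                dom = np.rot90(dom, iso_i // 2)                     # reapply the isometry
                if iso_i % 2:
                    dom = np.fliplr(dom)
                out[ry * k:ry * k + Bk, rx * k:rx * k + Bk] = s * dom + o
            img = out
        return img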

Another image processing technique that makes use of the scale independence of fractal image encoding is fractal denoising. Additive white Gaussian noise (AWGN) is common in images acquired from noisy sensors or transmitted through noisy communications channels. Images containing this kind of noise can be modeled as a series of samples where each sample is the sum of the value from the noise-free original and a value drawn from a Gaussian distribution. This noise model has no self-similar structure across scales, so one might expect fractal image encoding to represent AWGN poorly.

In fact, this is the case: the encoding cannot capture the noise, so much of it is discarded during reconstruction, and Ghazel, Freeman & Vrscay [GHAZEL] noticed that “straightforward fractal-based coding performs rather well as a denoiser”. This motivated the development of their denoising algorithm, which goes a step further and statistically estimates a fractal encoding of the noise-free image from a fractal encoding of an AWGN-contaminated noisy image.

In the first step of the algorithm, the noisy image is examined for areas with nearly uniform pixel values. The differences in value from pixel to pixel in such regions are likely due to noise, so the variance of the AWGN can be estimated from the variance of these regions. The image is then encoded as a series of transformations from domain blocks to range blocks in the usual fashion. (Adaptive quadtree partitioning is used, as it results in the best reconstructed image quality.) Each transformation in the encoding that affects pixel values (“gray-level” transforms, as opposed to “geometric” transforms) is adjusted using the previously estimated variance according to a simple relation. Finally, the image is reconstructed from the modified fractal encoding.
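
A sketch of the first step only, under the same assumptions as the earlier examples (grayscale image, illustrative block size and quantile); the gray-level transform adjustment itself is specific to [GHAZEL] and is not reproduced here.

    import numpy as np

    def estimate_noise_variance(img, B=8, quantile=0.1):
        # The flattest blocks of a noisy image have almost no signal variation,
        # so their variance approximates the variance of the AWGN itself.
        variances = []
        for y in range(0, img.shape[0] - B + 1, B):
            for x in range(0, img.shape[1] - B + 1, B):
                variances.append(img[y:y + B, x:x + B].var())
        variances = np.sort(variances)
        flat = variances[:max(1, int(quantile * len(variances)))]   # flattest fraction
        return float(flat.mean())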

Figure 10: Comparison of fractal denoising and Lee filtering [GHAZEL].

Ghazel, Freeman & Vrscay report that fractal denoising is competitive with or superior to Lee filtering, which is a common locally adaptive linear denoising algorithm. Furthermore, fractal denoising is more likely to outperform Lee filtering as the variance of the AWGN increases, making the fractal method an attractive choice for processing very noisy images.

Conclusion

A review of fractal image compression cannot avoid stating this: despite years of improvements, such as the partitioning schemes discussed above, fractal image compression is not competitive with current block DCT or wavelet methods. The speed and quality issues noted above have prevented broad use of fractals for storage of compressed images.

However, fractal image processing techniques have found some application: fractal image enlargement was eventually commercialized as a product known as Genuine Fractals [GENUINEFRACTALS] which is well known within the desktop publishing industry as a high-quality image scaler suitable for making large prints from digital photos.

References

BARNSLEY

Barnsley, M. F. (1988). Fractals Everywhere, Academic Press Inc., US.

JACQUIN1

Jacquin, A. E. (1990). A novel fractal block-coding technique for digital images. Proceedings of ICASSP-90, 1990 International Conference on Acoustics, Speech, and Signal Processing.

JACQUIN2

Jacquin, A. E. (1992). “Image coding based on a fractal theory of iterated contractive image transformations.” Image Processing, IEEE Transactions on 1(1): 18-30.

JACQUIN3

Jacquin, A. E. (1993). “Fractal image coding: a review.” Proceedings of the IEEE 81(10): 1451-1465.

CLARKE

Clarke, R. J. and L. M. Linnett (1993). “Fractals and image representation.” Electronics & Communication Engineering Journal 5(4): 233-239.

DIETMAR

Saupe, D. and R. Hamzaoui (1994). “A review of the fractal image compression literature.” SIGGRAPH Comput. Graph. 28(4): 268-276.

FISHER

Fisher, Y. (1995). Fractal image compression: theory and application, Springer-Verlag London, UK.

POLIDORI

Polidori, E. and J. L. Dugelay (1995). Zooming using IFS. NATO ASI Conf. Fractal Image Encoding and Analysis, Trondheim.

DAVOINE

Davoine, F., M. Antonini, et al. (1996). “Fractal image compression based on Delaunay triangulation and vector quantization.” Image Processing, IEEE Transactions on 5(2): 338-346.

MATTHIAS

Ruhl, M. and H. Hartenstein (1997). Optimal Fractal Coding is NP-Hard. Proceedings of the Conference on Data Compression, IEEE Computer Society.

WOHLBERG1

Wohlberg, B. and G. De Jager (1999). “A review of the fractal image coding literature.” Image Processing, IEEE Transactions on 8(12): 1716-1729.

WOHLBERG2

Wohlberg, B. and G. de Jager (1999). “A class of multiresolution stochastic models generating self-affine images.” Signal Processing, IEEE Transactions on 47(6): 1739-1742.

GHAZEL

Ghazel, M., G. H. Freeman, et al. (2003). “Fractal image denoising.” Image Processing, IEEE Transactions on 12(12): 1560-1578.

GENUINEFRACTALS

onOne Software, Inc. “Genuine Fractals 5.” Retrieved 2008-03-17, from http://www.ononesoftware.com/products/genuine_fractals.php.

FERN1

Mihályi, A. “Fractal fern.” Retrieved 2008-03-17, from http://en.wikipedia.org/wiki/Image:Fractal_fern1.png.

FERN2

“Olegivvit”. “Leaf of fern.” Retrieved 2008-03-17, from http://en.wikipedia.org/wiki/Image:Fern-leaf-oliv.jpg.
