Activity 3: Image Types and Formats
File formats
The image I used to compare file sizes is an HDR photograph that I took of the Oblation Plaza overlooking University Avenue at sunset, shown in Fig. 1. Its histogram is shown in Fig. 2. I have already post-processed the image so that it displays properly in print or on non-HDR-capable devices. The file sizes are listed in Table 1. To keep this document small, I will not include the images saved in other file formats when there is no perceivable change in quality or characteristics.
Format | Size (kB) |
---|---|
16-bit BMP | 3418 |
8-bit PNG | 3240 |
24-bit JPG | 1474 |
8-bit TIFF | 1053 |
8-bit GIF | 533 |
JPG (binary) | 391 |
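
To make the comparison in Table 1 easier to read, the sizes can be expressed relative to the largest file. The snippet below is only a small arithmetic sketch: the numbers are copied from the table, while the `relative_sizes` helper and the choice of the BMP file as the baseline are my own assumptions.

```python
# File sizes in kB, copied from Table 1.
sizes_kb = {
    "16-bit BMP": 3418,
    "8-bit PNG": 3240,
    "24-bit JPG": 1474,
    "8-bit TIFF": 1053,
    "8-bit GIF": 533,
    "JPG (binary)": 391,
}


def relative_sizes(sizes, baseline="16-bit BMP"):
    """Return each size as a percentage of the baseline, rounded to 1 decimal."""
    base = sizes[baseline]
    return {name: round(100 * kb / base, 1) for name, kb in sizes.items()}


# Print one line per format, e.g. the JPG comes out at roughly 43% of the BMP.
for name, pct in relative_sizes(sizes_kb).items():
    print(f"{name:<14}{pct:>6.1f}% of BMP")
```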
In order to produce a binary image, I applied a threshold of 192 to Fig. 1. The result is shown in Fig. 3.
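
The thresholding step can be sketched in a few lines of plain Python. This is only an illustration under two assumptions that are not stated above: the image has already been converted to 8-bit grayscale, and pixels strictly above the threshold become white. The `to_binary` function and the sample `patch` are hypothetical stand-ins for the real image.

```python
def to_binary(gray, threshold=192):
    """Map each 8-bit gray value to 255 if it exceeds the threshold, else 0."""
    return [[255 if value > threshold else 0 for value in row] for row in gray]


# Tiny hypothetical 2x4 grayscale patch standing in for Fig. 1.
patch = [
    [10, 191, 192, 193],
    [255, 128, 200, 0],
]

binary = to_binary(patch)
print(binary)
```

With a numpy array the same comparison is usually written as a single vectorized expression, `gray > 192`.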
In order to produce an indexed image, I first imported the image into Photoshop and set it to Indexed Color mode. I then extracted the 256 most common colors and saved them as a Photoshop Color Table (.act). Finally, I applied this color indexing to the image and saved it twice, once as a TIFF and once as a GIF.
The result of the TIFF indexing and its color table are shown in Fig. 4 and Table 2, respectively. Notice that the ground portion of the image still looks decent because the boundaries between colors are fairly well-defined. However, the sky portion exhibits some visible degradation because of the smooth gradients in the original image. The GIF indexing in Fig. 5 exhibits degradation even in the ground portion. I found out that this was due to the difference in the way the colors are sampled. In the former, the sampling of colors is locally adaptive, which allows smoother color transitions, while in the latter, the sampling is uniform.
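
To illustrate why the choice of palette matters, here is a rough sketch that indexes the same pixels with two palettes: one built from the most common colors in the image, and one laid out on a fixed uniform grid. This is not Photoshop's algorithm, and it does not reproduce its locally adaptive sampling; all function names and the 8-pixel `sky` gradient are hypothetical, chosen only to show how a smooth gradient suffers under a uniform palette.

```python
from collections import Counter


def most_common_palette(pixels, n_colors):
    """Image-derived palette: the n_colors most frequent RGB triples."""
    return [color for color, _ in Counter(pixels).most_common(n_colors)]


def uniform_palette(levels):
    """Uniform palette: `levels` evenly spaced values per channel."""
    steps = [round(255 * i / (levels - 1)) for i in range(levels)]
    return [(r, g, b) for r in steps for g in steps for b in steps]


def squared_distance(a, b):
    """Squared Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def index_image(pixels, palette):
    """Replace each pixel with the index of its nearest palette color."""
    return [
        min(range(len(palette)), key=lambda i: squared_distance(px, palette[i]))
        for px in pixels
    ]


def mean_squared_error(pixels, palette, indices):
    """Average squared RGB distance between original and indexed pixels."""
    total = sum(squared_distance(px, palette[i]) for px, i in zip(pixels, indices))
    return total / len(pixels)


# Hypothetical 8-pixel "sky" gradient; a real image has far more colors.
sky = [(200 + i, 120 + i, 60) for i in range(8)]

adaptive = most_common_palette(sky, 8)
uniform = uniform_palette(2)  # 2 levels per channel gives 8 colors

adaptive_error = mean_squared_error(sky, adaptive, index_image(sky, adaptive))
uniform_error = mean_squared_error(sky, uniform, index_image(sky, uniform))
print(adaptive_error, uniform_error)
```

Both palettes hold eight colors, but the image-derived one spends all of them on colors that actually occur in the gradient, while the uniform grid spreads them over the whole RGB cube.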
Index | R | G | B | Color |
---|---|---|---|---|
0 | 74 | 24 | 26 | |
1 | 177 | 168 | 169 | |
2 | 248 | 168 | 184 | |
3 | 232 | 168 | 184 | |
254 | 7 | 7 | 7 | |
255 | 255 | 255 | 255 | |
History
Camera sensors are composed of arrays of light-sensitive detectors, and each component of these detectors has its own sensitivity and transfer function, which convert incident light into the red, green, and blue channels. In the RAW file format, the raw information captured by the sensor is stored without any compression or manipulation. This approach is still used at present, especially by professional camera manufacturers (for example, Nikon’s NEF format and Canon’s CR2 format), in order to preserve all the information received by the camera’s sensor, which provides an advantage when post-processing later on, especially when shooting in low light. As imaging technology developed, so did our ability to reproduce larger images. Eventually, storage capacity became an issue, which gave rise to the challenge of data compression.
One of the first formats to take up this challenge was the Graphics Interchange Format (GIF), introduced in 1987 [1]. It uses the Lempel-Ziv-Welch (LZW) lossless compression algorithm, which provides up to 25% compression. However, GIF is a pixel-dependent, 8-bit indexed format, so images cannot show their full color range in this format. The file format fell out of favor in the mid-1990s when Unisys Corporation, which held the patent on the LZW compression technique, attempted to collect royalties from GIF users [2].
Capitalism strikes again!
In the early 1990s, the Joint Photographic Experts Group, a joint committee of the International Organization for Standardization (ISO) and the International Telecommunication Union (ITU), released the eponymous JPEG format. Its most common incarnation is the 24-bit version, which allocates 8 bits for each color channel, and its strength lies in its lossy discrete cosine transform (DCT)-based compression scheme, which achieves up to 5% compression. The way it is encoded allows the user to set the desired compression level, and for normal day-to-day usage, the data it discards is usually imperceptible.
In 1986, the Tagged Image File Format (TIFF) was developed by Microsoft and Aldus (now merged with Adobe). It takes its name from its heavy dependence on tags, which relay the image information to the program accessing the file. It is highly versatile, supports a wide range of sizes, resolutions, and color depths [2], and can use LZW, ZIP, or other compression methods. However, its large file size, a consequence of its complexity, limits its practical use.
In 1995, the Portable Network Graphics (PNG) format was created as the spiritual successor of GIF and hailed by some as the future of bit-mapped images [2]. Out of all the formats discussed here, it is probably the most flexible and allows for lossless compression, device-independent gamma correction, device-independent brightness consistency, transparency (now better-known as the alpha channel), redundant self-verification of data integrity, interlacing, and various pixel mapping schemes (indexed, grayscale, or true-color RGB).
Programming basic imaging functions
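Some of the libraries that can be used for image processing in Python include the matplotlib, OpenCV (cv2), and PIL modules. Some basic functions include:
- `cv2.imread`: reads an image file from the computer and loads it as an n-dimensional numpy array.
- `matplotlib.pyplot.savefig`: saves the current figure as an image in the specified file format; to save a numpy array directly as an image, `matplotlib.pyplot.imsave` can be used instead.
- `_getexif`: a method of image objects opened with `PIL.Image.open` that extracts the image metadata and returns it as a dictionary; `getexif` is the public equivalent.
- `matplotlib.pyplot.hist`: plots the histogram of an array, such as the flattened pixel values of an image.
- `cv2.COLOR_RGB2GRAY`: conversion code for turning an RGB image into grayscale; it is passed to `cv2.cvtColor`, not to `cv2.imread`. To read a file directly as grayscale, pass `cv2.IMREAD_GRAYSCALE` to `cv2.imread`.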
References
[1] G. Roelofs, History of the Portable Network Graphics format (1999).
[2] R. H. Wiggins, C. Davidson, R. Harnsberger, J. R. Lauman, and P. A. Goede, Image formats: Past, present, and future, RadioGraphics 21 (2001).