Understanding Formats and Conversion Issues
3/4/98
J. L. Mohler
A couple of years ago, surfing the web would have revealed a myriad of file formats. If you were part of the “in-crowd” who found this funny little piece of software called Mosaic and installed it, you would likely have found many, many file formats that couldn’t be viewed without the aid of a helper application. Today’s web is a little more “civilized.” The majority of graphics files on the web are GIF and JPEG, with PNG quickly catching up. Yet you may still find a straggling TIFF, BMP, PICT, or EPS. At this point you may be asking yourself, “What are all these file format acronyms?” That's what this article is for.
Using GIF and JPEG Formats
Although the web is designed to be open to all graphics formats, the formats that can be delivered and received depend upon the client as well as the server. An even bigger issue is whether a particular format is an open standard or a commercial standard. Open standards (formats) can be used without regard to copyright or patent considerations. Commercial standards, such as GIF and some varieties of the JPEG format, have patent or copyright considerations that must be taken into account. As the developer of a page, program, or application, you become liable for fees associated with commercial standards. This is the main reason for the development of open standards such as the new Portable Network Graphics (PNG) format.
The GIF format can be interpreted directly by the browser. It supports up to 256 colors (8-bit color data) as well as a special layer of data called the transparency layer. Realize, however, that the GIF format was developed by CompuServe as a means of distributing display-quality images over its online service. GIF files were never intended for print, and this is the main reason that they support only 256-color image data. They were designed to be lightweight files that could be exchanged electronically.
Aside: Remember that at the time the GIF format was developed, 300 BPS was the common modem transfer speed, and even the GIF format taxed this extremely underdeveloped technology.
However, the GIF format uses a proprietary compression scheme which has caused quite a patent skirmish over the past couple of years. In short, Unisys Corporation, holder of the patent on the LZW compression scheme used in the GIF format (developed by CompuServe Information Services), announced that it would pursue patent fees from software developers using the format.
This "problem" has led to the development of a new file format, Portable Network Graphics (PNG), which is aimed at quickly replacing both the GIF and JPEG formats. Later in this chapter you’ll read more about the PNG format.
Even though the GIF format is pretty common, there are actually two different flavors of it: 87a and 89a, presumably named after the years in which they were developed. The one of most interest to web developers is GIF 89a, which supports transparency. Much like the masking feature of Photoshop, the transparent GIF format designates a single color in the image as transparent. When the image is displayed in the browser through the <IMG> tag, background elements appear in place of the assigned transparent color value.
As the alternative to the low color limit of the GIF format, the JPEG format presents a standard that excels at delivering photo-realistic images. Note, however, that JPEG delivers rasterized vector images poorly. Use the JPEG format only for high-color raster images that have not been palettized or reduced in color depth. Probably the biggest advantage of JPEG is its lossy compression scheme, which allows high-quality images to be delivered with close to a 10:1 space savings compared to GIF.
Progressive Images & Interlacing
To enable a progressive download, the image must be digitally recorded or saved in a special way. The interlace option in a graphic file format causes the data to be saved non-sequentially. Rather than storing each line of pixels as it appears from top to bottom in the image, interlacing first stores lines at a fixed interval, for example every 8th line. So rather than storing line 1, 2, 3 and so on, an interlaced file stores line 1, line 9, line 17 and so on. It then returns to the top of the image and stores the lines that were skipped, again at an interval, over one or more further passes, as shown in Figure 2. This way the image can be progressively drawn as it is downloaded.
The JPEG format has a progressive variant as well. Many of the latest imaging applications, such as Adobe Photoshop 4.0, support this new rendition of the original JPEG format. The progressive JPEG is quite impressive, but data loss can still be a negative since it uses lossy compression.
Create GIF and JPEG Graphics
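To make the interlaced storage order from the previous section concrete, here is a minimal Python sketch of the row order used specifically by interlaced GIF files. An interlaced GIF makes four passes over the image: every 8th row starting at the top row, every 8th row starting four rows down, every 4th row starting two rows down, and finally every 2nd row starting one row down. The function name is made up for this illustration, and rows are numbered from 0 rather than 1.

```python
def gif_interlace_order(height):
    """Return 0-based row numbers in the order an interlaced GIF stores them."""
    # (first row, step) for each of the four GIF interlace passes
    passes = [(0, 8), (4, 8), (2, 4), (1, 2)]
    order = []
    for start, step in passes:
        order.extend(range(start, height, step))
    return order


# Storage order for a hypothetical 16-row image
print(gif_interlace_order(16))
# → [0, 8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15]
```

After the first pass a browser already has one row in eight spread over the whole image, which is why a rough preview can appear long before the download finishes.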
1. Save the image as a high-resolution TIFF or PSD file first. Remember that JPEG’s lossy compression will lose some data. Saving a high-quality version allows you to go back to the original image if modifications are necessary.
2. Verify that the image you are saving is a 24-bit image using the Image | Mode menu. If the image is an 8-bit image, use the GIF or PNG format instead of the JPEG format.
3. Choose the Save As or Save option from the File menu.
4. Set the Save As drop-down menu to JPEG format.
5. When prompted with the JPEG Options dialog box (shown in Figure 3), accept the default compression settings. You can test several lossy settings to get a desirable file size. However, remember that the “smaller” the file, the more data is lost. If you want the image to be progressive, select that option in the dialog box as well.
Using the Portable Network Graphics (PNG) Format
One of the most noteworthy recent developments in the world of web graphics is this new graphics format. The Portable Network Graphics (PNG, pronounced “ping”) format has some very distinct advantages over both the JPEG and GIF formats and seeks to better standardize the graphics found on the web while also being legally open. The PNG format supports indexed color up to 256 colors, true-color images, progressive display, transparency, and automatic lossless compression. An additional feature is the use of pre-compression filters, which prepare the image data for optimal compression. In general there are five filter types that can be applied to data within the image. Note that the filters used with the PNG format are applied to the bytes that make up the image, and not the pixels or their colors. In addition, each filter works across the individual scanlines that make up the image. Keep in mind that the biggest reason for the introduction of the PNG format is to eliminate many of the problems associated with the LZW compression scheme.
In addition, certain optional characteristics of JPEG could also lead to similar proceedings. The biggest reason for the introduction of the PNG format is the need for an open standard (format). The PNG format also includes several features that make it more advantageous than JPEG or GIF: it supports RGB color images up to 48 bits, full masking (alpha channels), and image gamma information.
It will be interesting to see how quickly this format catches on. The big two (Netscape and Explorer) currently support the new format. In addition, many of the latest image editors allow the developer to generate files in the new format. Note that if you are using an older browser, you can probably get a plugin that will allow your browser to view PNG images.
Creating a PNG File
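Before choosing a filter, it may help to see what a pre-compression filter actually does to the bytes of a scanline. The following Python sketch is an illustration only, not the code of any particular image editor. It applies the PNG "Sub" filter, which replaces every byte with its difference from the byte one pixel to its left, and then compares compressed sizes. The test image, the function names, and the use of Python's zlib module (which implements the deflate method PNG uses) as a stand-in for a full PNG writer are assumptions made for the example; a real PNG file also stores a filter-type byte in front of each scanline, which is omitted here.

```python
import zlib


def sub_filter(scanline, bpp):
    """PNG 'Sub' filter: each byte minus the byte bpp positions to its left (mod 256)."""
    out = bytearray(len(scanline))
    for i, value in enumerate(scanline):
        left = scanline[i - bpp] if i >= bpp else 0
        out[i] = (value - left) % 256
    return bytes(out)


def sub_unfilter(filtered, bpp):
    """Reverse the 'Sub' filter, recovering the original scanline exactly."""
    out = bytearray(len(filtered))
    for i, value in enumerate(filtered):
        left = out[i - bpp] if i >= bpp else 0
        out[i] = (value + left) % 256
    return bytes(out)


# Hypothetical 8-bit grayscale image: 64 scanlines of a smooth gradient.
WIDTH, HEIGHT, BPP = 256, 64, 1  # BPP = bytes per pixel
rows = [bytes((x + y) % 256 for x in range(WIDTH)) for y in range(HEIGHT)]

raw = b"".join(rows)
filtered = b"".join(sub_filter(row, BPP) for row in rows)

raw_size = len(zlib.compress(raw))
filtered_size = len(zlib.compress(filtered))
print("unfiltered:", raw_size, "bytes; Sub-filtered:", filtered_size, "bytes")

# The filter loses nothing: unfiltering restores every scanline.
restored = b"".join(sub_unfilter(sub_filter(row, BPP), BPP) for row in rows)
print("lossless:", restored == raw)
```

On a smooth gradient the filtered scanlines consist almost entirely of the same small difference value, which is why they compress better. On other images a different filter, or no filter at all, may win, so it is worth trying several.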
In the filter section, select one of the available filters. Since the filters affect the image’s bytes, rather than pixels, you may want to try several of the filters to obtain better compression results.
Using Alternative Formats
TIFF
PICT
Compression
As you may have read in other articles, one of the biggest problems with raster images is their size. To overcome this hurdle, compression schemes have been developed to help reduce the file size of raster images. Almost every raster image has redundant data. For example, an image with many blue hues in it has redundant data due to the repeated definition of the blue pixels in the image. Compression schemes take the redundant or repeating data and substitute tokens or representative characters for it, thus reducing the file size.
Most compression schemes, such as the ones used in BMP and TIFF files, are transparent to the user. Many times you don’t even know that the compression is occurring, but it can significantly reduce the size of the file. Compression schemes use an algorithm, or codec, to compress and decompress the image file. Codec stands for compressor/decompressor. However, the compressibility of a file depends upon how much redundant data there actually is in the file. An image with a lot of similar hues will compress more than an image with a wide variety of colors.
Compression schemes are judged by the amount that they compress the file, described by the compression ratio. The compression ratio is the ratio of the uncompressed file’s size to the compressed file’s size. Many compression schemes claim a ratio of 2:1, while others can only achieve 1.25:1. Be careful of companies that claim significantly higher compression ratios, and make sure you are comparing two lossy or two lossless codecs. Comparing lossy to lossless is like comparing apples to oranges. To understand this, let’s look at the difference between lossy and lossless compression.
Lossy Compression
Lossy compression schemes, such as those used with JPEG images and many of the video formats, do not create an exact replica of the original file after decompression. They lose some of the original data. This may alarm you at first, but lossy compression schemes are usually used when the files being compressed don’t need the extra data. For example, an image that you display on screen requires less data than a file that you’re going to print. Therefore you can sacrifice some of the data for the sake of a smaller file size. This is also true in the digital video realm, where a certain amount of data can be sacrificed without significantly hurting playback quality.
If you decide to use a lossy compression scheme, you do have a choice concerning how much data is lost. Most of these schemes allow you to choose a loss rate. For example, when you create a JPEG file you can adjust how much data is lost, as shown in Figure 5. The same is true if you are creating video snippets.
Of the file formats you have read about in this chapter, only JPEG uses lossy compression. If you decide to use JPEG images, keep two things in mind. First, if you ever try to print a compressed JPEG file, more than likely it will look bad. Second, you should keep a backup of your JPEG images in a format that either doesn’t use compression or uses lossless compression.
Lossless Compression
Lossless compression schemes do not sacrifice data. In fact, they create an exact copy of the original file when they are decompressed. Lossless compression schemes are often used with files that need to maintain the highest level of data. They are often used in the desktop publishing field for printing purposes, where loss of data would be unacceptable. Lossless compression schemes include the LZW (Lempel-Ziv-Welch) scheme used in TIFF and GIF files and the RLE (Run-Length Encoding) scheme used in BMP files. Additionally, the compression used in the new PNG format is lossless.
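To tie the ideas in this section together, the following Python sketch implements a deliberately simple run-length encoder. It illustrates the principle only and is not the exact RLE variant stored in BMP files; the function names and the sample data are assumptions made for the example. Each run of identical bytes is replaced by a two-byte token (a count and the value), the decoder rebuilds the original exactly, and the compression ratio is computed as uncompressed size divided by compressed size.

```python
def rle_encode(data):
    """Replace each run of identical bytes with a (count, value) pair; count is 1..255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(encoded):
    """Expand (count, value) pairs back into the original bytes."""
    out = bytearray()
    for i in range(0, len(encoded), 2):
        count, value = encoded[i], encoded[i + 1]
        out += bytes([value]) * count
    return bytes(out)


def compression_ratio(original, compressed):
    """Uncompressed size divided by compressed size (2.0 means 2:1)."""
    return len(original) / len(compressed)


# Hypothetical pixel data: one flat-colored row and one with no repeated neighbors.
flat = bytes([200]) * 1000
varied = bytes(range(250)) * 4

for name, pixels in (("flat", flat), ("varied", varied)):
    packed = rle_encode(pixels)
    assert rle_decode(packed) == pixels  # lossless: an exact copy comes back
    print(name, len(pixels), "->", len(packed), "bytes, ratio",
          compression_ratio(pixels, packed))
# prints:
# flat 1000 -> 8 bytes, ratio 125.0
# varied 1000 -> 2000 bytes, ratio 0.5
```

The flat row shrinks dramatically because it is almost entirely redundant, while the varied data actually doubles in size because it contains no runs to replace. This is why a scheme's compression ratio always depends on the image it is applied to.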