BTC640/Images

Revision as of 18:24, 23 November 2011 by Andrew (talk | contribs) (Lecture)

Lecture

Textbook chapter: 3

Images were among the first multimedia elements used on webpages when the web was taking off. Despite their apparent simplicity, there are plenty of technical details to learn about them. It's not as simple as "there is an image" vs "there is no image". We will look at various factors that affect the use of images on the web, and at the capabilities of different image formats.

Raster/Bitmap vs Vector

Roughly speaking, raster image contents are described in the file format as "put a pixel of this colour here", while vector image contents are described as "draw a line of this colour from this relative coordinate to that relative coordinate". As a result, a vector image can usually be scaled up to a very large size without quality loss. If you scale up a bitmap, even with a modern tool that has a smart resizing algorithm, the result will look fuzzy, and it gets progressively worse as the size increases.
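The difference can be sketched in a few lines of Python. This is only an illustration, not how any real image format or resizing tool works: the function names `scale_line` and `upscale_nearest` are made up for this example, the "vector image" is a single line stored in relative coordinates from 0.0 to 1.0, and the bitmap is resized with the simplest possible method (repeating each pixel), which gives a blocky rather than fuzzy result.

```python
def scale_line(line, width, height):
    """Map a line given in relative coordinates (0.0..1.0) onto a pixel grid.

    The same description works for any output size, so nothing is lost
    when the image is drawn larger.
    """
    (x0, y0), (x1, y1) = line
    return (
        (round(x0 * (width - 1)), round(y0 * (height - 1))),
        (round(x1 * (width - 1)), round(y1 * (height - 1))),
    )


def upscale_nearest(bitmap, factor):
    """Enlarge a bitmap by repeating every pixel `factor` times in each direction.

    No new detail is created: every original pixel just becomes a bigger square.
    """
    return [
        [pixel for pixel in row for _ in range(factor)]
        for row in bitmap
        for _ in range(factor)
    ]


# A "vector" line: the end points are recomputed exactly for each output size.
line = ((0.0, 0.0), (1.0, 0.25))
print(scale_line(line, 100, 100))
print(scale_line(line, 1000, 1000))

# A 2x2 bitmap enlarged to 4x4: the pixels only get bigger.
for row in upscale_nearest([[1, 0], [0, 1]], 2):
    print(row)
```

Smarter resizing algorithms blend neighbouring pixels instead of repeating them, which is where the fuzzy look comes from, but they still cannot recover detail that was never stored.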

See, for example, the Seneca Freedom toaster artwork: it was created as vector graphics (in Adobe Illustrator), and the very large printed version looks very clear and sharp. Making a bitmap at that resolution would be almost impossible given the physical size of the printout and the desired print quality.

Vector images are usually created and used by professionals with specialized software. Many people don't have software installed that can open every vector format, so vector graphics are often converted to bitmaps for distribution.

A vector graphic can also be just a bitmap in disguise, for example if the entire image is defined as single points rather than lines and curves. Such a vector graphic is basically a bitmap in a vector format, without any of the benefits of vector graphics. You typically see SVG icons like this; the practice evolved because SVG implementations have traditionally been quite incompatible with each other.

Compression of Raster Images

Some formats are completely uncompressed. A typical example is .bmp, where the colour of every single pixel in the image is stored. This wouldn't be a problem if you had 10 or 100 or even 10000 pixels, but a typical resolution today is 1920x1080, which is 2073600 (over two million) pixels. So for practical use of large images, compression is needed. This is true not only on the web but in regular software as well: imagine a 10-slide PowerPoint presentation that is 100MB in size being read from a slow flash drive.
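To put a number on that, here is a small calculation of how much space the raw pixel data takes. It assumes 3 bytes per pixel (one each for red, green, and blue) and ignores the file header; the function name `uncompressed_size_bytes` is made up for this example.

```python
def uncompressed_size_bytes(width, height, bytes_per_pixel=3):
    """Size of the raw pixel data with no compression and no file header."""
    return width * height * bytes_per_pixel


pixels = 1920 * 1080
size = uncompressed_size_bytes(1920, 1080)

print(pixels)                         # 2073600
print(size)                           # 6220800
print(round(size / 1024 / 1024, 2))   # 5.93 (megabytes)
```

So a single uncompressed full-screen image is roughly 6MB, and a presentation with a dozen or so of them quickly approaches the 100MB mentioned above.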

Number of Colours

The size of the image can be reduced by reducing the number of bytes required per pixel. A PC can effectively display up to 3 bytes of colour information per pixel, split as follows:

  • 1 byte for red
  • 1 byte for green
  • 1 byte for blue

That's called 24-bit (a.k.a. true) colour and allows each pixel to be one of 2^24 (over 16 million) colours.

That's great, but if you're trying to save space and your image is a scan of a black and white photo, why waste space on colours? For greyscale you can use 8 bits per pixel instead of 24 and still reproduce the image perfectly.

There is another way to reduce the number of bytes required without losing all colour, and that's using a palette. Such an image contains a table of the colours used in the image, and each pixel stores an index into that table instead of a full colour. Typically the number of colours in the palette is 256, and they can be any mixture of reds, greens, and blues. You can use this technique to minimize the space required for each pixel but keep the overall look of the original. Depending on the number of colours in the original, the result can look identical or quite a bit worse. Photos typically don't look very pretty in 256 colours.
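Here is a minimal sketch of the palette idea, assuming pixels are stored as (red, green, blue) tuples. The function names `to_paletted` and `from_paletted` are made up for this example, and the sketch only handles images that already have 256 colours or fewer; reducing a photo down to 256 colours is a separate and harder problem.

```python
def to_paletted(pixels):
    """Split a list of (r, g, b) pixels into a palette and one index per pixel."""
    palette = []      # the table of distinct colours, in order of first use
    index_of = {}     # colour -> position in the palette
    indices = []      # one small number per pixel instead of three bytes
    for colour in pixels:
        if colour not in index_of:
            index_of[colour] = len(palette)
            palette.append(colour)
        indices.append(index_of[colour])
    if len(palette) > 256:
        raise ValueError("more than 256 colours: reduce the colours first")
    return palette, indices


def from_paletted(palette, indices):
    """Rebuild the original pixels by looking each index up in the palette."""
    return [palette[i] for i in indices]


pixels = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (255, 255, 255), (0, 0, 255)]
palette, indices = to_paletted(pixels)

print(palette)   # [(255, 0, 0), (0, 0, 255), (255, 255, 255)]
print(indices)   # [0, 0, 1, 2, 1]
print(from_paletted(palette, indices) == pixels)   # True
```

With a 256-entry palette each index fits in one byte, so for a large image the pixel data shrinks to about a third of the 24-bit size, plus at most 768 bytes for the palette itself.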

GIF/PNG

These formats provide lossless compression and are great for images with lots of solid colours, for example logos or diagrams or rasterized text.

The actual algorithms are more complicated than this (GIF uses LZW and PNG uses DEFLATE), but the idea can be understood as follows:

  • For each row
    • For each pixel
      • Record the colour of the pixel
      • Record the number of following pixels of the same colour
      • Skip the number of pixels of the same colour

So if you have a hundred pixels of the same colour in a row you would practically record only two pixels' worth of information for the entire thing.
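The steps above can be sketched in Python as follows. Keep in mind this is the simplified scheme from the list (known as run-length encoding), not the actual compression used inside GIF or PNG files, and the function names are made up for this example.

```python
def rle_encode_row(row):
    """Encode one row of pixels as (colour, number of following identical pixels)."""
    runs = []
    i = 0
    while i < len(row):
        colour = row[i]
        # Count how many of the following pixels have the same colour.
        following = 0
        while i + 1 + following < len(row) and row[i + 1 + following] == colour:
            following += 1
        runs.append((colour, following))
        # Skip the pixels that were just counted.
        i += 1 + following
    return runs


def rle_decode_row(runs):
    """Expand (colour, following) pairs back into the original row."""
    row = []
    for colour, following in runs:
        row.extend([colour] * (1 + following))
    return row


# A hundred pixels of the same colour become a single pair.
print(rle_encode_row(["white"] * 100))   # [('white', 99)]

# A row where the colour changes at every pixel gains nothing.
noisy = ["black", "white"] * 4
print(len(noisy), len(rle_encode_row(noisy)))   # 8 8
```

In the second case every pixel turns into its own pair, so the "compressed" row actually holds twice as many values as the original.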

This compression method is excellent for some types of graphics but is completely ineffective for images with lots of colour changes, gradients, or shadows (for example photographs).

GIF is an old format, and at the time of its development became instantly popular on the web because it allowed images to be transmitted over very slow lines. It has a limitation of a 256 colour palette.

PNG is a newer format that was developed to provide the same benefits as GIF but with more flexibility. A PNG file can be paletted, greyscale, or true colour.

JPEG

This format provides lossy compression that is very good for images with lots of colour variation such as photographs.

The algorithm itself is too complicated to explain here. Just know that the data is stored in 8x8 pixel blocks.
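To get a feel for the block structure, here is a rough count of how many 8x8 blocks it takes to cover an image. This is a simplification: it ignores the fact that real JPEG encoders often store the colour information at a lower resolution than the brightness, and the function name `jpeg_block_grid` is made up for this example.

```python
import math


def jpeg_block_grid(width, height, block=8):
    """Return (columns, rows, total) of block x block tiles needed to cover the image.

    Images whose size is not a multiple of the block size still need a
    whole block at the right and bottom edges, hence the rounding up.
    """
    cols = math.ceil(width / block)
    rows = math.ceil(height / block)
    return cols, rows, cols * rows


print(jpeg_block_grid(1920, 1080))   # (240, 135, 32400)
print(jpeg_block_grid(100, 50))      # (13, 7, 91)
```

Because each block is compressed on its own, heavily compressed JPEG files tend to show visible 8x8 squares.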

You can compress a JPEG file more or less depending on the quality setting. In a way this is similar to using a palette, where you reduce the number of colours, but here the result of a lower quality setting is a choppier image.

Transparency

Most vector image types and some raster image types (GIF and PNG) support transparency. That is an extra colour setting where a pixel is set to be transparent instead of being a certain colour.

In a paletted image the transparent colour is actually a real colour in the palette that is designated as transparent, and the viewing software makes sure that it does not display that colour. This is how it works for GIF files.

PNG files can have a transparent colour or an alpha channel. With an alpha channel, each pixel can be partially transparent, and the viewing software composites (merges) the contents of the image with the contents of the background. This is more expensive in terms of space required but can produce much prettier results.
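The two kinds of transparency can be compared with a short sketch. It assumes 8-bit values, where an alpha of 0 means fully transparent and 255 means fully opaque, and it uses the common "source over" blend; real viewers may round differently. The function names `show_keyed` and `composite` are made up for this example.

```python
def show_keyed(index, palette, transparent_index, background):
    """GIF-style transparency: one palette entry is all-or-nothing transparent."""
    if index == transparent_index:
        return background
    return palette[index]


def composite(foreground, alpha, background):
    """Alpha-channel transparency: blend each colour component with the background.

    alpha is 0 (fully transparent) to 255 (fully opaque).
    """
    return tuple(
        (f * alpha + b * (255 - alpha) + 127) // 255   # +127 rounds to nearest
        for f, b in zip(foreground, background)
    )


red = (255, 0, 0)
blue = (0, 0, 255)
palette = [red, (0, 255, 0)]

# Palette transparency: the pixel is either the palette colour or the background.
print(show_keyed(0, palette, 1, blue))   # (255, 0, 0)
print(show_keyed(1, palette, 1, blue))   # (0, 0, 255)

# Alpha channel: anything in between is possible.
print(composite(red, 255, blue))   # (255, 0, 0)
print(composite(red, 128, blue))   # (128, 0, 127)
print(composite(red, 0, blue))     # (0, 0, 255)
```

The extra space comes from storing one more byte per pixel: 4 bytes (red, green, blue, alpha) instead of 3 for a true colour image.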

Converting Between Types

As with most other kinds of data, images can be converted from one format to another. This can be done for compression or compatibility purposes.

Converting from one lossless format to another can be done without loss of information (that's what lossless means). Unfortunately it's more complicated than that because each format has different optional capabilities such as transparency and alpha. Here is a table to help you see the bigger picture:

Images with up to 255 colours and a transparent colour:

From/To | BMP                       | GIF      | PNG      | JPEG
BMP     | Lossless                  | Lossless | Lossless | Loss of colour
GIF     | Loss of transparency flag | Lossless | Lossless | Loss of colour and transparency
PNG     | Loss of transparency flag | Lossless | Lossless | Loss of colour and transparency
JPEG    | Lossless                  | Lossless | Lossless | Lossless

If you have a PNG with an alpha channel, the alpha channel will be lost when you convert the image to any of the other formats above.

Converting from JPEG to one of the other formats is lossless, but re-encoding it as a JPEG is lossy, so converting it back to JPEG will result in a different image than the original.

Degree Students

Some image types used by authors of media rather than consumers allow for multiple layers. One example of this is XCF, the native GIMP format.

The point of using layers is that you can combine them into one image but still make changes to the individual components. This is a very powerful tool mostly used by graphics experts but it's useful even for some simple tasks.

Read the 5-page paper Lossless Re-encoding of JPEG Images Using Block-adaptive Intra Prediction (Matsuda et al., 2008). It's quite complicated, but see how much useful information you can extract from it without spending a month learning the math.

Lab