The Art of Image Processing
Amongst all the studies of natural phenomena, light most delights its students
Leonardo da Vinci
About this book
This book is about Image Processing.
It is the first in a short series. This one discusses image processing for photographs. It explains how digital cameras work, and the basic processing that leads to a pleasing image being displayed for viewing.
Digital camera, photo processing and display technology have their base in the realistic painting of the Renaissance, the later exploration of color and shade by the Impressionists, and investigations into the science of human vision late in the Age of Enlightenment. These origins tend to be obscured by the technical detail but they are encoded into the standards and the processing tools and technology. In this book I relate these artistic and scientific origins to the modern technology that has developed from them.
Renaissance realism was concerned with rendering scenes that could be seen - things that you could have looked at, captured with the aim of being viewed. Leonardo da Vinci described realistic painting as:
a subtle invention, which with philosophical and subtle speculation considers all manner of forms: sea, land, trees, animals, grasses, flowers, all of which are enveloped in light and shade
and this book focuses on image processing whose aim is to produce a pleasing picture that is obviously of some thing.
I have deliberately avoided a mathematical treatment and left the topic of compression, which is unavoidably mathematical, to two later books. As with other titles in the 'Art of' series, this eBook, written specifically for the Amazon Kindle, is meant to be read as you would a novel: starting at the beginning, reading through in sequence, following the narrative. That means not stopping to make notes, nor working through exercises or assignments, just reading it for enjoyment and to gain a first understanding of the subject.
The highly visual and colorful nature of the subject means you will gain much more from viewing this book on a Kindle viewer that supports color.
There are two other books in this series:
- Still image compression
- Digital video compression
and one companion book about the related topic of Digital Signal Processing:
eBook live
The material in this book can be presented by the author as a 4- or 5-day industrial short course, on site. For details, visit the BORES Signal Processing web site:
www.bores.com
I have seized the light. I have arrested its flight.
Louis Daguerre (inventor of photography)
Images
Talking about digital images can be difficult because so much of our language uses visual metaphors. For example, at first sight the function of a digital camera seems straightforward: a digital camera takes a picture. But if you look closer you will see that the statement in fact raises some quite subtle issues.
Seeing things
First, without thinking any more about cameras yet, look at the previous sentence. Notice how much visual imagery it contains: at first sight, look closer, you will see. Our language is full of visual metaphors. Our visual system is amazingly clever and we understand seeing so intuitively that when we talk we often refer to vision as a model to make ourselves understood. This illustrates an important point, from a certain point of view, if you see what I mean.
When we talk about vision, much of what we say uses vision itself as a metaphor. And you can't explain something in words very well if the words you use just refer back to the thing you are explaining. The language to reason about images is itself based on imagery. If I want to explain some aspect of an image processing problem then I might choose something to 'illustrate' the problem. Or I might ask you to try to 'see' the problem from a particular 'point of view'.
These visual analogies are helpful because they derive from some strong intuitive understanding, but they can confuse us by using visual metaphors to describe concrete visual things, and they refer to understanding but do not actually explain it.
Since we already know too much (some of it wrong), it is necessary and helpful to try to define the meaning and scope of terms that may seem obvious, and to be careful about how we use them. Seeing is something we do, not something we understand. It is very complex and subtle. We understand vision intuitively but find it harder to describe verbally. We need to be very clear and logical: to define what we mean by words like picture, scene and image; to outline the scope of our discussion; and to be concrete - an image projected by a lens, or a picture of something.
Pictures and images
Searching for clear definitions can take us in a circle (for instance, Google defines a picture as a visual representation, a representation as an image and an image as a picture).
If I say 'a digital camera takes a picture' then we all know what a picture is. But it is hard to explain or define a picture without saying it is a 'visual representation' or an 'image', or using some other visual metaphor that just takes us in circles by saying that 'a picture is a picture'.
Take this further. What is the picture that the camera takes? Perhaps we can refine this by extending the statement to say that the camera takes a picture of something. So if I frame part of a view in the viewfinder, is that part of the view the picture? Probably not: I might say that the view would make a nice picture, but I would not talk of its being the picture. I might say the view is a picture, but that is speaking metaphorically to suggest that the view is as pretty as a picture. Perhaps the picture is the image projected by the camera lens onto a surface? (Notice that I am not yet sure what an image is...) That could indeed be a picture, but it would in fact not look very like one: the camera's sensor array is designed to absorb light, not reflect it, so I would probably not see much. So is the picture the pattern of light that would have been projected if the sensor array had instead been a little screen? Or is the picture the stored result in memory of reading the sensor array? I can't see the memory contents, and something I can't see does not feel like a picture. Perhaps the picture is the result of taking the stored array and displaying it? Personally I feel this is getting close: a picture is intrinsically something that is looked at (unless you are into conceptual modern art). But now the display medium becomes important. The picture will look different if it is displayed on a small camera preview screen, a big wide-screen TV, a tablet, a stadium-sized plasma screen, or is printed on paper (photos may look good on screen but awful when printed).
I think the way through this is to see that a picture is inextricably linked with its being displayed, so any reasoning about it should take into account the process of rendering it onto the display medium. Another word might be used to mean the physical pattern of light focussed by the lens onto the sensors, and that word might be the image. When digitized, this could be the digital image. The image is of the part of the scene that is focussed onto the sensor by the camera lens. This is supposedly the same as what is seen through the frame of the viewfinder, so we could use the word frame to suggest the physically real part of the scene whose image is projected (frame being commonly used to describe the composition of a photograph and of course also as in a picture frame). That leaves the scene to be the overall world view that the camera's human user experiences.