1. Digital Photography
1.1 Introduction
Photography has evolved considerably in the two centuries since its invention. Advances have allowed us to take more sophisticated, accurate photos with less technical knowledge. The introduction of semiconductor image sensors and embedded processors has formed the foundation for the latest set of advances. This book studies the range of technologies that have enabled us to build smart cameras. The move to smart cameras enables three trends, each of which we introduce in this chapter and analyze throughout the book: previsualization and autoprevisualization, automated enhancement of photographs, and cameras that produce analytical summaries rather than photographs.
1.2 Previsualization and Autoprevisualization
Digital cameras have changed the face of photography. Both casual and professional photographers can now take better pictures with less effort than ever before. The goal of this book is to outline the technologies that make this possible.
Better pictures require good algorithms. But those algorithms ultimately must make decisions about how to process the picture to get the best result. And best is clearly a subjective criterion. I believe that at the heart of the digital camera revolution is the move from previsualization by the photographer to autoprevisualization by the camera. Previsualization is a term introduced by Ansel Adams []; he taught photographers to see in their mind's eye how they wanted their photo to look and then determine the proper combination of techniques to achieve that result. Previsualization is a human, artistic endeavor. Cameras cannot make the sort of profound artistic judgments that Ansel Adams did, but they can make choices based on scene characteristics and knowledge about the composition of typical photographs. The result is autoprevisualization, an automation of the photographic art. Today's cameras are not gallery-ready artists, but they often make better decisions than do typical snapshot shooters.
We need some sort of previsualization because photographs require careful construction to give us a useful and interesting representation of a scene. The human visual system operates on profoundly different principles than does a camera. We perform a great deal of processing when we look at something without being the slightest bit aware of it: our eye constantly scans, constantly adjusts for focus and exposure, and continually identifies objects of interest. Capturing an image and looking at the resulting photo are two very different experiences.
Technical applications of digital cameras need previsualization even more. Autonomous automobiles are just one example. These cars rely on cameras to identify both roads and obstacles, and those cameras must work reliably under a huge range of environmental conditions. The car's cameras must be able to adjust themselves continually to deliver the information required to safely drive the car; the driver cannot twiddle the knobs to keep the vision system working.
Photographic technology has steadily moved toward simpler processes since its earliest days. The Kodak, a simple box camera made possible by the advent of roll film, helped to establish the snapshot as a tradition; professional photographers were no longer needed to take photos. Film cameras started to add automatic exposure mechanisms in the 1960s and autofocus in the 1970s. But digital image sensors allowed cameras to analyze images before, during, and after capture, making possible a much broader spectrum of optimizations and interventions into the photographic process.
1.3 Enhanced Images
Cameras are physical devices. A number of factors constrain the photograph we can capture of a given scene: lighting, camera position, optics, and sensor characteristics. Film photography gave us some tools with which to manipulate photographs to enhance the image. A photographer could, for example, dodge and burn parts of the print in order to alter the contrast within the image. Digital photography gives us a much broader range of options. Early tools for image manipulation and enhancement naturally emulated the techniques and results of film photography. Increasingly, digital techniques allow us to create images that simply were not possible with film. Focus stacking, for example, allows us to combine several photos in order to create a composite with much greater depth of field. High-dynamic-range (HDR) algorithms allow us to combine photos with different exposures to create a composite that re-renders the lighting of the scene.
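To make the idea of focus stacking concrete, the sketch below is a minimal illustration, not an algorithm from this book: it assumes a list of already-aligned grayscale frames, measures per-pixel sharpness with a small Laplacian filter, and keeps each pixel from the frame where it is locally sharpest.

    import numpy as np

    def focus_stack(frames):
        # frames: list of aligned grayscale images (2-D arrays of equal size).
        # Returns a composite that keeps, per pixel, the value from the locally
        # sharpest frame, using the magnitude of a 3x3 Laplacian as the measure.
        kernel = np.array([[0.0,  1.0, 0.0],
                           [1.0, -4.0, 1.0],
                           [0.0,  1.0, 0.0]])

        def sharpness(img):
            padded = np.pad(img.astype(float), 1, mode="edge")
            response = np.zeros(img.shape)
            for dy in range(3):
                for dx in range(3):
                    response += kernel[dy, dx] * padded[dy:dy + img.shape[0],
                                                        dx:dx + img.shape[1]]
            return np.abs(response)

        stack = np.stack(frames)                           # (n, H, W)
        scores = np.stack([sharpness(f) for f in frames])  # (n, H, W)
        best = np.argmax(scores, axis=0)                   # sharpest frame per pixel
        return np.take_along_axis(stack, best[None], axis=0)[0]

Picking pixels independently like this produces visible seams at focus transitions; practical focus-stacking tools add frame alignment, smoothing of the selection map, and blending between frames.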
1.4 Beyond Images to Analysis
Computer vision algorithms allow us to move beyond producing a photograph at all. Cameras are widely used to identify people and objects or to analyze and classify their activity. Analysis has some advantages over imaging: when cameras are used for safety and security, many people are more comfortable knowing that images do not leave the camera. Algorithms can also combine information from multiple cameras to create an even more accurate and complete understanding of a scene. Multiple cameras reduce occlusion and can also provide several views of a subject at multiple resolutions and perspectives.
1.5 Still and Moving Images
One of the interesting side effects of the digital camera revolution is a blurring of the traditional boundary between still and motion picture cameras. In the film era, the two were very different beasts. In the digital era, the differences between the two become much smaller. Virtually all cameras today have some capability to capture both still and moving images; they may be better at one than the other, but they can do both. This book will move fluidly between still and video.
1.6 Taking a Picture
To understand just how much modern cameras do for us, let us consider the picture-taking process. The photograph of Fig. 1.1 is not complicated or a work of art, merely an enjoyable photo. Yet even taking this simple photo required some care and consideration.
Fig. 1.1 An uncomplicated photograph
First, here are the steps that take place before a still photo is captured (a simple exposure-metering sketch follows this list):
The camera is positioned to have a chosen view of the subject. The position includes not only the x, y, z position of the camera but also its orientation.
The image is focused on a particular part of the subject.
The required exposure is determined.
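As an illustration of the exposure step, the sketch below shows the simplest possible metering strategy, average metering toward mid-gray. It is a hypothetical example that assumes a linear RGB preview frame with values in [0, 1]; it is not the metering algorithm of any particular camera.

    import numpy as np

    def auto_exposure_gain(preview, target=0.18):
        # preview: linear RGB preview frame, floats in [0, 1], shape (H, W, 3).
        # Returns a multiplicative exposure adjustment that moves the frame's
        # mean luminance toward an 18% mid-gray target; the camera would realize
        # it by changing shutter time, aperture, or sensor gain.
        luminance = preview @ np.array([0.2126, 0.7152, 0.0722])  # Rec. 709 weights
        mean = max(float(luminance.mean()), 1e-6)  # guard against all-black frames
        return target / mean

Real cameras go well beyond this: they typically weight the metering toward the focus point or detected faces and split the resulting adjustment across shutter speed, aperture, and sensor gain according to a program curve.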
Once the photo is actually captured, the camera performs a number of steps, some of which may be optional depending on the sophistication of the camera or the choices made by the photographer (a simple sketch of these post-capture steps follows the list):
The scene's white balance is determined to compensate for the different colors produced by different types of light sources.
The image may be sharpened to make it more pleasing.
The image data is compressed, typically with lossy algorithms that throw away some aspects of the image in order to reduce the amount of data required to reproduce the image.
The compressed image data is stored as a file in a storage medium.
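A toy version of these post-capture steps might look like the sketch below: gray-world white balance, unsharp-mask sharpening, and lossy JPEG compression on the way to storage. The function names and parameters are illustrative assumptions, not the pipeline of any real camera.

    import numpy as np
    from PIL import Image

    def gray_world_white_balance(rgb):
        # Scale each channel so its mean matches the overall mean (gray-world assumption).
        means = rgb.reshape(-1, 3).mean(axis=0)
        return np.clip(rgb * (means.mean() / means), 0.0, 255.0)

    def unsharp_mask(rgb, amount=0.5):
        # Sharpen by adding back the difference between the image and a 3x3 box blur.
        padded = np.pad(rgb, ((1, 1), (1, 1), (0, 0)), mode="edge")
        blurred = np.zeros_like(rgb)
        for dy in range(3):
            for dx in range(3):
                blurred += padded[dy:dy + rgb.shape[0], dx:dx + rgb.shape[1]] / 9.0
        return np.clip(rgb + amount * (rgb - blurred), 0.0, 255.0)

    def process_and_store(raw_rgb, path="photo.jpg"):
        # raw_rgb: demosaicked sensor output as an (H, W, 3) array with values 0-255.
        img = gray_world_white_balance(raw_rgb.astype(float))
        img = unsharp_mask(img)
        # Lossy compression and storage: JPEG discards detail the viewer is
        # unlikely to notice in order to shrink the stored file.
        Image.fromarray(img.astype(np.uint8)).save(path, quality=85)

In a real camera these stages run on dedicated image-signal-processor hardware, and the white-balance gains typically come from scene analysis or a photographer-selected preset rather than the gray-world heuristic used here.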
The process for video is much the same except that most of these steps must be performed continually: focus, exposure, image enhancement, compression, and storage all require streaming operation.