1. Overview and State-of-the-Art of Uncertainty Visualization
Abstract
The goal of visualization is to effectively and accurately communicate data. Visualization research has often overlooked the errors and uncertainty which accompany the scientific process and describe key characteristics used to fully understand the data. The lack of these representations can be attributed, in part, to the inherent difficulty in defining, characterizing, and controlling this uncertainty, and in part, to the difficulty in including additional visual metaphors in a well-designed, potent display. However, the exclusion of this information cripples the use of visualization as a decision-making tool because the display is no longer a true representation of the data. This systematic omission of uncertainty calls for fundamental research within the visualization community to address, integrate, and expect uncertainty information. In this chapter, we outline sources and models of uncertainty, give an overview of the state of the art, provide general guidelines, outline small exemplary applications, and finally, discuss open problems in uncertainty visualization.
1.1 Introduction
Visualization is one window through which scientists investigate, evaluate, and explore available data. As technological advances lead to better data acquisition methods, higher bandwidth, fewer memory limits, and greater computational power, scientific data sets are concurrently growing in size and complexity. Because hardware limitations have been reduced, scientists are able to run simulations at higher resolution, for longer periods of time, using more sophisticated numerical models. These advancements have forced scientists to become increasingly reliant on data processing, feature and characteristic extraction, and visualization as tools for managing and understanding large, highly complex data sets. In addition, the error, variance, and uncertainty are becoming more accessible, not only in the output results but also as they are incurred throughout the scientific pipeline.
With increased size and complexity of data becoming more common, visualization and data analysis techniques are required that not only address issues of large-scale data, but also allow scientists to better understand the processes that produce the data and the nuances of the resulting data sets. Information about uncertainty, including confidence, variability, and model bias and trends, is now available in these data sets, and methods are needed to address the increased requirements of visualizing these data. Too often, these aspects remain overlooked in traditional visualization approaches; difficulties in applying pre-existing methods, escalating visual complexity, and the lack of obvious visualization techniques leave uncertainty visualization an unsolved problem.
Effective visualizations present information in a manner that encourages data understanding through the appropriate choice of visual metaphor. Data are used to answer questions, test hypotheses, or explore relationships, and the visual presentation of data must facilitate these goals. Visualization is a powerful tool that allows great amounts of data to be presented in a small amount of space; however, some visualization techniques are better than others for particular types of data or for answering specific questions. Choosing the visualization method best suited to the data type and to the intended goals yields a powerful tool for scientists and data analysts.
The effective visualization of uncertainty, however, is not always possible through the simple application of traditional visualization techniques. Often, the visualization of the data itself has a high visual complexity, and the addition of uncertainty, even as a scalar value, complicates the display. Issues of visual clutter, data concealment, conflicts in how the data and the uncertainty are represented, and unintentional biases are just some of the problems incurred when visualizing data accompanied by uncertainty. Also, because of their complexity, these data sets may not lend themselves to the straightforward application of existing visualization methods, and thus, the added burden of uncertainty can be overwhelming.
Uncertainty data are becoming more prevalent and can be found in fields such as medical imaging, geoscience, and mechanical engineering. The simulation of complex systems, compilation of sensor data, and classification of tissue type are but a few sources of uncertainty data, and their expression, size, and complexity can vary drastically. Uncertainty can arise in all stages of the analysis pipeline, including data acquisition, transformation, sampling, quantization, interpolation, and visualization. It can be a single scalar value presented alongside the original data, or it can be an integral aspect of the data, derived from the description of the data itself. In any case, uncertainty is an essential component of scientific data sets and should not be disregarded in visualizations.
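As a small illustration of uncertainty that arises in one pipeline stage and is carried as a single scalar value alongside the data, consider quantization. The following is a minimal sketch and is not taken from any particular system: the sine signal, the step size, and the function names `sample_signal` and `quantize` are assumptions made purely for illustration.

```python
import math


def sample_signal(n: int) -> list[float]:
    """Sample one period of a sine wave at n evenly spaced points (illustrative data)."""
    return [math.sin(2 * math.pi * i / n) for i in range(n)]


def quantize(value: float, step: float) -> float:
    """Round a value to the nearest multiple of step."""
    return round(value / step) * step


# Hypothetical quantization step; rounding to the nearest level bounds
# the introduced error by step / 2.
step = 0.25
bound = step / 2

samples = sample_signal(16)

# Each record pairs the stored (quantized) value with a scalar uncertainty:
# the absolute error introduced by this stage of the pipeline.
records = []
for original in samples:
    stored = quantize(original, step)
    error = abs(original - stored)
    records.append((original, stored, error))

for original, stored, error in records:
    print(f"original={original:+.4f}  stored={stored:+.4f}  error={error:.4f}")

print(f"largest observed error: {max(err for _, _, err in records):.4f}")
print(f"theoretical bound (step / 2): {bound:.4f}")
```

In practice the original values are usually unavailable once the data have been quantized, so a visualization would typically carry the bound rather than the exact per-sample error as the uncertainty attribute.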
1.1.1 Sources of Uncertainty
Uncertainty can mean very different things in different situations, each driven by different key characteristics and goals. The uncertainty in a data set may result from the process through which the data were gathered or generated, or it may represent variability in the phenomenon represented by the data. We divide data uncertainty sources into three broad classes: uncertainty observed in sampled data, uncertainty measures generated by models or simulations, and uncertainty introduced by the data processing or visualization processes. Variability in the underlying phenomenon could manifest itself in sampled data or be incorporated into models or simulations. A particular data set might be subject to one form of uncertainty or to several. Different types of uncertainty pose different challenges to effective and truthful visualization. While most of the visualization literature about uncertainty concentrates on issues of visual representation rather than source, a few papers have also made a thoughtful analysis of the source of uncertainty []. The discussion below draws from all these sources.