1. Live Imaging and Video Bioinformatics
1.1 Introduction
The recent advancements in high-throughput technologies for functional genomics and proteomics have revolutionized our understanding of living processes. However, these technologies, for the most part, are limited to a snapshot analysis of biological processes that are by nature continuous and dynamic. Biologists such as Lichtman (Univ. Washington) and Fraser (CalTech) suggested that bioinformatics based on static images is like learning about a sport by studying a scrapbook []. To determine the rules of American football one can examine 1000 snapshots taken at different times during 1000 games, but the rules of the game would probably remain utterly obscure and the role of the halftime marching band would be a mystery. Similarly, to gain a more mechanistic and systematic understanding of biological processes, we need to elucidate cellular and molecular dynamic events (e.g., spatiotemporal changes in protein localization and intracellular signals).
One of the most exciting research developments has been the ability to image molecules and subcellular structures in living cells. Without harming a cell, we can now see and study the complex molecular machinery responsible for the formation of new cells. The imaging field is becoming more precise; for example, the resolution attainable by advanced techniques that break the diffraction limit is of the order of 130 nm []. Multiple imaging modalities can provide 2D (x, y) to 5D (x, y, z, t, wavelength) data since we can image 2D/3D objects for seconds to days to months and at many different wavelengths. This ability, combined with the power of genetics and novel methods for eliminating individual proteins, will answer questions that are centuries old.
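To make the 2D-to-5D range concrete, the following minimal sketch stores a 5D acquisition as a single NumPy array and reduces it to lower-dimensional views. The axis order (t, wavelength, z, y, x), the array sizes, and the random pixel values are assumptions chosen for illustration, not a format prescribed by any particular microscope or by this chapter.

```python
import numpy as np

# Hypothetical acquisition: 4 time points, 2 wavelengths, 3 z-slices, 8x8 pixels.
# Assumed axis order: (t, wavelength, z, y, x).
T, C, Z, Y, X = 4, 2, 3, 8, 8
rng = np.random.default_rng(0)
stack = rng.integers(0, 256, size=(T, C, Z, Y, X), dtype=np.uint8)

# A single 2D (x, y) image: fix the time point, wavelength, and depth.
frame_2d = stack[0, 0, 0]

# Maximum-intensity projection along z: one 2D image per time point and wavelength.
mip = stack.max(axis=2)

# Mean intensity over the whole volume: one time series per wavelength.
trace = stack.mean(axis=(2, 3, 4))

print(stack.shape, frame_2d.shape, mip.shape, trace.shape)
```

Keeping all five axes in one array means that a lower-dimensional view (a single image, a projection, an intensity trace) is an indexing or reduction operation rather than a separate file format.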
To quote Murphy et al. [], "The unraveling of the molecular mechanisms of life is one of the most exciting scientific endeavors of the twenty-first century, and it seems not too daring to predict that, within the next decade, image data analysis will take over the role of gene sequence analysis as the number one informatics task in molecular and cellular biology."
The advances in modern visual microscopy, coupled with high-throughput multi-well plate instrumentation, enable video imaging of cellular and molecular dynamic events from a large number of simultaneous experiments and provide unprecedented opportunities to understand how spatiotemporal dynamic processes work in a cellular/multicellular system []. The application of these technologies is becoming a mainstay of the biological sciences worldwide.
We are already at a point where researchers are overwhelmed by myriads of high-quality videos without proper tools for their organization, analysis, and interpretation. This is the main reason why video data are currently underutilized []. We believe that the next major advance in imaging of biological samples will come from advancements in the automated analysis of multidimensional images. Having tools that enable processes to be studied rapidly and conveniently over time will, like Hooke's light microscope and Ruska's electron microscope, open up a new world of analysis to biologists and engineers. The analytical methods will enable the study of biological processes in 5D (3D space, time, frequency/wavelength) in large video databases.
1.2 Video Bioinformatics
Genome sequences alone lack spatial and temporal information, and video imaging of specific molecules and their spatiotemporal interactions, using various imaging techniques, is essential to understand how genomes create cells, how cells constitute organisms, and how errant cells cause disease []. The interdisciplinary research field of Video Bioinformatics is defined by Bir Bhanu as the automated processing, analysis, understanding, data mining, visualization, query-based retrieval/storage of biological spatiotemporal events/data and knowledge extracted from dynamic images and microscopic videos.
The advanced video bioinformatics techniques, fundamental algorithms, and technology will provide quantitative thinking, greater sensitivity, objectivity, and repeatability of life sciences experiments. This will make it possible for massive volumes of video data to be efficiently analyzed, and for fundamental questions in both life sciences and informatics to be answered. The current technology [].
Solving the complex problems described above requires life scientists, computer scientists, and engineers to work together on innovative approaches. Computer scientists and engineers need greater comprehension of the biological issues, and biologists must understand the information technology and the assumptions made in the development of algorithms and their parameters.
1.3 Integrated Life Sciences and Informatics
Conceptually integrated life sciences/informatics research requires us to perform some of the following sample tasks:
A single moving biological entity (cell, organelle, protein, etc.) needs to be detected, extracted from varying backgrounds and tracked.
The dynamics of deformable shape (local/global changes) of a single entity (not in motion) needs to be analyzed and modeled.
Entities and their component parts need to be recognized and classified.
Multiple moving entities need to be tracked and their interaction analyzed and modeled.
Multiple moving entities with simultaneous changes in their global/local shape need to be analyzed and modeled.
The interactions of component parts of an entity and interaction among multiple entities while in motion need to be analyzed and modeled.
Mining of 5D data at various levels of abstraction (e.g., 2D image vs. 1D track) for understanding and modeling of events and detection of anomalous behavior needs to be performed.
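As an illustration of the first task in the list above (detecting a single moving entity, separating it from the background, and tracking it), here is a minimal sketch on synthetic data. It is not a method proposed in this chapter: temporal-median background subtraction followed by an intensity-weighted centroid is a simple stand-in technique, and the function names `make_video` and `track_single_entity`, the threshold, and the synthetic video (a bright 3x3 square over a static, noisy background) are all assumptions made for the example. Real microscopy data with nonstationary backgrounds would need considerably more robust methods.

```python
import numpy as np

def make_video(n_frames=10, size=32):
    """Synthetic video (assumed data): one bright 3x3 square moving
    diagonally over a static background with small per-frame noise."""
    rng = np.random.default_rng(0)
    background = rng.uniform(0.0, 0.2, size=(size, size))
    video = np.empty((n_frames, size, size))
    true_path = []
    for t in range(n_frames):
        frame = background + rng.normal(0.0, 0.01, size=(size, size))
        r = c = 5 + 2 * t                      # entity centre at frame t
        frame[r - 1:r + 2, c - 1:c + 2] += 1.0  # draw the entity
        video[t] = frame
        true_path.append((r, c))
    return video, np.array(true_path, dtype=float)

def track_single_entity(video, threshold=0.5):
    """Detect one bright entity per frame and return its (row, col) track."""
    # Background estimate: per-pixel median over time. This works here
    # because the entity covers any given pixel in only a few frames.
    background = np.median(video, axis=0)
    track = []
    for frame in video:
        foreground = frame - background
        mask = foreground > threshold
        if not mask.any():
            track.append((np.nan, np.nan))     # nothing detected in this frame
            continue
        rows, cols = np.nonzero(mask)
        weights = foreground[rows, cols]
        # Intensity-weighted centroid of the detected pixels.
        track.append((np.average(rows, weights=weights),
                      np.average(cols, weights=weights)))
    return np.array(track)

video, true_path = make_video()
track = track_single_entity(video)
print(np.abs(track - true_path).max())  # largest tracking error in pixels
```

The output track is the kind of 1D abstraction of image data that the last task in the list refers to: once each frame is reduced to a position, motion statistics and anomaly detection can operate on the track rather than on the pixels.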
The specific computational challenges include algorithmic issues in modeling complex biological motion, segmentation in the presence of complex nonstationary background, elastic registration of frames in the video, complex shape changes, nonlinear movement of biological entities, classification of entities within the cell, recognition in the presence of occlusion, articulation and distortion of shape, adaptation and learning over time, recognition of spatiotemporal events and activities and associated queries and database organization, indexing and search, and computational (space/time) complexity of video processing. A variety of imaging techniques are used to handle spatial resolution from the micrometer to the millimeter range and temporal resolution from a few seconds to months. The varying requirements of users dictate approaches based on machine learning, rather than handcrafted user-specific solutions to individual problems. As mentioned by Knuth, the noted computer scientist, "Biology easily has 500 years of exciting problems to work on" [].