Daniel Borcard, François Gillet and Pierre Legendre, Use R, Numerical Ecology with R, 10.1007/978-1-4419-7976-6_1, Springer Science+Business Media, LLC 2011
1. Introduction
Abstract
Although multivariate analysis of ecological data already existed and was being actively developed in the 1960s, it really flourished in the 1970s and later. Many textbooks were published during these years; among them were the seminal Écologie numérique (Legendre and Legendre 1979) and its English translation Numerical Ecology (Legendre and Legendre 1983). The authors of these books unified, under one single roof, a very wide array of statistical and other numerical techniques and presented them in a comprehensive way, not only to help researchers understand the available methods of analysis, but also to explain how to choose and apply them in an ordered, logical way to reach their research goals. Mathematical explanations are not absent from these books, and they provide a precious insider look into the various techniques, which is appealing to readers wishing to go beyond the simple user level.
1.1 Why Numerical Ecology?
Although multivariate analysis of ecological data already existed and was being actively developed in the 1960s, it really flourished in the 1970s and later. Many textbooks were published during these years; among them were the seminal Écologie numérique (Legendre and Legendre 1979) and its English translation Numerical Ecology (Legendre and Legendre 1983). The authors of these books unified, under one single roof, a very wide array of statistical and other numerical techniques and presented them in a comprehensive way, not only to help researchers understand the available methods of analysis, but also to explain how to choose and apply them in an ordered, logical way to reach their research goals. Mathematical explanations are not absent from these books, and they provide a precious insider look into the various techniques, which is appealing to readers wishing to go beyond the simple user level.
Since then, numerical ecology has become ubiquitous. Every serious researcher or practitioner has become aware of the tremendous value of exploiting painfully acquired data as efficiently as possible. Other manuals have been published (e.g. Orlóci and Kenkel). A second English edition of Numerical Ecology was published in 1998, broadening the perspective and introducing numerous methods that were unavailable at the time of the previous editions. Progress continues, and many important breakthroughs have occurred since 1998. In the present book, we present those developments that we consider most important, using the R language and in a more user-oriented way than in the above-mentioned manuals. For the most recent methods, we provide explanations at a more fundamental level when we consider it appropriate and helpful.
Not all existing methods of data analysis are addressed in the book, of course. Apart from the most widely used and fruitful methods, our choices are based on our own experience as quantitative community ecologists. However, short sections have sometimes been added to describe briefly, without going into detail, avenues other than the main ones.
1.2 Why R?
The R language has developed so rapidly and reached such a wide array of users in recent years that its application to numerical ecology needs no justification. This development also means that more and more domains of numerical ecology are now covered, to the point where, computationally speaking, some of the most recent methods are available only through R packages.
This book is not intended as a primer in R, however. To find that kind of support, readers should consult the CRAN Web page (http://www.R-project.org). The link to Manuals provides many free electronic documents and the link to Books many references. Readers are expected to have a minimal working knowledge of the basics of the language, e.g. formatting data and importing them into R, awareness of the main classes of objects handled in this environment (vectors, matrices, data frames and factors), as well as the basic syntax necessary to create, manipulate and otherwise use objects within R. Nevertheless, Chap. 2 starts at an elementary level as far as multivariate objects are concerned, since these are the main targets of most analyses addressed throughout the book, while not necessarily being the most familiar to many users.
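As a rough self-check of this prerequisite knowledge, the following minimal sketch creates one object of each class mentioned above and round-trips a data frame through a CSV file. All object names and values (`abund`, `sites`, `habitat`, `env`) are invented for illustration and do not come from the book's data sets; only base R functions are used.

```r
# Hypothetical toy data, invented for illustration only

# Numeric vector: abundances of one species at four sites
abund <- c(12, 0, 5, 3)

# Matrix: 2 sites (rows) by 3 species (columns)
sites <- matrix(1:6, nrow = 2, ncol = 3,
                dimnames = list(c("site1", "site2"),
                                c("sp1", "sp2", "sp3")))

# Factor: a qualitative descriptor with two levels
habitat <- factor(c("forest", "meadow", "forest", "meadow"))

# Data frame: columns of different types, one row per site
env <- data.frame(alt = c(250, 310, 480, 520), habitat = habitat)

# Inspect the structure of the objects
str(abund)
str(sites)
str(env)

# Importing data: write the data frame to a temporary CSV file,
# then read it back, converting character columns to factors
tmp <- tempfile(fileext = ".csv")
write.csv(env, tmp, row.names = FALSE)
env2 <- read.csv(tmp, stringsAsFactors = TRUE)
str(env2)
```

Readers who can predict what each call to `str()` reports should be comfortable with the level assumed in the following chapters.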
The book is far from exhaustive as to the array of functions devoted to any of the methods. Usually, we present one or several variants, but often other functions are available in R. Centring the book on a small number of well-integrated packages and adding some functions of our own, when necessary, helps users up the learning curve while keeping the amount of package-level idiosyncrasies at a reasonable level. Our choices do not imply that other existing packages are inferior to the ones used in the book.
1.3 Readership and Structure of the Book
The intended audience of this book consists of researchers, practitioners, graduate students and teachers who already have a background in general and multivariate statistics and wish to apply their knowledge to their data using the R language, as well as people willing to accompany their learning of the discipline with practical applications. Although an important part of this book follows the organization and symbolism of Legendre and Legendre, and many references to that book are made herein, readers may draw their training from other sources without problem.
Combining an application-oriented book such as this one with a detailed exposé of the methods used in numerical ecology would have led to an impossibly long and cumbersome opus. However, each chapter starts with a short introduction summarizing its subject matter, to ensure that readers are aware of the scope of the chapter and can appreciate the point of view from which the methods are addressed. Depending on the amount of documentation already existing in statistical textbooks, some introductions are longer than others.
Overall, the book guides readers through an applied exploration of the major methods of multivariate data analysis, as seen through the eye of an ecologist. It starts with some exploratory approaches. The aims of the methods covered range from descriptive to explanatory and predictive, and encompass a wide variety of approaches that should provide readers with an extensive toolbox for addressing a wide palette of questions arising in contemporary multivariate ecological analysis.
1.4 How to Use This Book
The book is meant as a companion when working at the computer. The authors pictured a reader studying a chapter by reading the text and simultaneously executing the code. To fully understand the various methods, it is preferable to go through the chapters sequentially, since each builds upon the previous ones. At the beginning of each chapter, an empty R console is assumed to be open. All the necessary data files, the scripts used in the chapters, as well as the R functions and packages that are not available through the CRAN Web site, can be downloaded from a Web page accessible through the Springer Web site (http://www.springer.com/978-1-4419-7975-9). Some of the homemade functions duplicate existing ones, providing alternative solutions (for instance, different or expanded graphical outputs), while others have been written to streamline complex sequences of operations.