In This Book
This book will guide you from being a user of R packages to being a creator of R packages. The chapters that follow go into more detail about each component of a package. They're roughly organized in order of importance:
The most important directory is R/, where your R code lives. A package with just this directory is still a useful package. (And indeed, if you stop reading the book after this chapter, you'll have still learned some useful new skills.)
The DESCRIPTION lets you describe what your package needs to work. If you're sharing your package, you'll also use the DESCRIPTION to describe what it does, who can use it (the license), and who to contact if things go wrong.
If you want other people (including future you!) to understand how to use the functions in your package, you'll need to document them. I'll show you how to use roxygen2 to document your functions. I recommend roxygen2 because it lets you write code and documentation together while continuing to produce R's standard documentation format.
Function documentation describes the nitpicky details of every function in your package. Vignettes give the big picture. They're long-form documents that show how to combine multiple parts of your package to solve real problems. I'll show you how to use Rmarkdown and knitr to create vignettes with a minimum of fuss.
To ensure your package works as designed (and continues to work as you make changes), it's essential to write unit tests that define correct behavior, and alert you when functions break. In this chapter, I'll teach you how to use the testthat package to convert the informal interactive tests that you're already doing to formal, automated tests.
To play nicely with others, your package needs to define what functions it makes available to other packages and what functions it requires from other packages. This is the job of the NAMESPACE file, and I'll show you how to use roxygen2 to generate it for you. NAMESPACE is one of the more challenging parts of developing an R package, but it's critical to master if you want your package to work reliably.
The data/ directory allows you to include data with your package. You might do this to bundle data in a way that's easy for R users to access, or just to provide compelling examples in your documentation.
R code is designed for human efficiency, not computer efficiency, so it's useful to have a tool in your back pocket that allows you to write fast code. The src/ directory allows you to include speedy compiled C and C++ code to solve performance bottlenecks in your package.
You can include arbitrary extra files in the inst/ directory. This is most commonly used for extra information about how to cite your package, and to provide more details about copyrights and licenses.
This chapter documents the handful of other components that are rarely needed: demo/, exec/, po/, and tools/.
The final three chapters describe general best practices not specifically tied to one directory:
Mastering a version control system is vital for collaborating with others, and is useful even for solo work because it allows you to easily undo mistakes. In this chapter, you'll learn how to use the popular Git and GitHub combo with RStudio.
R provides useful automated quality checks in the form of R CMD check. Running them regularly is a great way to avoid many common mistakes. The results can sometimes be a bit cryptic, so I provide a comprehensive cheat sheet to help you convert warnings to actionable insight.
The life cycle of a package culminates with release to the public. This chapter compares the two main options (CRAN and GitHub) and offers general advice on managing the process.
This is a lot to learn, but don't feel overwhelmed. Start with a minimal subset of useful features (e.g., just an R/ directory!) and build up over time. To paraphrase the Zen monk Shunryu Suzuki: "Each package is perfect the way it is, and it can use a little improvement."
Conventions Used in This Book
The following typographical conventions are used in this book:
Italic
Indicates new terms, URLs, email addresses, filenames, and file extensions.
Constant width
Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.
Constant width bold
Shows commands or other text that should be typed literally by the user.
Constant width italic
Shows text that should be replaced with user-supplied values or by values determined by context.
Tip
This element signifies a tip or suggestion.
Note
This element signifies a general note.
Warning
This element indicates a warning or caution.
Using Code Examples
Supplemental material (code examples, exercises, etc.) is available for download at http://r-pkgs.had.co.nz/.
This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you're reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing a CD-ROM of examples from O'Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product's documentation does require permission.
We appreciate, but do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: