Matt Wiley and Joshua F. Wiley - Advanced R: Data Programming and the Cloud

Publisher: Apress, Berkeley, CA.

© Matt Wiley and Joshua F. Wiley 2016
Matt Wiley and Joshua F. Wiley, Advanced R, doi: 10.1007/978-1-4842-2077-1_1
1. Programming Basics
Matt Wiley (1) and Joshua F. Wiley (1)
(1) Elkhart Group Ltd. & Victoria College, Columbia City, Indiana, USA
Electronic supplementary material
The online version of this chapter (doi: 10.1007/978-1-4842-2077-1_1) contains supplementary material, which is available to authorized users.
As with most languages, more advanced usage requires delving into the underlying structure. This chapter covers such programming basics, and this first section of the book (through Chapter ) develops some advanced programming techniques. We start with R's basic building blocks, which create our foundation for programming, data management, and cloud analytics.
Before we dig too deeply into R, some general principles to follow may well be in order. First, experimentation is good. It is much more powerful to learn hands-on than it is simply to read. Download the source files that come with this text, and try new things!
Second, it can help quite a bit to become familiar with the ? function. Simply type ? immediately followed by the name of a function or topic in your R console to call up the matching help page. We cover more on functions later, but this is too useful to ignore until that time.
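As a minimal sketch of the help lookup described above, the following uses help(), the function form of ?. The topics "mean" and "+" are only illustrative choices. Assigning the result (rather than printing it) keeps the help viewer from opening until you want it.

```r
## help("topic") is the function form of ?topic.
## Assigning the result stores the lookup without opening the help viewer;
## printing the object (or typing ?mean directly) displays the page.
h_mean <- help("mean")
class(h_mean)

## Operators and reserved words must be quoted, both here and after ?
## (for example, ?"+" at the console).
h_plus <- help("+")
```

At the console, typing ?mean or ?"+" gives the same pages directly.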
Finally, just before we dive into the real reason you bought this book, a word of caution: this is an applied text. There may be topics and areas of R we skip or ignore. While we, the authors, like to imagine this is due to careful pruning of ideas, it may well be due to ignorance. There are likely other ways to perform these tasks or additional good topics to learn. Our goal is to get you up and running as quickly as possible toward some useful skills. Good luck!
Advanced R Software Choices
This book is written for advanced users of the R language. We should note that for most of our examples, we continue using RStudio (www.rstudio.com/products/rstudio/download/) as in Beginning R: An Introduction to Statistical Programming (Apress, 2015). We also assume you are using a Microsoft Windows (www.microsoft.com) operating system, except for the later chapters, where we delve into using R in the cloud via Ubuntu (www.ubuntu.com). What is different is the underlying R distribution.
We are going to use Microsoft R Open (MRO), which is fully aligned with the current version(s) of R. This provides performance enhancements that happen behind the scenes. We also use the Intel Math Kernel Library (Intel MKL), which is available for download at the same site as MRO (https://mran.microsoft.com/download/). In fact, as this book goes to print, these two software programs were combined in their latest release. It would be wonderful if that trend continues. These downloads are very straightforward, and we anticipate that our readers, already familiar with using R and RStudio, will find this a seamless installation. On Windows (and Linux-based operating systems), the MKL replaces the default linear algebra system with an optimized system and allows implicit parallel processing for linear algebra operations, such as matrix multiplication and decomposition, that are used in many statistical algorithms.
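To make "matrix multiplication and decomposition" concrete, here is a small sketch of the kind of operations that R hands to its linear algebra library. It runs on any R distribution. The matrix size, the seed, and the object names are arbitrary assumptions for illustration, and the timing depends entirely on your machine and installed library.

```r
## Simulated data: 500 observations on 50 variables (sizes are arbitrary).
set.seed(2016)
X <- matrix(rnorm(500 * 50), nrow = 500, ncol = 50)

## Matrix multiplication: crossprod(X) computes t(X) %*% X.
XtX <- crossprod(X)

## Decomposition: the Cholesky factor R satisfies XtX = t(R) %*% R.
R <- chol(XtX)

## Rough timing of repeated multiplications; an optimized library such as
## the MKL may reduce this, but results vary by machine.
elapsed <- system.time(for (i in 1:20) crossprod(X))["elapsed"]
elapsed
```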
In case it is not already installed, you also need Java. We used Java Version 8 Update 91 for 64 bit in this book. Java may be downloaded at www.oracle.com/technetwork/java/javase/; specifically, get the Java Development Kit (JDK).
While these choices may have minor consequences, our goal is to provide universal guidance that remains true enough regardless of environmental specifics. Nevertheless, some packages and prebuilt functions on occasion have quirks. We turn our attention to ensuring that you can readily reproduce our results.
Reproducing Results
One useful feature of R is the abundance of packages written by experts worldwide. This is also potentially the Achilles' heel of using R: from the version of R itself to the version of particular packages, lots of code specifics are in flux. Your code has the potential to not work from day to day, let alone our code written months before this book was published. To solve this, we use the Revolution Analytics checkpoint package (Microsoft Corporation, 2016), which uses server-stored snapshots from the Comprehensive R Archive Network (CRAN) to lock our code to a specific version and date. To learn the technical specifics of how this is done, visit the link in the References section at the end of this chapter. We'll get you started with the basics.
For this book, we used R version 3.3.1, "Bug in Your Hair," along with Windows 10 Professional x64. As this version moves from the current version to historical, CRAN maintains an archive of past releases. Thus, the checkpoint package has ready access to previous versions of R, and indeed all packages. What you need to do is add the following code to the top of your Chapter R file in your project directory:
## uncomment to install the checkpoint package
## install.packages("checkpoint")
library(checkpoint)
checkpoint("2016-09-04", R.version = "3.3.1")
library(data.table)
We place all library calls at the start of each chapter's project file, after the call to the checkpoint library. By including the date of September 4, 2016, we ensure that the latest version of all packages up to that cutoff is installed and run by checkpoint. The first time it is run, after asking permission, checkpoint creates a folder to host the needed versions of the packages used. Thus, as long as you start each chapter's code file with the correct library calls, you use the same versions of the packages we use.
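One way to confirm which versions a session actually uses is to query them with base R. This is a complementary sketch, not part of the checkpoint package itself; the object names are illustrative, and the data.table lookup is guarded because the package may not be installed on your system.

```r
## The R release running in this session.
r_version <- R.version.string
r_version

## Version of an attached or installed package, checked only if available.
dt_installed <- requireNamespace("data.table", quietly = TRUE)
if (dt_installed) {
  dt_version <- as.character(packageVersion("data.table"))
  dt_version
}

## A fuller report: R version, platform, and loaded packages.
sessionInfo()
```

Comparing this output with the versions stated in the text is a quick check that a snapshot was applied as intended.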
Types of Objects
First of all, we need things to build our language, and in R, these are called objects. We start with five very common types of objects.
Logical objects take on just two values: TRUE or FALSE. Computers are binary machines, and data often may be recorded and modeled in an all-or-nothing world. These logical values can be helpful, where TRUE has a value of 1, and FALSE has a value of 0:
TRUE
[1] TRUE
FALSE
[1] FALSE
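Because logical values are coerced to 1 and 0 in arithmetic, they can be counted and averaged directly. The following sketch illustrates this; the vector x is made-up example data.

```r
## Explicit coercion shows the underlying numeric values.
as.integer(TRUE)   # 1
as.integer(FALSE)  # 0

## Summing a logical vector counts the TRUE values;
## taking the mean gives the proportion of TRUE values.
x <- c(TRUE, FALSE, TRUE, TRUE)
n_true <- sum(x)      # 3
prop_true <- mean(x)  # 0.75
```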
As you may remember from the quickly muttered comments of your algebra professor, there are many types, or flavors, of numbers. Whole numbers, which include zero as well as negative values, are called integers. In set notation, {…, -2, -1, 0, 1, 2, …}, these numbers are helpful for headcounts or other indexes (as well as other things, naturally). In R, integers have the capital L suffix. If decimal numbers are needed, then double numeric objects are in order. These are the numbers suited for even-ratio data types. Complex numbers have useful properties as well and are understood precisely as you might expect, with an i suffix on the imaginary portion. R is quite friendly in using all of these numbers, and you simply type in the desired numbers (remember to add the L or i suffix as needed):
42L
[1] 42
1.5
[1] 1.5
2+3i
[1] 2+3i
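To see why the suffixes matter, typeof() reports how R stores each value. This short sketch uses the same numbers as above, plus a plain 42 for comparison; the object names are illustrative.

```r
## The L suffix produces an integer; without it, R stores a double.
num_types <- c(typeof(42L), typeof(42), typeof(1.5), typeof(2+3i))
num_types  # "integer" "double" "double" "complex"

is.integer(42)   # FALSE: 42 without the suffix is a double
is.integer(42L)  # TRUE

## The real and imaginary parts of a complex number can be extracted.
z <- 2+3i
Re(z)  # 2
Im(z)  # 3
```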
Nominal-level data may be stored via the character class and is designated with quotation marks:
"a" ## character
[1] "a"
Of course, numerical data may have missing values. These missing values are of the type that the rest of the data in that set would be (we discuss data storage shortly). Nevertheless, it can be helpful to know how to hand-code logical, integer, double, complex, or character missing values:
NA
[1] NA
NA_integer_
[1] NA
NA_real_
[1] NA
NA_character_
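All of these missing values print as NA, so a quick way to tell them apart is typeof(). The sketch below also includes NA_complex_, the complex counterpart mentioned in the text, and shows a plain NA taking on the type of the data around it. The object name na_types is illustrative.

```r
## Each typed missing value reports its own storage type.
na_types <- c(
  logical   = typeof(NA),
  integer   = typeof(NA_integer_),
  double    = typeof(NA_real_),
  complex   = typeof(NA_complex_),
  character = typeof(NA_character_)
)
na_types

## is.na() detects missing values regardless of type.
is.na(c(1.5, NA, 3))  # FALSE TRUE FALSE

## A plain NA is coerced to match the rest of the vector.
typeof(c(1L, NA))     # "integer"
```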