1. Statistical Learning as a Regression Problem
Before getting into the material, it may be important to reprise and expand a bit on three points made in the first and second prefaces, because most people do not read prefaces. First, any credible statistical analysis combines sound data collection, intelligent data management, an appropriate application of statistical procedures, and an accessible interpretation of results. This is sometimes what is meant by analytics. More is involved than applied statistics. Most statistical textbooks focus on the statistical procedures alone, which can lead some readers to assume that if the technical background for a particular set of statistical tools is well understood, a sensible data analysis automatically follows. But as some would say, "That dog don't hunt."
Second, the coverage is highly selective. There are many excellent encyclopedic, textbook treatments of machine/statistical learning. Topics that some of them cover in several pages are covered here in an entire chapter. Data collection, data management, formal statistics, and interpretation are woven into the discussion where feasible. But there is a price. The range of statistical procedures covered is limited. Space constraints alone dictate hard choices. The procedures emphasized are those that can be framed as a form of regression analysis, have already proved to be popular, and have been thoroughly battle tested. Some readers may disagree with the choices made. For those readers, there are ample references in which other materials are well addressed.
Third, the ocean liner is slowly starting to turn. Over the past decade, the 50 years of largely unrebutted criticisms of conventional regression models and extensions have started to take hold. One reason is that statisticians have been providing useful alternatives. Another reason is the growing impact of computer science on how data are analyzed. Models are less salient in computer science than in statistics, and far less salient than in popular forms of data analysis. Yet another reason is the growing and successful use of randomized controlled trials, which is implicitly an admission that far too much was expected from causal modeling. Finally, many of the most active and visible econometricians have been turning to various forms of quasi-experimental designs and methods of analysis, in part because conventional modeling often has been unsatisfactory. The pages ahead will draw heavily on these important trends.
1.1 Getting Started
As a first approximation, one can think of statistical learning as the muscle car version of Exploratory Data Analysis (EDA). Just as in EDA, the data can be approached with relatively little prior information and examined in a highly inductive manner. Knowledge discovery can be a key goal. But thanks to the enormous developments in computing power and computer algorithms over the past two decades, it is possible to extract information that would have previously been inaccessible. In addition, because statistical learning has evolved in a number of different disciplines, its goals and approaches are far more varied than conventional EDA.
In this book, the focus is on statistical learning procedures that can be understood within a regression framework. For a wide variety of applications, this will not pose a significant constraint and will greatly facilitate the exposition. The researchers in statistics, applied mathematics and computer science responsible for most statistical learning techniques often employ their own distinct jargon and have a penchant for attaching cute, but somewhat obscure, labels to their products: bagging, boosting, bundling, random forests, and others. There is also widespread use of acronyms: CART, LOESS, MARS, MART, LARS, LASSO, and many more. A regression framework provides a convenient and instructive structure in which these procedures can be more easily understood.
After a discussion of how statisticians think about regression analysis, this chapter introduces a number of key concepts and raises broader issues that reappear in later chapters. It may be a little difficult for some readers to follow parts of the discussion, or its motivation, the first time around. However, later chapters will flow far better with some of this preliminary material on the table, and readers are encouraged to return to the chapter as needed.
1.2 Setting the Regression Context
We begin by defining regression analysis. A common conception in many academic disciplines and policy applications equates regression analysis with some special case of the generalized linear model: normal (linear) regression, binomial regression, Poisson regression, or other less common forms. Sometimes there is more than one such equation, as in hierarchical models in which the regression coefficients in one equation can be expressed as responses within other equations, or when a set of equations is linked through their response variables. For any of these formulations, inferences are often made beyond the data to some larger finite population or a data generation process. Commonly these inferences are combined with statistical tests and confidence intervals. It is also popular to overlay causal interpretations meant to convey how the response distribution would change if one or more of the predictors were independently manipulated.
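To make the generalized linear model special cases concrete, here is a minimal sketch in Python using statsmodels. The data, variable names, and parameter values are simulated assumptions for illustration only; they do not come from the text.

```python
# A minimal sketch of three generalized linear model special cases
# (normal, binomial, Poisson) fit to simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0, 10, n)
X = sm.add_constant(x)  # design matrix with an intercept column

# Normal (linear) regression
y_normal = 2.0 + 0.5 * x + rng.normal(0, 1, n)
print(sm.OLS(y_normal, X).fit().params)

# Binomial (logistic) regression
p = 1 / (1 + np.exp(-(-2.0 + 0.4 * x)))
y_binom = rng.binomial(1, p)
print(sm.GLM(y_binom, X, family=sm.families.Binomial()).fit().params)

# Poisson regression
mu = np.exp(0.2 + 0.15 * x)
y_pois = rng.poisson(mu)
print(sm.GLM(y_pois, X, family=sm.families.Poisson()).fit().params)
```

In each case the estimated coefficients should recover the assumed values approximately; only the conditional distribution of the response and its link to the predictors changes across the three fits.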
But statisticians and computer scientists typically start farther back. Regression is just about conditional distributions. The goal is to understand "as far as possible with the available data how the conditional distribution of some response varies across subpopulations determined by the possible values of the predictor or predictors" (Cook and Weisberg 1999: 27). That is, interest centers on the distribution of the response variable Y conditioning on one or more predictors X. Regression analysis fundamentally is about conditional distributions: Y | X.
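The idea can be shown directly in code: bin a predictor into subpopulations and summarize the response within each bin. This is a sketch with simulated data; the variable names and values are illustrative assumptions.

```python
# Regression as conditional distributions: summarize Y within
# subpopulations defined by binned values of X (simulated data).
import numpy as np

rng = np.random.default_rng(1)
x = rng.integers(100, 200, 1000)                  # predictor
y = 1500 + 12 * x + rng.normal(0, 400, 1000)      # response

bins = np.arange(100, 201, 20)
for lo, hi in zip(bins[:-1], bins[1:]):
    mask = (x >= lo) & (x < hi)
    print(f"X in [{lo}, {hi}): mean={y[mask].mean():.0f}, "
          f"sd={y[mask].std():.0f}, n={mask.sum()}")
```

No model is assumed here; the printout is itself a (crude) regression analysis in the conditional-distribution sense.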
For example, Fig. 1.1 plots infant birthweight against the mother's weight. Birthweight can be an important indicator of a newborn's viability, and there is reason to believe that birthweight depends in part on the health of the mother. A mother's weight can be an indicator of her health.
Fig. 1.1 Birthweight by mother's weight (open circles are the data, filled circles are the conditional means, the solid line is a linear regression fit, the dashed line is a fit by a smoother).
In Fig. 1.1, the open circles are the observations. The filled circles are the conditional means and are the likely summary statistics of interest. An inspection of the pattern of observations is by itself a legitimate regression analysis. Does the conditional distribution of birthweight vary depending on the mother's weight? If the conditional mean is chosen as the key summary statistic, one can consider whether the conditional means for infant birthweight vary with the mother's weight. This too is a legitimate regression analysis. In both cases, however, it is difficult to conclude much from inspection alone. The solid blue line is a linear least squares fit of the data. On average, birthweight increases with the mother's weight, but the slope is modest (about 44 g for every 10 pounds), especially given the spread of the birthweight values. For many, this is a familiar kind of regression analysis. The dashed red line shows the fitted values for a smoother (i.e., lowess) that will be discussed in the next chapter. One can see that the linear relationship breaks down when the mother weighs less than about 100 pounds. There is then a much stronger relationship, with the result that average birthweight can be under 2000 g (i.e., around 4 pounds). This regression analysis suggests that, on average, the relationship between birthweight and mother's weight is nonlinear.
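The contrast between the two fits in Fig. 1.1 can be reproduced in outline. The following is a minimal sketch: the actual birthweight data are not included here, so the arrays below are simulated stand-ins built to mimic the pattern described (a steep relationship below about 100 pounds, a shallow one above).

```python
# A sketch of the two fits in Fig. 1.1: an ordinary least squares
# line and a lowess smoother, on simulated stand-in data.
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(2)
mother_wt = rng.uniform(80, 250, 300)              # pounds
birth_wt = np.where(mother_wt < 100,               # grams
                    2900 + 45.0 * (mother_wt - 100),
                    2900 + 4.4 * (mother_wt - 100)) + rng.normal(0, 500, 300)

# Linear least squares fit; np.polyfit returns [slope, intercept]
slope, intercept = np.polyfit(mother_wt, birth_wt, 1)
print(f"slope: {10 * slope:.0f} g per 10 pounds")

# Lowess fit: returns (x, fitted) pairs sorted by x
smooth = lowess(birth_wt, mother_wt, frac=0.5)
print(smooth[:5])  # first few (mother_wt, fitted birthweight) pairs
```

The single least squares slope averages over the two regimes, while the lowess fitted values bend to follow the steeper relationship below 100 pounds, which is exactly the nonlinearity the figure reveals.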