Shahbaba - Biostatistics with R: an introduction to statistics through biological data

Book:
Biostatistics with R: an introduction to statistics through biological data
Author:
Shahbaba / Babak
Publisher:
Springer Science+Business Media
Genre:
Books / Home and family
Year:
2018
City:
New York and others
Babak Shahbaba

Biostatistics with R: An Introduction to Statistics Through Biological Data

Use R!

DOI 10.1007/978-1-4614-1302-8_1

Springer Science+Business Media, LLC 2012

1. Introduction

Babak Shahbaba

Department of Statistics, University of California, Irvine, Irvine, CA 92697-1250, USA

Abstract

This chapter provides a high-level overview of key concepts and typical steps involved in statistical analysis of biological data. Understanding these concepts is essential for learning the topics provided in the remaining parts of this book. More specifically, we talk about hypothesis testing vs. prediction, defining the target population, sampling data, observational studies vs. experiments, and statistical inference. The high-level road map of simple statistical methods discussed in this chapter helps readers to keep the overall objectives of statistical methods and the steps involved in achieving those objectives in mind while moving from one chapter to another. We also introduce the main computational tools, namely R and R-Commander, that are employed to perform statistical analysis throughout this book.

1.1 Statistical Methods in the Context of Scientific Studies

This book discusses statistical methods from the application point of view. More specifically, we focus on biostatistical methods, which involve applying statistical methods to biological and health-related problems. Each section poses one or more practical problems and then presents the statistical tools related to solving these problems. The materials presented in this book cover basic and essential steps involved in analysis of biological and health-related data.

The overall objective of statistical methods is to use empirical evidence to improve our knowledge about the target population, which includes the entire group of individuals and objects (e.g., people, plants, cells) we want to study. As a result, statistics helps us make more informed decisions. We study the population of interest by measuring a set of characteristics (e.g., age, size, weight) that are related to our study. We refer to these characteristics, whose values can change from one member of the population to another, as variables. The objective of many scientific studies is to learn about the variation of a specific characteristic (variable) in the population of interest. For example, we might be interested in the range of normal body temperature among healthy people, or tumor size in breast cancer patients, or growth rate of walnut trees, or BMI (body mass index) in the US population. In many studies, we want to explain or predict how a variable changes with respect to some other variables. That is, we want to identify possible relationships among different variables. For example, we might want to study the effects of different diets on early growth of chicks, or ask how heart rate changes with body temperature, or whether a higher BMI is associated with higher blood pressure, or whether survival of breast cancer patients depends on the type of treatment (mastectomy vs. breast conservation therapy) they receive. We refer to the variables that are the main focus of our study as the response (or target) variables. In contrast, we call the variables that explain or predict the variation in the response variable explanatory variables or predictors.

Statistical analysis begins with a scientific problem, usually presented in the form of a hypothesis testing problem or a prediction problem. Hypothesis testing refers to the process of examining a scientific statement that explains a phenomenon. In general, hypothesis testing problems can be regarded as decision problems, where we need to decide whether to accept or reject the proposed explanation for the phenomenon. For example, Mackowiak et al. (1992) [] asked whether the average normal body temperature is the widely accepted value of 98.6°F. Their hypothesis was that the average normal body temperature is less than the accepted value. A hypothesis might also be expressed in terms of possible relationships between two or more characteristics. For example, we might hypothesize that the normal body temperature is different between men and women. This means we believe that body temperature and gender are related. For breast cancer patients, we might hypothesize that mastectomy leads to longer survival compared to breast conservation therapy (lumpectomy, nodal dissection, and radiation).
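To make the body temperature question concrete, the following is a minimal sketch of one common way to examine such a hypothesis, a one-sample t statistic. It is written in Python purely for illustration, the temperature values are made up (they are not the Mackowiak et al. data), and the helper name `one_sample_t` is hypothetical. The text does not say which method the original study used.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error, n - 1


# Hypothetical oral temperatures in degrees Fahrenheit, made up for illustration.
temperatures = [98.2, 97.9, 98.6, 98.0, 98.4, 97.8, 98.3, 98.1, 98.7, 98.0]

# Compare the sample mean with the widely accepted value of 98.6 F.
t_stat, df = one_sample_t(temperatures, mu0=98.6)
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
```

A strongly negative t statistic would count as evidence for the hypothesis that the average is below 98.6°F; turning it into a formal decision requires comparing it with a t distribution on `df` degrees of freedom.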

Statistical methods are used to evaluate a hypothesis based on empirical data. Using these methods, we can decide whether we should reject a hypothesis or not. Such conclusions in turn help us make more informed decisions with respect to the scientific problem that inspired our study. For example, at the conclusion of their study, Mackowiak et al. argue that the average normal body temperature seems to be lower than previously believed, and a new upper limit for the range of normal body temperature should be considered. This recommendation has important consequences for deciding the body temperature set point and whether someone has a fever that requires medication. For treating breast cancer patients, several studies [] have shown that there is no evidence of a difference in survival between mastectomy and breast conservation therapy, at least for patients with less severe disease (e.g., small tumors, node negative). Based on these results, the US National Cancer Institute (NCI) recommended breast conservation operations, especially for the type of patients who participated in these studies (i.e., those with less severe cancer), instead of mastectomy, which was the standard treatment in the 1960s.

In recent years, high-throughput scientific studies without any clear hypothesis have become very common. For example, scientists may examine thousands of genes with respect to their relationship to a disease without hypothesizing that any specific gene is responsible for the disease. In these studies, the objective is to explore a large number of possible factors (e.g., genes) in order to identify a small number of them for follow-up studies, which tend to be more thorough and much smaller in scale. Therefore, the initial large-scale studies are not designed for hypothesis testing but rather for generating a small number of hypotheses, which can be the focus of follow-up studies and tested properly in the future.
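The screening idea can be sketched as follows, in Python for illustration only. Everything here is simulated: the number of genes, the group sizes, the three "truly associated" gene indices, and the scoring rule (absolute difference in group means) are all assumptions chosen to keep the example small, not a method taken from the text.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the simulation is reproducible

n_genes, n_per_group = 1000, 20
truly_associated = {3, 141, 592}  # hypothetical genes given a real effect


def simulate_gene(gene_id):
    """Simulate expression values for cases and controls for one gene."""
    shift = 5.0 if gene_id in truly_associated else 0.0
    cases = [rng.gauss(shift, 1.0) for _ in range(n_per_group)]
    controls = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    return cases, controls


def screening_score(cases, controls):
    """Crude screening score: absolute difference between group means."""
    return abs(statistics.mean(cases) - statistics.mean(controls))


# Score every gene, then keep only the few highest-scoring candidates.
scores = {}
for gene_id in range(n_genes):
    cases, controls = simulate_gene(gene_id)
    scores[gene_id] = screening_score(cases, controls)

candidates = sorted(scores, key=scores.get, reverse=True)[:3]
print("Candidates for follow-up:", sorted(candidates))
```

The output of such a screen is a short list of candidates, not a conclusion; each candidate would still need to be tested properly in a follow-up study.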

Scientific problems are sometimes presented as prediction problems. Prediction refers to the process of guessing the value of the response variable using a set of predictors. For example, we might want to predict percent body fat using abdomen circumference, or predict the survival time for cancer patients using tumor size. A large body of literature in biostatistics is devoted to developing statistical methods for predicting the risk of different diseases such as cancer, Alzheimer's disease, diabetes, and Parkinson's disease. Kahn et al. (2009) [] showed that statistical methods can be used to identify patients with Parkinson's disease by detecting dysphonia (an impairment in the normal production of vocal sounds). Predicting unknown outcomes and future events using statistical methods can help us make better decisions. For example, people with a high risk of diabetes might decide to follow preventive measures (e.g., diet).
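As an illustration of the body fat example, the sketch below fits a straight line by least squares and uses it to predict the response for a new value of the predictor. It is written in Python for illustration, the measurements are made up, and the helper name `fit_line` is hypothetical. A straight line is only one possible prediction model, assumed here for simplicity.

```python
import statistics


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (slope, intercept)."""
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical measurements, made up for illustration.
abdomen_cm = [80.0, 85.0, 90.0, 95.0, 100.0, 105.0]    # predictor
body_fat_pct = [12.0, 15.5, 18.0, 22.5, 25.0, 29.0]    # response

slope, intercept = fit_line(abdomen_cm, body_fat_pct)

# Predict percent body fat for a new person with a 92 cm abdomen circumference.
predicted = intercept + slope * 92.0
print(f"slope = {slope:.3f}, predicted body fat = {predicted:.1f}%")
```

Here abdomen circumference plays the role of the predictor and percent body fat the role of the response variable.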

1.2 Sampling

To answer our scientific questions, we would, ideally, study the entire population of interest (e.g., all breast cancer patients). However, this is usually impossible for physical, ethical, or economic reasons. For example, to test the hypothesis about the average normal body temperature, it is not feasible to record the temperature of all healthy people. Instead, a sample of representative members is selected from the population. Then, with the methods of statistical inference, the conclusions based on the sample can cautiously be attributed to the whole population. Mackowiak et al. (1992) selected n = 148 people, took their oral temperature, and then made conclusions about the body temperature of the whole population. To compare the effects of different treatments, one of the studies discussed in [] includes 74 women treated by breast conservation therapy and 67 women treated by mastectomy.
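The relationship between a population and a sample can be sketched with a small simulation, written in Python for illustration. The population below is simulated, and its mean of 98.2 and standard deviation of 0.7 are arbitrary assumptions rather than values from any study; only the sample size of 148 echoes the example above.

```python
import math
import random
import statistics

rng = random.Random(2012)  # fixed seed so the simulation is reproducible

# Simulated "population" of body temperatures (assumed mean and spread).
population = [rng.gauss(98.2, 0.7) for _ in range(100_000)]

# Simple random sample without replacement, as in a study of n = 148 people.
sample = rng.sample(population, k=148)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))

print(f"population mean = {population_mean:.2f}")
print(f"sample mean     = {sample_mean:.2f} (standard error {standard_error:.2f})")
```

In a real study the population mean is unknown; the simulation only shows that a randomly drawn sample of this size tends to give an estimate close to it, with the standard error indicating how much the estimate would vary from sample to sample.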
