Fan Jianqing - Statistical Foundations of Data Science

Book:
Statistical Foundations of Data Science
Author:
Jianqing Fan / Runze Li / Cun-Hui Zhang / Hui Zou
Publisher:
CRC Press LLC
Year:
2020

Statistical Foundations of
Data Science

CHAPMAN & HALL/CRC DATA SCIENCE SERIES

Reflecting the interdisciplinary nature of the field, this book series brings together researchers, practitioners, and instructors from statistics, computer science, machine learning, and analytics. The series will publish cutting-edge research, industry applications, and textbooks in data science.

The inclusion of concrete examples, applications, and methods is highly encouraged. The scope of the series includes titles in the areas of machine learning, pattern recognition, predictive analytics, business analytics, Big Data, visualization, programming, software, learning analytics, data wrangling, interactive graphics, and reproducible research.

Published Titles

Feature Engineering and Selection
A Practical Approach for Predictive Models
Max Kuhn and Kjell Johnson

Probability and Statistics for Data Science
Math + R + Data
Norman Matloff

Introduction to Data Science
Data Analysis and Prediction Algorithms with R
Rafael A. Irizarry

Cybersecurity Analytics
Rakesh M. Verma and David J. Marchette

Basketball Data Science
With Applications in R
Paola Zuccolotto and Marcia Manisera

JavaScript for Data Science
Maya Gans, Toby Hodges, and Greg Wilson

Statistical Foundations of Data Science
Jianqing Fan, Runze Li, Cun-Hui Zhang and Hui Zou

For more information about this series, please visit: https://www.crcpress.com/Chapman--HallCRC-Data-Science-Series/book-series/CHDSS

Statistical Foundations of
Data Science

By
Jianqing Fan
Runze Li
Cun-Hui Zhang
Hui Zou

First edition published 2020
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742

and by CRC Press
2 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN

© 2020 Taylor & Francis Group, LLC

CRC Press is an imprint of Taylor & Francis Group

Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, access

Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

ISBN: 978-1-466-51084-5 (hbk)

Visit the eResource https://www.routledge.com/Statistical-Foundations-of-Data-Science/Fan-Li-Zhang-Zou/p/book/9781466510845

TO THOSE

who educate us and love us;

whom we teach and we love;

with whom we collaborate and associate

Preface

Big data are ubiquitous. They come in varying volume, velocity, and variety. They have a deep impact on systems, such as storage, communication, and computing architectures, and on analysis, such as statistics, computation, optimization, and privacy. Engulfed by a multitude of applications, data science aims to address the large-scale challenges of data analysis, turning big data into smart data for decision making and knowledge discovery. Data science integrates theories and methods from statistics, optimization, mathematical science, computer science, and information science to extract knowledge, make decisions, discover new insights, and reveal new phenomena from data. The concept of data science has appeared in the literature for several decades and has been interpreted differently by different researchers. It has now become a multidisciplinary field that distills knowledge from various disciplines to develop new methods, processes, algorithms, and systems for knowledge discovery from various kinds of data, which can be either low- or high-dimensional, and structured, unstructured, or semi-structured. Statistical modeling plays a critical role in the analysis of complex and heterogeneous data and quantifies the uncertainties of scientific hypotheses and statistical results.

This book introduces commonly used statistical models and contemporary statistical machine learning techniques and algorithms, along with their mathematical insights and statistical theories. It aims to serve as a graduate-level textbook on the statistical foundations of data science as well as a research monograph on sparsity, covariance learning, machine learning, and statistical inference. For a one-semester graduate-level course, it may cover and selected topics from the remaining chapters. This track focuses more on high-dimensional statistics, model selection, and inference, but both paths strongly emphasize sparsity and variable selection.

Frontiers of scientific research rely on the collection and processing of massive complex data. Information and technology allow us to collect big data of unprecedented size and complexity. Accompanying big data is the rise of dimensionality, and high dimensionality characterizes many contemporary statistical problems, from the sciences and engineering to the social sciences and humanities. Many traditional statistical procedures for finite or low-dimensional data are still useful in data science, but they become infeasible or ineffective for dealing with high-dimensional data. Hence, new statistical methods are indispensable. The authors have worked on high-dimensional statistics for two decades, and started writing a book on high-dimensional data analysis over a decade ago. Over the last decade, there have been surges of interest and exciting developments in high-dimensional and big data. This led us to concentrate mainly on the statistical aspects of data science.

We aim to introduce commonly used statistical models, methods, and procedures in data science and to provide readers with sufficient and sound theoretical justification. It has been a challenge for us to balance statistical theories and methods and to choose the topics and works to cover, since the number of publications in this emerging area is enormous. Thus, we focus on the foundational aspects that are related to sparsity, covariance learning, machine learning, and statistical inference.

Sparsity is a common assumption in the analysis of high-dimensional data. By sparsity, we mean that only a handful of features embedded in a huge pool suffice for certain scientific questions or predictions. This book introduces various regularization methods to deal with sparsity, including how to determine penalties, how to choose tuning parameters, and numerical optimization algorithms for various statistical models. They can be found in .
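
As a minimal sketch of how a sparsity-inducing penalty selects a handful of features (an illustration, not taken from the book): assuming an orthonormal design, the L1-penalized least-squares (Lasso) estimate that minimizes (1/2)||y - Xb||^2 + lam * ||b||_1 is obtained by soft-thresholding each least-squares coefficient at lam. The function name `soft_threshold`, the coefficient values, and the value of `lam` are hypothetical choices made for this example.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; return exactly 0.0 when |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical least-squares coefficients: two strong signals, three weak ones.
ols_coefficients = [3.0, -0.2, 0.1, -2.5, 0.05]
lam = 0.5  # tuning parameter (assumed value, chosen for illustration)

# Under an orthonormal design, coordinate-wise soft-thresholding gives the Lasso fit.
sparse_coefficients = [soft_threshold(z, lam) for z in ols_coefficients]

# Indices of the features that survive the penalty.
support = [j for j, b in enumerate(sparse_coefficients) if b != 0.0]

print(sparse_coefficients)  # [2.5, 0.0, 0.0, -2.0, 0.0]
print(support)              # [0, 3]
```

A larger `lam` zeroes out more coefficients and shrinks the survivors further, which is why the choice of tuning parameter matters.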

High-dimensional measurements are frequently dependent, since these variables often measure similar things, such as aspects of economics or personal health. Many of these variables have heavy tails due to a large number of collected variables. To model the dependence, factor models are frequently employed, which exhibit low-rank plus sparse structures in data matrices and can be solved by robust principal component analysis from high-dimensional covariance. Robust covariance learning, principal component analysis, as well as their applications to community detection, topic modeling, recommender systems, etc. are also a feature of this book. They can be found in . Note that factor learning or more generally latent structure learning can also be regarded as unsupervised statistical machine learning.
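
The low-rank plus sparse structure can be sketched with a toy one-factor model. This is a simplified illustration under stated assumptions, not the robust procedure the book refers to: the covariance is taken to be known exactly, the sparse part is a diagonal noise covariance, and plain power iteration stands in for principal component analysis. The loadings, the noise variance, and the helper names `mat_vec` and `leading_eigenpair` are hypothetical.

```python
import math


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def leading_eigenpair(matrix, iterations=100):
    """Power iteration for the top eigenpair of a symmetric positive semidefinite matrix."""
    vector = [1.0] * len(matrix)  # start vector; must not be orthogonal to the top eigenvector
    for _ in range(iterations):
        product = mat_vec(matrix, vector)
        norm = math.sqrt(sum(p * p for p in product))
        vector = [p / norm for p in product]
    # Rayleigh quotient of the (unit-length) vector gives the eigenvalue.
    eigenvalue = sum(v * p for v, p in zip(vector, mat_vec(matrix, vector)))
    return eigenvalue, vector


# Hypothetical one-factor model: covariance = loadings * loadings^T + noise_variance * I.
loadings = [1.0, 2.0, 2.0]
noise_variance = 1.0
p = len(loadings)
covariance = [
    [loadings[i] * loadings[j] + (noise_variance if i == j else 0.0) for j in range(p)]
    for i in range(p)
]

eigenvalue, direction = leading_eigenpair(covariance)

# The top eigenvalue equals ||loadings||^2 + noise_variance = 9 + 1,
# and the top eigenvector is proportional to the loadings.
print(round(eigenvalue, 6))
print([round(d, 6) for d in direction])
```

In this idealized setting the leading principal component recovers the direction of the factor loadings, while the diagonal part only shifts the eigenvalues.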
