Data Science for Business
Foster Provost
Tom Fawcett
Beijing, Cambridge, Farnham, Köln, Sebastopol, Tokyo
Praise
A must-read resource for anyone who is serious about embracing the opportunity of big data.
Craig Vaughan, Global Vice President at SAP
This timely book says out loud what has finally become apparent: in the modern world, Data is Business, and you can no longer think business without thinking data. Read this book and you will understand the Science behind thinking data.
Ron Bekkerman, Chief Data Officer at Carmel Ventures
A great book for business managers who lead or interact with data scientists, who wish to better understand the principles and algorithms available without the technical details of single-disciplinary books.
Ronny Kohavi, Partner Architect at Microsoft Online Services Division
Provost and Fawcett have distilled their mastery of both the art and science of real-world data analysis into an unrivalled introduction to the field.
Geoff Webb, Editor-in-Chief of Data Mining and Knowledge Discovery Journal
I would love it if everyone I had to work with had read this book.
Claudia Perlich, Chief Scientist of M6D (Media6Degrees) and Advertising Research Foundation Innovation Award Grand Winner (2013)
A foundational piece in the fast developing world of Data Science. A must read for anyone interested in the Big Data revolution.
Justin Gapper, Business Unit Analytics Manager at Teledyne Scientific and Imaging
The authors, both renowned experts in data science before it had a name, have taken a complex topic and made it accessible to all levels, but mostly helpful to the budding data scientist. As far as I know, this is the first book of its kind, with a focus on data science concepts as applied to practical business problems. It is liberally sprinkled with compelling real-world examples outlining familiar, accessible problems in the business world: customer churn, targeted marketing, even whiskey analytics!
The book is unique in that it does not give a cookbook of algorithms; rather, it helps the reader understand the underlying concepts behind data science, and most importantly how to approach and be successful at problem solving. Whether you are looking for a good comprehensive overview of data science or are a budding data scientist in need of the basics, this is a must-read.
Chris Volinsky, Director of Statistics Research at AT&T Labs and Winning Team Member for the $1 Million Netflix Challenge
This book goes beyond data analytics 101. It's the essential guide for those of us (all of us?) whose businesses are built on the ubiquity of data opportunities and the new mandate for data-driven decision-making.
Tom Phillips, CEO of Media6Degrees and Former Head of Google Search and Analytics
Intelligent use of data has become a force powering business to new levels of competitiveness. To thrive in this data-driven ecosystem, engineers, analysts, and managers alike must understand the options, design choices, and tradeoffs before them. With motivating examples, clear exposition, and a breadth of details covering not only the hows but the whys, Data Science for Business is the perfect primer for those wishing to become involved in the development and application of data-driven systems.
Josh Attenberg, Data Science Lead at Etsy
Data is the foundation of new waves of productivity growth, innovation, and richer customer insight. Only recently viewed broadly as a source of competitive advantage, dealing well with data is rapidly becoming table stakes to stay in the game. The authors' deep applied experience makes this a must read, a window into your competitor's strategy.
Alan Murray, Serial Entrepreneur; Partner at Coriolis Ventures
One of the best data mining books, which helped me think through various ideas on liquidity analysis in the FX business. The examples are excellent and help you take a deep dive into the subject! This one is going to be on my shelf for a lifetime!
Nidhi Kathuria, Vice President of FX at Royal Bank of Scotland
Preface
Foster Provost
Tom Fawcett
Data Science for Business is intended for several sorts of readers:
- Business people who will be working with data scientists, managing data science-oriented projects, or investing in data science ventures,
- Developers who will be implementing data science solutions, and
- Aspiring data scientists.
This is not a book about algorithms, nor is it a replacement for a book about algorithms. We deliberately avoided an algorithm-centered approach. We believe there is a relatively small set of fundamental concepts or principles that underlie techniques for extracting useful knowledge from data. These concepts serve as the foundation for many well-known algorithms of data mining. Moreover, these concepts underlie the analysis of data-centered business problems, the creation and evaluation of data science solutions, and the evaluation of general data science strategies and proposals. Accordingly, we organized the exposition around these general principles rather than around specific algorithms. Where necessary to describe procedural details, we use a combination of text and diagrams, which we think are more accessible than a listing of detailed algorithmic steps.
The book does not presume a sophisticated mathematical background. However, by its very nature the material is somewhat technical: the goal is to impart a significant understanding of data science, not just to give a high-level overview. In general, we have tried to minimize the mathematics and make the exposition as conceptual as possible.
Colleagues in industry comment that the book is invaluable for helping to align the understanding of the business, technical/development, and data science teams. That observation is based on a small sample, so we are curious to see how general it truly is (see!). Ideally, we envision a book that any data scientist would give to his collaborators from the development or business teams, effectively saying: "If you really want to design/implement top-notch data science solutions to business problems, we all need to have a common understanding of this material."
Colleagues also tell us that the book has been quite useful in an unforeseen way: for preparing to interview data science job candidates. The demand from business for hiring data scientists is strong and increasing. In response, more and more job seekers are presenting themselves as data scientists. Every data science job candidate should understand the fundamentals presented in this book. (Our industry colleagues tell us that they are surprised how many do not. We have half-seriously discussed a follow-up pamphlet, "Cliff's Notes to Interviewing for Data Science Jobs.")