Copyright
Acquiring Editor: Andrea Dierna
Editorial Project Manager: Heather Scherer
Project Manager: Punithavathy Govindaradjane
Designer: Russell Purdy
Morgan Kaufmann is an imprint of Elsevier
225 Wyman Street, Waltham, MA 02451, USA
Copyright © 2013 Elsevier Inc. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher's permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.
This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).
Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods or professional practices may become necessary. Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information or methods described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.
Library of Congress Cataloging-in-Publication Data
Sheikh, Nauman Mansoor.
Implementing analytics : a blueprint for design, development, and adoption/Nauman Sheikh.
pages cm
Includes bibliographical references and index.
ISBN 978-0-12-401696-5 (alk. paper)
1. System analysis. I. Title.
T57.6.S497 2013
003--dc23
2013006254
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library
For information on all MK publications, visit our website at www.mkp.com
Printed and bound in the United States of America
13 14 15 16 17 10 9 8 7 6 5 4 3 2 1
Acknowledgments
I would like to dedicate this book to:
My parents: I am thankful for their life-long sacrifice, prayers, and support to ensure I have a better life than they had.
Sarah: My wife and my pillar of support. I can barely get ready for work without her help; writing a book would have been impossible without her relentless encouragement and effort to provide an environment where I could focus on research and writing.
Sameeha, Abdullah, and Yusuf: My kids for their sacrifice of quality time with dad. They would stand with a book for me to read, a ball to play catch, or a board game, but first check if I was finished writing and never complained if I could not make it.
Jim Rappe and Wayne Eckerson: My fellow professionals who reviewed my work and then encouraged me to write a book on this topic.
Dr. Fakhar Lodhi: My teacher, long-time mentor, and friend who helped me build a technology-agnostic structure of the entire analytics implementation methodology.
Dr. Sajjad Haider: For his extensive help in researching a wide variety of topics across mathematics, statistics, and artificial intelligence.
Keith Hare, Konrad Kopczynski, and Dr. Zamir Iqbal: For reviewing the entire manuscript and providing insightful comments and valuable feedback.
Andrea Dierna: My editor at Morgan Kaufmann who worked patiently with a first-time writer, kept providing valuable feedback, and kept accommodating my missed deadlines.
Author Biography
Nauman Sheikh is a veteran IT professional of 18 years with specialization and focus on data and analytics. His expertise ranges from data integration and data modeling in operational systems, to multiterabyte data warehousing systems, to analytics-driven automated decisioning systems. He has worked on three continents solving data-centric problems in credit, risk, fraud, and customer analytics, dealing with the cultural, technological, and legal challenges surrounding automated decisioning systems. Throughout his career, he has been a firm believer in innovation through simplification to encourage better coordination between technical and business personnel, leading to innovative answers to pressing challenges.
He firmly believes in the democratization of analytics and has been working diligently over the last few years to build analytics systems using well-known and widely available components. He holds a bachelor's degree in computer science from F.A.S.T Institute of Computer Science, Pakistan, and lives in Maryland with his wife and three lovely children.
Introduction
People constantly make decisions, such as where to eat tonight, whether to accept an invitation, if a candidate should be hired, or if a promotional discount will help sales. The more background information available, the better the human mind's cognitive ability to make a good decision from a variety of possibilities. Analytics is all about trying to get a computer system to do the same. This book is about analytics: how data is collected and converted into information, how that information becomes knowledge, how that knowledge is used to make decisions, and how to constantly evaluate and improve those decisions.
This is a practitioner's handbook on how to plan, design, and build analytics solutions to solve business problems. When we go about our daily lives and conduct our day-to-day activities, we produce and consume large amounts of data thanks to the digital age we live in. Historically, data was always produced in nature, as in weather systems and crop yields, but its storage, processing, analysis, and decision making were all done in the human mind through insights gleaned over years of experience and observation, although this wasn't called data in those days. The really smart people did well: they were the wise and experienced who advised and influenced decisions in families, tribes, and kingdoms. They always learned from their own and others' past experiences and extracted insights that were used to make decisions that managed the day-to-day activities of households, towns, and governments. Fast forward to modern times, with a proliferation of business conducted through computing, and this acquisition and analysis of data becomes mainstream and enables new ways of looking at data. The evolution of this computing, complemented by digital communication, exploded the amount of data produced and analyzed, putting the numeric system under severe stress (try counting the zeros in hundreds of petabytes leading toward zettabytes).