A USER'S GUIDE TO
BUSINESS
ANALYTICS
AYANENDRANATH BASU
INDIAN STATISTICAL INSTITUTE
KOLKATA, INDIA
SRABASHI BASU
BRIDGE SCHOOL OF MANAGEMENT
GURGAON, INDIA
&
WORLD CAMPUS INSTRUCTOR
THE PENNSYLVANIA STATE UNIVERSITY, USA
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2016 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business
No claim to original U.S. Government works
Printed on acid-free paper
Version Date: 20160509
International Standard Book Number-13: 978-1-4665-9165-3 (Hardback)
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
To Professor Atindra Mohan Gun
and
To the memory of Professor Bruce G. Lindsay
with a lifetime of gratitude
Preface
This is a book on predictive analytics. If technology was the competitive edge for business during the later part of the 20th century, for the 21st century it is going to be knowledge. The easy availability of technology, at a fraction of what it cost in the 1990s, has made all businesses hungry for data. E-commerce, e-business, e-transactions as well as less technology-intensive methods of doing business now generate a plethora of data. Information technology (IT) and IT-enabled services (ITeS) are at the core of all major and minor players in every domain. Automation produces an ever-expanding universe of data, spiraling and re-orienting itself, and producing various patterns in the data.
Before getting into the subject matter, we want to emphasize a personal preference which will have some effect on the style followed in this book. The primary subject matter here deals with data and knowledge. In Latin, data is the plural of datum; however, this Preface is perhaps the only occasion when we will use the word datum in this book. Most style guides have now come to accept the use of the noun data with the singular verb. We do acknowledge that the use of the plural verb with data is still the correct grammatical usage. However, modern usage often treats the word data as a collective noun taking a singular verb. As practitioners, we have, over the years, become used to writing the noun data with the singular verb; this, we believe, is true for many other practitioners as well. This is the expression that we will follow throughout the book. This is just our preference, and does not represent a lack of knowledge of the rules of grammar; neither is it an effort to offend the traditionalists.
Big Data, as the expanding data conglomerate is familiarly known, is a treasure trove of information. In the past, complex data mining techniques were used to explore and manipulate the data, by either visualization or intensive mathematical methods. In the main, data mining techniques were the forte of specialists. However, this was before the advent of modern computer technology, which has made real-time capture of a huge volume of data possible. Big data has made traditional data mining methods driven by the specialists obsolete to some extent. Microscopic and labor-intensive data mining techniques are being replaced by automated software-driven methods which are being controlled and interpreted by analysts. Analytics seems to be the inevitable tsunami that is slated to inundate the field of business intelligence.
In all fields where business intelligence plays a role, competition is being driven by data analytics. Take, for example, the case of revenue management in the hospitality or aviation industries. How many rooms or seats should be made available at precisely which point in time and at what price level, how long those levels should hold, and what the optimum discount for loyal customers should be: these decisions were previously controlled by account managers on a case-by-case basis. Not any more! Now the whole system is automated and an optimized revenue management suite is in place which, potentially, can update the system in real time on the basis of each sold or canceled seat or room so that a customer gets the best possible deal in booking over the web instantaneously. Automated optimized web-based systems are the rule of the game.
In all spheres of business, intelligence is being built in to extract granular level information to drive competition. Marketing departments would like to put customers in well-defined groups for targeted marketing efforts rather than blindly put everybody in the same basket and incur avoidable loss. As communication channels and media exposure are expanding at an exponential rate, more effective and tailored content is required to reach potential customers with a higher probability of acceptance of the products and to retain them. Insight into customer preference must now be data-driven, and customer relationship management must be leveraged using the choices of customers as expressed through their product purchase patterns and feedback. To steer ahead of the competition, not only must a very large database be maintained, but it must also be accessed in real time for knowledge development and corresponding actions.
The message here is very clear. Each and every business-related decision and action is now thoroughly rooted in data. Historical data is collected and examined from all angles for knowledge gain. This is now routed through predictive modeling for further enhancement of business. Extraction of knowledge from observed facts, therefore, cannot remain in the domain of specialized experts only. An effective business analyst needs to understand the stories spelled out in data. Whatever her subject matter expertise, she needs to understand the analytical logic behind the recommendation made by the software because, whenever big data is involved, software is heavily depended upon to manipulate the data. Nevertheless, knowledge extraction is not merely processing and manipulation of historical data.