Mathematical Underpinnings of Analytics
Great Clarendon Street, Oxford, OX2 6DP,
United Kingdom
Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide. Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries.
© Peter Grindrod 2015
The moral rights of the author have been asserted
First Edition published in 2015
Impression: 1
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, by licence or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above
You must not circulate this work in any other form and you must impose this same condition on any acquirer
Published in the United States of America by Oxford University Press 198 Madison Avenue, New York, NY 10016, United States of America
British Library Cataloguing in Publication Data
Data available
Library of Congress Control Number: 2014948658
ISBN 9780191038204
Printed and bound by
CPI Group (UK) Ltd, Croydon, CR0 4YY
Links to third party websites are provided by Oxford in good faith and for information only. Oxford disclaims any responsibility for the materials contained in any third party website referenced in this work.
To Dora, Tom, Chris, Jumbly, and Sophie.
He who would learn to fly one day must first learn to stand and walk and run and climb and dance; one cannot fly into flying.
Friedrich Nietzsche
I think I'm constantly in a state of adjustment.
Patti Smith
Preface
Starting out from a mathematical standpoint, this book introduces a wide range of the concepts, methods, and applications that are current within analytics. As I set out in the Introduction, the topics of analytics and data science within customer-facing sectors are really the practical interface of a much larger field of study that could be termed the mathematics of human behaviour. Accordingly, there are two introductory accounts given there: one explaining the commercial and technological drivers of analytics and its value within the digital economy; the other discussing the evolving face of mathematical modelling, explaining why this field is a natural next step for the applied mathematical sciences. I hope that this book will serve as a text for students wishing to develop their interests in analytics, from both theoretical and practical perspectives, as well as for early career professionals within commercial analytics teams, as a source of background experience, ideas, and benchmarks. My wish is that more and more mathematical sciences graduates will become influential leaders within fields of commercial analytics.
What should be clear is that mathematics is highly differentiating in producing new methods and algorithms, not least in dealing with uncertainties. We might often have a lot of data, but it may contain a lot of errors. We should avoid methods that simply grind out metrics telling us what is there. With mathematical models based on rigorous foundations we may gain insights and ask, 'What might be there?', 'What does it mean, what can we do now?' or even, 'What have we not observed?'
The theory and examples discussed here reflect my own interests in social networks, peer-to-peer communication, demand behaviour, purchasing behaviour, customer behaviour, social norms, and lifestyle changes. I have selected ideas and applications that are useful to customer-facing businesses, such as retailers and consumer goods manufacturers, e-commerce companies, mobile network operators, digital media and marketing companies, energy distributors, and finance, insurance, health, and leisure providers. Necessarily these are very modern applications and the selection of material here is subjective and personal to me. I make no attempt to review, nor claim to be exhaustive. Data sets associated with the book can be freely accessed online.
At the end of each chapter I have included some personal views on matters relating to the theory and the wider commercial and academic contexts of analytics. This includes advice on a variety of topics that some readers may find useful.
I would like to acknowledge the huge amount of assistance, competitive challenge, and encouragement that my colleagues at Numbercraft, Bloom Agency, Counting Lab, Cignifi, and Quintessa gave me within my research over many years working within diverse areas of analytics, quantitative assessment, forecasting and inference, and modelling under very different types of uncertainty. The time spent working with these teams has been both stimulating and fun.
For a number of years I have collaborated very happily and productively with Des J. Higham. I have gained a lot from him, and benefitted from his enthusiasm, good counsel, creativity, and sense of humour. He has made me strive to work harder.
The material presented here relied on the efforts of my many colleagues and the co-authors of various parts of the underlying material, especially Andy Briant, Robert Brown, Billiejoe Charlton, Sam Clarke, Alex Craven, Ernesto Estrada, Jon Flitton, Danica Vukadinovic Greetham, Rebecca Gower, Simon Grindrod, Stephen Haben, Chrystalla Hadjipavlou, Richard Hibbert, Gabriela Kalna, Guy Keeling, Milla Kibble, Sharon Kirkham, Dan Klinger, Peter Laflin, James Laurie, Tamsin Lee, Mark Parsons, Nick Rafferty, Alain Reuter, Doug Saddy, Colin Singleton, Alastair Spence, Zhivko Stoyanov, Andrew 'Lol' Tallack, Chris Tandy, Keith Vass, Jonathan Ward, and David 'Muddy' Waters.
I would like to thank Clive Bowman for his patient advice and many conversations with me about discrimination, Bayes factors, and much more; and Simon Chandler-Wilde who encouraged me when I returned to academia after years away.
I am indebted to my friend Robert 'Roy' C. Brown who has both tolerated and supported my way of working (and has removed many errors from my thinking), and to Tom Grindrod who has corrected and improved drafts of this manuscript.
Some of the research described here has been supported by the UK's Engineering and Physical Sciences Research Council, through the funding for the Horizon Digital Economy Hub and the Mathematical Underpinnings of The Digital Economy. This allowed me to develop theories and some new relationships with exploiters across a number of customer-facing sectors.
Finally, I would like to thank my friends and colleagues in the Mathematical Institute at the University of Oxford, and within the wider maths and analytics communities, for encouraging me to burst into print.
Peter Grindrod
Oxford, April 2014.
In almost every sector of commercial and public endeavour there has been, or there is about to be, a data deluge. The innovation and exploitation, and also the hype, are driven by (a) the availability of data from emerging and converging digital platforms; (b) the increased amount of online and off-line traffic, data collection, and surveillance; (c) the commercial imperatives to create greater value from existing customers and distilled knowledge; and (d) growing open data initiatives. Companies have become more aware of their own data resources, and see the future exploitation of these resources as a strategic path to growth.