BIG DATA
Big Data is everywhere. It shapes our lives in more ways than we know and understand. This comprehensive introduction unravels the complex terabytes that will continue to shape our lives in ways imagined and unimagined.
Drawing on case studies like Amazon, Facebook, the FIFA World Cup and the Aadhaar scheme, this book looks at how Big Data is changing the way we behave, consume and respond to situations in the digital age. It looks at how Big Data has the potential to transform disaster management and healthcare, as well as prove to be authoritarian and exploitative in the wrong hands.
The latest offering from the authors of Artificial Intelligence: Evolution, Ethics and Public Policy, this accessibly written volume is essential for the researcher in science and technology studies, media and culture studies, public policy and digital humanities, as well as being a beacon for the general reader to make sense of the digital age.
Saswat Sarangi is a theoretical physicist by training with a PhD from Cornell University, Ithaca, NY, USA. After his PhD, he was a research scientist at Columbia University, New York City, USA. Saswat started his finance career in New York City working as a quant at Bloomberg and later at Citigroup. He currently works with Invesco in Atlanta, USA.
Pankaj Sharma is an engineer from IIT Kharagpur, India, with an MBA from the Faculty of Management Studies, University of Delhi, India. He has more than 15 years of diverse work experience in various leadership roles with global investment banks, Indian equity brokerages, state-owned enterprises and start-ups. Pankaj turned full-time researcher in late 2016 to do in-depth work on contemporary issues. Earlier, he was a ranked equity analyst with UBS, Citi and JP Morgan. Pankaj has also been a regular contributor to print and electronic media. Pankaj published his first two books in 2017: Demonetization: Modi's Political Masterstroke and 2019: Will Modi Win? These were followed by Rafale, Raga, Reuniting Forces for 2019 in the second half of 2018 and The Anatomy of an Indian General Election in early 2019.
BIG DATA
A Beginner's Introduction
Saswat Sarangi and Pankaj Sharma
First published 2020
by Routledge
2 Park Square, Milton Park, Abingdon, Oxon OX14 4RN
and by Routledge
52 Vanderbilt Avenue, New York, NY 10017
Routledge is an imprint of the Taylor & Francis Group, an informa business
© 2020 Saswat Sarangi and Pankaj Sharma
The right of Saswat Sarangi and Pankaj Sharma to be identified as authors of this work has been asserted by them in accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.
Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library
Library of Congress Cataloging-in-Publication Data
A catalog record has been requested for this book
ISBN: 978-1-138-59857-7 (hbk)
ISBN: 978-0-367-14890-4 (pbk)
ISBN: 978-0-429-33079-7 (ebk)
Typeset in Bembo
by Swales & Willis Ltd, Exeter, Devon, UK
For Richard P. Feynman, there may not be anyone like him again!
Another mistaken notion connected with the law of large numbers is the idea that an event is more or less likely to occur because it has or has not happened recently. The idea that the odds of an event with a fixed probability increase or decrease depending on recent occurrences of the event is called the gambler's fallacy. That does not happen. For what it's worth, a good streak doesn't jinx you, and a bad one, unfortunately, does not mean better luck is in store.
Leonard Mlodinow, The Drunkard's Walk: How Randomness Rules Our Lives
I was always late to join the social media bandwagon, and it was only a couple of years ago that I finally joined Facebook, and even then only sporadically. Somehow the idea of sharing even the most mundane things happening in your life was not that appealing to me. However, it is always a personal choice, and diversity is what makes this world so interesting. Nevertheless, I was so fascinated by the impact of so much data sharing on individuals and on society collectively that the subject became more and more interesting to read and learn about.
Space travel to Mars, an inevitable victory of renewable over conventional energy, superhuman artificial intelligence, driverless cars and so many other things which were almost unimaginable even ten years ago are now either reality or close enough. There is just so much happening in the technological space that it is impossible to keep pace. But we don't think keeping pace even matters. We are witnessing so much change and at such a fast pace that this ride in itself is exhilarating. Looking at it closely and soaking up the experience is the next best thing to actually being at the forefront of the development of technology frontiers.
Over the course of researching and writing our previous book, Artificial Intelligence: Evolution, Ethics and Public Policy, we became more and more convinced that Big Data, which is one of the important foundation pillars for the development of AI (artificial intelligence), is an important enough subject in itself, with a lot to discover and understand about this technology and its phenomenal applications. When Aakash Chakrabarty of Taylor & Francis offered us the chance to write about it, we knew we were already passionate about Big Data and wanted to write a book for beginners about it.
The last 50 years, especially after the advent of the personal computer, the ubiquitous Internet and now smartphones, with information technology (IT) advancing in conjunction with rapid developments in communication technology, have fundamentally changed how we live, how we buy, how we eat, how we play, how we make friends and how we source our news and entertainment. The advances in information and communication technology have also altered the ways and means of how we interact with each other. We may have become better or worse (that will be subject to discussion and numerous individual opinions), but there is absolute certainty that we have become different.
This also means that people are generating more data in each subsequent timeframe, and these numbers are mind-boggling. But numbers or data in isolation don't mean much and, wherever there is data, analytics follow to make sense of it. However, when the quantum of data is extremely large and when the data is also unstructured and raw, the conventional tools of data storage and data analysis will not be very useful. This makes Big Data and Big Data Analytics different from other statistical analyses.
Big Data can be leveraged to make many things better in several areas, and how efficiently and effectively we make use of this unstructured information, first to make it structured and then to develop insights, is of paramount importance for the future of humanity. With seemingly irreversible demographic changes in the world, the worsening climate change situation, the shift in dominant sources of energy, more usage of information technology and significantly better mediums of communication, Big Data can help us in finding solutions to some of the biggest challenges humanity faces. But the story is not that simple.