The MIT Press Essential Knowledge Series
Auctions, Timothy P. Hubbard and Harry J. Paarsch
Cloud Computing, Nayan Ruparelia
Computing: A Concise History, Paul E. Ceruzzi
The Conscious Mind, Zoltan L. Torey
Crowdsourcing, Daren C. Brabham
Free Will, Mark Balaguer
Information and Society, Michael Buckland
Information and the Modern Corporation, James W. Cortada
Intellectual Property Strategy, John Palfrey
The Internet of Things, Samuel Greengard
Machine Learning: The New AI, Ethem Alpaydın
Memes in Digital Culture, Limor Shifman
Metadata, Jeffrey Pomerantz
The Mind-Body Problem, Jonathan Westphal
MOOCs, Jonathan Haber
Neuroplasticity, Moheb Costandi
Open Access, Peter Suber
Paradox, Margaret Cuonzo
Robots, John Jordan
Self-Tracking, Gina Neff and Dawn Nafus
Sustainability, Kent E. Portney
The Technological Singularity, Murray Shanahan
Understanding Beliefs, Nils J. Nilsson
Waves, Frederic Raichlen
Machine Learning
The New AI
Ethem Alpaydın
The MIT Press
Cambridge, Massachusetts
London, England
© 2016 Massachusetts Institute of Technology
All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.
This book was set in Chaparral and DIN by Toppan Best-set Premedia Limited. Printed and bound in the United States of America.
Library of Congress Cataloging-in-Publication Data
Names: Alpaydın, Ethem, author.
Title: Machine learning : the new AI / Ethem Alpaydın.
Description: Cambridge, MA : MIT Press, [2016] | Series: MIT Press essential
knowledge series | Includes bibliographical references and index.
Identifiers: LCCN 2016012342 | ISBN 9780262529518 (pbk. : alk. paper)
eISBN 9780262337588
Subjects: LCSH: Machine learning. | Artificial intelligence.
Classification: LCC Q325.5 .A47 2016 | DDC 006.3/1--dc23
LC record available at https://lccn.loc.gov/2016012342
ePub Version 1.0
Series Foreword
The MIT Press Essential Knowledge series offers accessible, concise, beautifully produced pocket-size books on topics of current interest. Written by leading thinkers, the books in this series deliver expert overviews of subjects that range from the cultural and the historical to the scientific and the technical.
In today's era of instant information gratification, we have ready access to opinions, rationalizations, and superficial descriptions. Much harder to come by is the foundational knowledge that informs a principled understanding of the world. Essential Knowledge books fill that need. Synthesizing specialized subject matter for nonspecialists and engaging critical topics through fundamentals, each of these compact volumes offers readers a point of access to complex ideas.
Bruce Tidor
Professor of Biological Engineering and Computer Science
Massachusetts Institute of Technology
Preface
A quiet revolution has been taking place in computer science for the last two decades. Nowadays, more and more, we see computer programs that learn, that is, software that can adapt its behavior automatically to better match the requirements of its task. We now have programs that learn to recognize people from their faces, understand speech, drive a car, or recommend which movie to watch, with promises to do more in the future.
Once, it used to be the programmer who defined what the computer had to do, by coding an algorithm in a programming language. Now for some tasks, we do not write programs but collect data. The data contains instances of what is to be done, and the learning algorithm modifies a learner program automatically so that it matches the requirements specified in the data.
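As a small illustration of collecting data instead of writing the program, the sketch below estimates a rule from examples rather than hard-coding it. The example pairs, the choice of a straight-line model, and the names `fit_line` and `learned_program` are all assumptions made for this illustration; least squares is only one of many possible learning algorithms.

```python
# Hypothetical examples of "what is to be done": (input, desired output) pairs.
# Nobody tells the program the rule; it has to be estimated from these pairs.
examples = [(0.0, 32.0), (10.0, 50.0), (20.0, 68.0), (30.0, 86.0), (40.0, 104.0)]


def fit_line(pairs):
    """Least-squares fit of y = w * x + b to the given (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


# The "learning algorithm" sets the parameters of the "learner program".
w, b = fit_line(examples)


def learned_program(x):
    """The learner after training: its behavior comes from the data."""
    return w * x + b


print(w, b, learned_program(100.0))
```

Here the pairs happen to follow the Celsius-to-Fahrenheit conversion, so the fitted parameters come out close to 1.8 and 32, although that formula appears nowhere in the code.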
Since the advent of computers in the middle of the last century, our lives have become increasingly computerized and digital. Computers are no longer just the numeric calculators they once were. Databases and digital media have taken the place of printing on paper as the main medium of information storage, and digital communication over computer networks has supplanted the post as the main mode of information transfer. First with the personal computer with its easy-to-use graphical interface, and then with the phone and other smart devices, the computer has become a ubiquitous device, a household appliance just like the TV or the microwave. Nowadays, all sorts of information, not only numbers and text but also image, video, audio, and so on, are stored, processed, and, thanks to online connectivity, transferred digitally. All this digital processing results in a lot of data, and it is this surge of data, what we can call a "dataquake," that is mainly responsible for triggering the widespread interest in data analysis and machine learning.
For many applications, from vision to speech, from translation to robotics, we were not able to devise very good algorithms despite decades of research beginning in the 1950s. But for all these tasks it is easy to collect data, and now the idea is to learn the algorithms for these automatically from data, replacing programmers with learning programs. This is the niche of machine learning, and it is not only that the data has continuously grown bigger in these last two decades, but also that the theory of machine learning for processing that data and turning it into knowledge has advanced significantly.
Today, in different types of business, from retail and finance to manufacturing, as our systems are computerized, more data is continuously generated and collected. This is also true in various fields of science, from astronomy to biology. In our everyday lives too, as digital technology increasingly infiltrates our daily existence, as our digital footprint deepens, not only as consumers and users but also through social media, an increasingly larger part of our lives is recorded and becomes data. Whatever its source (business, scientific, or personal), data that just lies dormant is not of any use, and smart people have been finding new ways to make use of that data and turn it into a useful product or service. In this transformation, machine learning is playing a more and more significant role.
Our belief is that behind all this seemingly complex and voluminous data, there lies a simple explanation: although the data is big, it can be explained in terms of a relatively simple model with a small number of hidden factors and their interaction. Think about millions of customers who buy thousands of products online or from their local supermarket every day. This implies a very large database of transactions; but what saves us and works to our advantage is that there is a pattern to this data. People do not shop at random. A person throwing a party buys a certain subset of products, and a person who has a baby at home buys a different subset; there are hidden factors that explain customer behavior. It is this inference of a hidden model (namely, the underlying factors and their interaction) from the observed data that is at the core of machine learning.
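The shopping example can be made concrete with a toy sketch. Everything here is an assumption made for illustration: the product names and baskets are invented, and a two-cluster k-means with a fixed starting point stands in for the inference of hidden factors; it is one simple technique among many, not a method the text prescribes.

```python
# Hypothetical shopping baskets; the products and their contents are made up.
PRODUCTS = ["chips", "soda", "beer", "diapers", "baby_food", "wipes"]
BASKETS = [
    {"chips", "soda", "beer"},
    {"chips", "beer"},
    {"chips", "soda"},
    {"chips", "soda", "beer"},
    {"diapers", "wipes"},
    {"diapers", "baby_food"},
    {"diapers", "baby_food", "wipes"},
    {"chips", "diapers", "wipes"},
]


def to_vector(basket):
    """Encode a basket as a 0/1 vector over PRODUCTS."""
    return [1.0 if p in basket else 0.0 for p in PRODUCTS]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def two_means(vectors, iterations=10):
    """k-means with k = 2 and a deterministic start."""
    # Start from the first basket and the basket farthest from it.
    first = vectors[0]
    farthest = max(vectors, key=lambda v: sq_dist(first, v))
    centroids = [first, farthest]
    labels = []
    for _ in range(iterations):
        # Assign each basket to its nearest centroid.
        labels = [min((0, 1), key=lambda k: sq_dist(v, centroids[k]))
                  for v in vectors]
        # Move each centroid to the mean of the baskets assigned to it.
        for k in (0, 1):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                centroids[k] = mean(members)
    return labels, centroids


labels, centroids = two_means([to_vector(b) for b in BASKETS])

# Describe each hidden group by the products bought in at least half its baskets.
factors = [[p for p, weight in zip(PRODUCTS, c) if weight >= 0.5]
           for c in centroids]
print(labels)
print(factors)
```

Nothing in the data labels a basket as a party or a baby purchase; the two groups emerge from which products tend to be bought together, which is the sense in which the factors are hidden.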
Machine learning is not just the commercial application of methods to extract information from data; learning is also a requisite of intelligence. An intelligent system should be able to adapt to its environment; it should learn not to repeat its mistakes but to repeat its successes. Researchers used to believe that for artificial intelligence to become reality, we needed a new paradigm, a new type of thinking, a new model of computation, or a whole new set of algorithms. Taking into account the recent successes of machine learning in various domains, it can now be claimed that what we need is not a set of new specific algorithms but a lot of example data and sufficient computing power to run the learning methods on that much data, bootstrapping the necessary algorithms from data.