Sentiment Analysis
Mining Opinions, Sentiments, and Emotions
Sentiment analysis is the computational study of people's opinions, sentiments, emotions, and attitudes. This fascinating problem is increasingly important in business and society. It offers numerous research challenges but promises insight useful to anyone interested in opinion analysis and social media analysis.
This book gives a comprehensive introduction to the topic from a primarily natural language processing point of view to help readers understand the underlying structure of the problem and the language constructs that are commonly used to express opinions and sentiments. It covers all core areas of sentiment analysis; includes many emerging themes, such as debate analysis, intention mining, and fake-opinion detection; and presents computational methods to analyze and summarize opinions. It will be a valuable resource for researchers and practitioners in natural language processing, computer science, management sciences, and the social sciences.
Bing Liu is a professor of computer science at the University of Illinois at Chicago. His current research interests include sentiment analysis and opinion mining, data mining, machine learning, and natural language processing. He has published extensively in top conferences and journals, and his research has been cited on the front page of the New York Times. He is also the author of two books: Sentiment Analysis and Opinion Mining (2012) and Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data (first edition, 2007; second edition, 2011). He currently serves as the Chair of ACM SIGKDD and is an IEEE Fellow.
Bing Liu
University of Illinois at Chicago
32 Avenue of the Americas, New York, NY 10013-2473, USA
Cambridge University Press is part of the University of Cambridge.
It furthers the University's mission by disseminating knowledge in the pursuit of education, learning, and research at the highest international levels of excellence.
www.cambridge.org
Information on this title: www.cambridge.org/9781107017894
© Bing Liu 2015
This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.
First published 2015
Printed in the United States of America
A catalog record for this publication is available from the British Library.
Library of Congress Cataloging in Publication Data
Liu, Bing, 1963-
Sentiment analysis : mining opinions, sentiments, and emotions / Bing Liu.
pages cm
Includes bibliographical references and index.
ISBN 978-1-107-01789-4 (hardback)
1. Natural language processing (Computer science) 2. Computational linguistics. 3. Public opinion - Data processing. 4. Data mining. I. Title.
QA76.9.N38L58 2015
006.312-dc23 2014036113
ISBN 978-1-107-01789-4 Hardback
Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party Internet websites referred to in this publication and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.
Preface
Opinion and sentiment and their related concepts, such as evaluation, appraisal, attitude, affect, emotion, and mood, are about our subjective feelings and beliefs. They are central to human psychology and are key influencers of our behaviors. Our beliefs and perceptions of reality, as well as the choices we make, are to a considerable degree conditioned on how others see and perceive the world. For this reason, our views of the world are very much influenced by others' views, and whenever we need to make a decision, we often seek out others' opinions. This is true not only for individuals but also for organizations. From an application point of view, we naturally want to mine people's opinions and feelings toward any subject matter of interest, which is the task of sentiment analysis. More precisely, sentiment analysis, which is also called opinion mining, is a field of study that aims to extract opinions and sentiments from natural language text using computational methods.
The inception and rapid growth of sentiment analysis coincide with those of social media on the web, such as reviews, forum discussions, blogs, and microblogs, because for the first time in human history, we now have a huge volume of opinion data recorded in digital forms. These data, also called user-generated content, prompted researchers to mine them to discover useful knowledge. This naturally led to the problem of sentiment analysis or opinion mining because these data are full of opinions. That these data are full of opinions is not surprising, because the primary reason why people post messages on social media platforms is to express their views and opinions, and therefore sentiment analysis is at the very core of social media analysis. Since early 2000, sentiment analysis has grown to be one of the most active research areas in natural language processing. It is also widely studied in data mining, web mining, and information retrieval. In fact, the research has spread from computer science to management science and social science because of its importance to business and society as a whole. In recent years, industrial activities surrounding sentiment analysis have also thrived. Numerous start-ups have emerged. Many large corporations, for example, Microsoft, Google, Hewlett-Packard, and Adobe, have also built their own in-house systems. Sentiment analysis systems have found applications in almost every business, health, government, and social domain.
Although no silver bullet algorithm can solve the sentiment analysis problem, many deployed systems are able to provide useful information to support real-life applications. I believe it is now a good time to document the knowledge that we have gained in research, and, to some extent, in practice, in a book. Obviously, I don't claim that I know everything that is happening in the industry, as businesses do not publish or disclose their algorithms. However, I have built a sentiment analysis system myself in a start-up company and served clients on projects involving social media data sets in a large variety of domains. Over the years, many developers of sentiment analysis systems in the industry have also told me roughly what algorithms they were using. Thus, I can claim that I have a reasonable knowledge of practical systems and their capabilities and firsthand experience in solving real-life problems. I try to pass along those nonconfidential pieces of information and knowledge in this book.
In writing this book, I aimed to take a balanced approach: analyzing the sentiment analysis problem from a linguistic angle, to help readers understand the underlying structure of the problem and the language constructs commonly used to express opinions and sentiments, and presenting computational methods to analyze and summarize opinions. As with many natural language processing tasks, most published computational techniques use machine learning or data mining algorithms with the help of text-specific clues or features. However, if we focus only on such computational algorithms, we will miss the deep insights into the problem, which in turn will hinder our progress on the computational front. Most existing machine learning algorithms are black boxes. They do not produce human-interpretable models. When something goes wrong, it is hard to know the cause and how to fix it.