Editor Board
The Morgan Kaufmann Series in Multimedia Information and Systems
Series Editor, Edward A. Fox, Virginia Polytechnic University
Introduction to Data Compression, Third Edition
Khalid Sayood
Understanding Digital Libraries, Second Edition
Michael Lesk
Bioinformatics: Managing Scientific Data
Zoe Lacroix and Terence Critchlow
How to Build a Digital Library
Ian H. Witten and David Bainbridge
Digital Watermarking
Ingemar J. Cox, Matthew L. Miller, and Jeffrey A. Bloom
Readings in Multimedia Computing and Networking
Edited by Kevin Jeffay and HongJiang Zhang
Introduction to Data Compression, Second Edition
Khalid Sayood
Multimedia Servers: Applications, Environments, and Design
Dinkar Sitaram and Asit Dan
Managing Gigabytes: Compressing and Indexing Documents and Images, Second Edition
Ian H. Witten, Alistair Moffat, and Timothy C. Bell
Digital Compression for Multimedia: Principles and Standards
Jerry D. Gibson, Toby Berger, Tom Lookabaugh, Dave Lindbergh, and Richard L. Baker
Readings in Information Retrieval
Edited by Karen Sparck Jones and Peter Willett
Copyright
Acquiring Editor: Andrea Dierna
Development Editor: Meagan White
Project Manager: Danielle S. Miller
Designer: Eric DeCicco
Morgan Kaufmann is an imprint of Elsevier
225 Wyman Street, Waltham, MA 02451, USA
2012 Elsevier, Inc. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publishers permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.
This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).
Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods or professional practices, may become necessary. Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information or methods described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.
Library of Congress Cataloging-in-Publication Data
Sayood, Khalid.
Introduction to data compression / Khalid Sayood. 4th ed.
p. cm.
ISBN 978-0-12-415796-5
1. Data compression (Telecommunication) 2. Coding theory. I. Title.
TK5102.92.S39 2012
005.746dc23
2012023803
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.
ISBN: 978-0-12-415796-5
Printed in the United States of America
12 13 14 1510 9 8 7 6 5 4 3 2 1
Dedication
To Fsun
Preface
Data compression has been an enabling technology for the information revolution, and as this revolution has changed our lives, data compression has become more and more a ubiquitous, if often invisible, presence. From mp3 players, to smartphones, to digital television and movies, data compression is an integral part of almost all information technology. This incorporation of compression into more and more of our lives also points to a certain degree of maturation and stability of the technology. This maturity is reflected in the fact that there are fewer differences between each edition of this book. In the second edition we had added new techniques that had been developed since the first edition of this book came out. In the third edition we added a chapter on audio compression, a topic that had not been adequately covered in the second edition. In this edition we have tried to do the same with wavelet based compression, in particular with the increasingly popular JPEG 2000 standard. There are now two chapters dealing with wavelet based compression, one devoted exclusively to wavelet-based image compression algorithms. We have also filled in details that were left out from previous editions, such as a description of canonical Huffman codes and more information on binary arithmetic coding. We have also added descriptions of techniques that have been motivated by the internet, such as the speech coding algorithms used for internet applications.
All this has yet again enlarged the book. However, the intent remains the same: to provide an introduction to the art or science of data compression. There is a tutorial description of most of the popular compression techniques followed by a description of how these techniques are used for image, speech, text, audio, and video compression. One hopes the size of the book will not be intimidating. Once you open the book and begin reading a particular section we hope you will find the content easily accessible. If some material is not clear write to me at with specific questions and I will try and help (homework problems and projects are completely your responsibility).
1 Audience
If you are designing hardware or software implementations of compression algorithms, or need to interact with individuals engaged in such design, or are involved in development of multimedia applications and have some background in either electrical or computer engineering, or computer science, this book should be useful to you. We have included a large number of examples to aid in self-study. We have also included discussion of various multimedia standards. The intent here is not to provide all the details that may be required to implement a standard but to provide information that will help you follow and understand the standards documents. The final authority is always the standards document.
2 Course Use
The impetus for writing this book came from the need for a self-contained book that could be used at the senior/graduate level for a course in data compression in either electrical engineering, computer engineering, or computer science departments. There are problems and project ideas after most of the chapters. A solutions manual is available from the publisher. Also at