Practical Time Series Forecasting:
A Hands-On Guide
Galit Shmueli
University of Maryland, USA and
Indian School of Business, India
Copyright 2011 and Statistics.com LLC
Cover art: Lakhang in Eastern Bhutan, copyright 2011
ALL RIGHTS RESERVED. Printed in the United States of America. No part of this work may be used or reproduced, transmitted, stored or used in any form or by any means graphic, electronic, or mechanical, including but not limited to photocopying, recording, scanning, digitizing, taping, Web distribution, information networks or information storage and retrieval systems, or in any manner whatsoever without prior written permission.
For further information see www.galitshmueli.com
Second Edition, December 2011
To Boaz Shmueli, who made the production of this book series a reality
Preface
The purpose of this textbook is to introduce the reader to quantitative forecasting of time series in a practical and hands-on fashion. In my experience, learning is best achieved by doing. Hence, the book is designed to support self-learning in the following ways:
- The book is relatively short compared to other time series textbooks, to reduce reading time and increase hands-on time.
- Explanations strive to be clear and straightforward with more emphasis on concepts than on theory.
- Chapters include end-of-chapter problems, ranging in their focus from conceptual to hands-on exercises, with many requiring running software on real data and interpreting the output in light of a given problem.
- Real data are used to illustrate the methods throughout the book.
- The book emphasizes the entire forecasting process rather than focusing only on particular models and algorithms.
- Cases are given in the last chapter, guiding the reader through suggested steps, but allowing self-solution. Working on the cases should help integrate the information and experience gained.
Course Plan
The book is designed for a mini-semester (6-7 weeks) forecasting course at the graduate or upper-undergraduate level. A suggested schedule is:
Week 1
The first chapters, including "Data", cover goal definition; data collection, characterization, visualization, and pre-processing.
Week 2
The chapter "Performance Evaluation" covers data partitioning, naive forecasts, and measuring predictive accuracy and uncertainty.
Weeks 3-4
The chapter "Regression-Based Models" covers linear regression models, second-level models, autocorrelation, and AR models.
Week 5
The chapter "Smoothing Methods" covers moving average, exponential smoothing, and differencing.
Week 6
Chapter .
Week 7 (optional)
The chapter "Other Forecasting Methods" discusses the inclusion of external information and forecasting binary outcomes. It introduces the methods of logistic regression and neural networks.
A team project is highly recommended in such a course, where students work on a real or realistic problem using real data.
Software and Data
An Excel add-on, called XLMiner (www.solver.com/xlminer), is used throughout the book to illustrate the different methods and procedures. This choice reduces the software learning curve for those comfortable with Microsoft Excel. However, the book is written in a way that allows readers to implement methods in any software of choice. The free XLMiner demo version should suffice for running all the required analyses except for some of the time series in Chapter , which exceed 200 time points.
Other software packages that support forecasting and the methods in this book are Minitab (www.minitab.com) and JMP (www.jmp.com), both reasonably priced menu-driven statistical software packages. For open-source aficionados, R software (www.r-project.org with library forecast at robjhyndman.com/software/forecast) is an excellent choice, although it requires learning how to program in R.
Finally, we advocate using interactive visualization software for exploring the nature of the data before attempting any modeling, especially when many series are involved. Two such packages are TIBCO Spotfire (spotfire.tibco.com) and Tableau (www.tableausoftware.com). We illustrate the power of these packages in Chapter , and the book website provides two interactive dashboards for experiencing the power of interactive exploration of time series.
Datasets used in the chapter exercises, examples, and cases are publicly available at
www.galitshmueli.com/practical-time-series-forecasting.
New to the Second Edition
In line with feedback from readers and instructors, the second edition now offers new content and improved organization.
The chapter "Approaching Forecasting" covers issues related to data collection, characterization, exploration, and pre-processing (including missing values, unequal spacing, extreme values, and choice of series time span). The section "Data Exploration" now goes beyond static plots and introduces interactive visualization.
A separate new chapter on "Performance Evaluation" now includes discussions of data partitioning, naive forecasts, forecast error distributions, and prediction intervals.
A new chapter, "Forecasting Methods: Overview", introduces the reader to the main classes of forecasting methods (model-based vs. data-driven, extrapolation methods vs. econometric models and external information) and describes the concept of forecast combination and ensembles.
The chapter on "Regression-Based Models" was expanded to cover the use of sinusoidal functions for capturing seasonality, and a new section was added on handling irregular patterns.
In the chapter "Smoothing Methods", separate sections are dedicated to additive and multiplicative trends, and the difference between additive and multiplicative seasonality is discussed. In addition, a new section on extensions of exponential smoothing describes recent research developments for forecasting series with multiple seasonal cycles, and the use of time-varying smoothing constants.
A new chapter, "Other Forecasting Methods", discusses two additional forecasting scenarios: including external information and forecasting binary events. Two additional forecasting methods are introduced for handling such scenarios: logistic regression and neural networks.
"Communication and Maintenance" is another new chapter. It discusses various issues that require consideration at the end of the forecasting process, when the forecaster interfaces with stakeholders. These include oral and written presentations, forecast documentation and monitoring, and addressing "forecast adjustments".
The last chapter includes two new cases: one on forecasting tourism-related series and the other on forecasting stock-price movements. Each case includes a guided assignment with tips and further resources.
The book closes with two new additions: a list of resources on time series data and forecasting competitions and a bibliography listing all the citations mentioned in the book.