Table of Contents
List of Tables
- Chapter 2
- Chapter 3
- Chapter 4
- Chapter 5
- Chapter 6
- Chapter 8
List of Illustrations
- Chapter 2
- Chapter 3
- Chapter 4
- Chapter 5
- Chapter 6
- Chapter 7
- Chapter 8
PREDICTION REVISITED
THE IMPORTANCE OF OBSERVATION
MEGAN CZASONIS
MARK KRITZMAN
DAVID TURKINGTON
Copyright 2022 by Megan Czasonis, Mark Kritzman, and David Turkington. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey.
Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data is Available:
ISBN 9781119895589 (hardback)
ISBN 9781119895602 (ePDF)
ISBN 9781119895596 (epub)
Cover Design: Wiley
Cover Image: akinbostanci/Getty Images
Timeline of Innovations
Relevance is the centerpiece of our approach to prediction. The key concepts that give rise to relevance were introduced over the past three centuries, as illustrated in this timeline. Elsewhere in this book, we offer more detail about the people who made these groundbreaking discoveries.
Essential Concepts
This book introduces a new approach to prediction, which requires a new vocabulary: not new words, but new interpretations of words that are commonly understood to have other meanings. Therefore, to facilitate a quicker understanding of what awaits you, we define some essential concepts as they are used throughout this book. And rather than follow the convention of presenting them alphabetically, we present them in a sequence that matches the progression of ideas as they unfold in the following pages.
- Observation: One element among many that are described by a common set of attributes, distributed across time or space, and which collectively provide guidance about an outcome that has yet to be revealed. Classical statistics often refers to an observation as a multivariate data point.
- Attribute: A recorded value that is used individually or alongside other attributes to describe an observation. In classical statistics, attributes are called independent variables.
- Outcome: A measurement of interest that is usually observed alongside other attributes, and which one wishes to predict. In classical statistics, outcomes are called dependent variables.
- Arithmetic average: A weighted summation of the values of attributes or outcomes that efficiently aggregates the information contained in a sample of observations. Depending on the context and the weights that are used, the result may be interpreted as a typical value or as a prediction of an unknown outcome.
- Spread: The pairwise distance between observations of an attribute, measured in units of surprise. We compute this distance as the average of half the squared difference in values across every pair of observations. In classical statistics, the same quantity is usually computed as the average of squared deviations of observations from their mean and is referred to as variance. However, the equivalent evaluation of pairwise spreads reveals why we must divide by N - 1 rather than N to obtain an unbiased estimate of a sample's variance; it is because the zero distance of an observation with itself (the diagonal in a matrix of pairs) conveys no information.
- Information theory: A unified mathematical theory of communication, created by Claude Shannon, which expresses messages as sequences of 0s and 1s and, based on the inverse relationship of information and probability, prescribes the optimal redundancy of symbols to manage the speed and accuracy of transmission.
- Circumstance: A set of attribute values that collectively describes an observation.
- Informativeness: A measure of the information conveyed by the circumstances of an observation, based on the inverse relationship of information and probability. For an observation of a single attribute, it is equal to the observed distance from the average, squared. For an observation of two or more uncorrelated attributes, it is equal to the sum of each individual attribute's informativeness. For an observation of two or more correlated attributes (the most general case), it is given by the Mahalanobis distance of the observation from the average of the observations. Informativeness is a component of relevance. It does not depend on the units of measurement.
- Co-occurrence:
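Two of the definitions above, spread and informativeness, lend themselves to a quick numerical check. What follows is a minimal Python sketch under stated assumptions, not code from this book. It assumes NumPy is available, and the function names pairwise_spread and informativeness are hypothetical. It also assumes that the Mahalanobis distance is taken in its squared form, that distances from the average are expressed in standardized units, and that the covariance matrix is estimated from the same sample with the N - 1 divisor.

```python
import numpy as np


def pairwise_spread(x):
    """Average of half the squared difference across every pair of
    distinct observations of a single attribute."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # n-by-n matrix holding half the squared difference for every pair.
    half_sq_diff = 0.5 * (x[:, None] - x[None, :]) ** 2
    # The diagonal pairs each observation with itself; it is always zero
    # and conveys no information, so we average over the remaining
    # n * (n - 1) entries only.
    return half_sq_diff.sum() / (n * (n - 1))


def informativeness(X):
    """Squared Mahalanobis distance of each row of X (observations in
    rows, attributes in columns) from the average of the observations."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Sample covariance matrix (np.cov divides by N - 1 by default).
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    inv_cov = np.linalg.inv(cov)
    # Quadratic form d @ inv_cov @ d, evaluated for every row d.
    return np.einsum("ij,jk,ik->i", centered, inv_cov, centered)


# Illustrative data only (randomly generated, not from the book).
rng = np.random.default_rng(0)
x = rng.normal(size=50)
X = rng.normal(size=(200, 3))
X[:, 1] += 0.8 * X[:, 0]  # make two attributes correlated

# Pairwise spread matches the usual sample variance with the N - 1 divisor.
print(pairwise_spread(x), np.var(x, ddof=1))

# Rescaling the attributes (changing units) leaves informativeness unchanged.
rescaled = X * np.array([100.0, 1.0, 0.01])
print(np.allclose(informativeness(X), informativeness(rescaled)))
```

The first printed pair shows the pairwise calculation agreeing with the conventional variance estimate, and the second line checks that informativeness does not depend on the units of measurement. Averaging over ordered pairs rather than unordered pairs is a convenience here; both give the same result because each unordered pair is simply counted twice.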