Data Structures the Fun Way
An Amusing Adventure with Coffee-Filled Examples
Jeremy Kubica
DATA STRUCTURES THE FUN WAY. Copyright 2023 by Jeremy Kubica.
All rights reserved. No part of this work may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage or retrieval system, without the prior written permission of the copyright owner and the publisher.
First printing
26 25 24 23 22 1 2 3 4 5
ISBN-13: 978-1-7185-0260-4 (print)
ISBN-13: 978-1-7185-0261-1 (ebook)
Publisher: William Pollock
Managing Editor: Jill Franklin
Production Manager: Rachel Monaghan
Production Editor: Katrina Horlbeck Olsen
Developmental Editors: Liz Chadwick and Abigail Schott-Rosenfield
Cover Illustrator: Gina Redman
Interior Design: Octopod Studios
Technical Reviewer: Daniel Zingaro
Copyeditor: Paula L. Fleming
Compositor: Jeff Wilson, Happenstance Type-O-Rama
Proofreader: Liz Wheeler
For information on distribution, bulk sales, corporate sales, or translations, please contact No Starch Press, Inc. directly at info@nostarch.com or:
No Starch Press, Inc.
245 8th Street, San Francisco, CA 94103
phone: 1.415.863.9900
www.nostarch.com
Library of Congress Control Number: 2022020565
No Starch Press and the No Starch Press logo are registered trademarks of No Starch Press, Inc. Other product and company names mentioned herein may be the trademarks of their respective owners. Rather than use a trademark symbol with every occurrence of a trademarked name, we are using the names only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark.
The information in this book is distributed on an As Is basis, without warranty. While every precaution has been taken in the preparation of this work, neither the author nor No Starch Press, Inc. shall have any liability to any person or entity with respect to any loss or damage caused or alleged to be caused directly or indirectly by the information contained in it.
To Edith and Anthony
About the Author
Jeremy Kubica is an engineering director specializing in artificial intelligence and machine learning. He received a PhD in robotics from Carnegie Mellon University and a BS in computer science from Cornell University. He spent his graduate school years creating algorithms to detect killer asteroids (actually stopping them was, of course, left as future work). He is the author of multiple books designed to introduce people to computer science, including Computational Fairy Tales and The CS Detective (No Starch Press, 2016), as well as the Computational Fairy Tales blog.
About the Technical Reviewer
Dr. Daniel Zingaro is an associate teaching professor of computer science and award-winning teacher at the University of Toronto. His research focuses on understanding and enhancing student learning of computer science. He is the author of Algorithmic Thinking (No Starch Press, 2020), a no-nonsense, no-math guide to algorithms and data structures, and Learn to Code by Solving Problems (No Starch Press, 2021), a primer for learning Python and computational thinking.
Acknowledgments
I would like to start by thanking the whole team at No Starch Press who helped make this book a reality. A huge thank-you to Bill Pollock for starting this entire journey by suggesting I do a more technical follow-up to The CS Detective. Thank you to my amazing editors, Abigail Schott-Rosenfield and Liz Chadwick, for their excellent help, guidance, and suggestions. I would also like to thank Carlos Bueno, who originally pointed me to the team at No Starch.
Thank you to Daniel Zingaro for his thorough and insightful technical review. His work was vital in improving both the accuracy and understandability of the material.
A tremendous thanks goes out to all the people who provided valuable comments on earlier versions of this book: Bambi Brewer, Andrew Moore, Eleanor Rieffel, and Kit Stubbs, PhD. Their suggestions helped guide my approach and the direction of the book.
A deep thank-you to my family for their support. Thank you to Regan, Nathan, and Julie for their encouragement and patience.
Introduction
This is a book about computational thinking through the lens of data structures, constructs for organizing and storing data. It is more than a cookbook of handy data structures. Rather, it explores the thinking behind these structures and their fundamental impact on solving complex problems, using real-world analogies to make abstract computational concepts intuitive. The goal of this book is to provide new insights into how you can use preexisting structure within the data to your advantage or create new structures to efficiently solve problems.
Among other things, I discuss the differences between arrays and linked lists, the complexity and power of pointers, the effect of data structures on algorithmic behavior, the branching of tree-based data structures, mathematical mappings in hash tables, and the usefulness of randomization. In short, youll learn to think about algorithms by investigating different ways to organize the data they process. Youll also apply these computational approaches to real-world problems, a surprising number of which focus on procuring a decent cup of coffee.
Understanding how data structures function is critical to using them effectively. Just as an experienced carpenter wouldnt pound screws into wood with a hammer or use sandpaper to cut a two-by-four in half, an experienced programmer needs to choose the right tools for every job. As well see repeatedly throughout the following chapters, every data structure comes with tradeoffs. Saws cut through wood more effectively than sandpaper but create coarse edges. There is no single data structure that is perfect for every possible use case, but this is what makes computer science and the development of algorithms so interesting. A good computer scientist must understand how different data structures behave in order to determine where they can be best used.
This book focuses on a few canonical data structures and uses them to explore fundamental themes in computational thinking. Each of these data structures is a useful exemplar of a more general class of data structures and of a conceptual approach. For example, B-trees demonstrate one approach to the problems of keeping search trees balanced and optimizing for expensive memory accesses. I discuss the tradeoffs between memory usage and accuracy with Bloom filters; the use of randomization with skip lists; and how to capture multidimensional structure with grids, quadtrees, or k-d trees. As such, this book is neither an introduction to programming, a comprehensive anthology of data structures, nor a full analysis of brewing coffee (although we will touch repeatedly on this important topic). Our goals are differentto develop mental tools that apply across a range of specific problems and programming languages.
Intended Audience
This book is for anyone who wants to learn more about the thinking behind the data structures that lie at the heart of computer science. I assume such basic familiarity with programming as can be expected after taking an introductory course, participating in a boot camp, or working through a beginners programming book. Readers should be familiar with fundamental programming concepts such as variables, loops, and conditional statements. Some more adventurous readers might even have coded up some of the data structures or algorithms in this book already or might do so as they read through it. However, you wont need to know the specific details of particular programming languages or algorithms.