Praise for Presto: The Definitive Guide
This book provides a great introduction to Presto and teaches you everything you need to know to start your successful usage of Presto.
Dain Sundstrom and David Phillips, Creators of the Presto Projects and Founders of the Presto Software Foundation
Presto plays a key role in enabling analysis at Pinterest. This book covers the Presto essentials, from use cases through how to run Presto at massive scale.
Ashish Kumar Singh, Tech Lead, Bigdata Query Processing Platform, Pinterest
Presto has set the bar in both community-building and technical excellence for lightning-fast analytical processing on stored data in modern cloud architectures. This book is a must-read for companies looking to modernize their analytics stack.
Jay Kreps, Cocreator of Apache Kafka, Cofounder and CEO of Confluent
Presto has saved us all, both in academia and industry, countless hours of work, allowing us all to avoid having to write code to manage distributed query processing. We're so grateful to have a high-quality open source distributed SQL engine to start from, enabling us to focus on innovating in new areas instead of reinventing the wheel for each new distributed data system project.
Daniel Abadi, Professor of Computer Science, University of Maryland, College Park
Presto: The Definitive Guide
by Matt Fuller, Manfred Moser, and Martin Traverso
Copyright © 2020 Matt Fuller, Martin Traverso, and Simpligility Technologies Inc. All rights reserved.
Printed in the United States of America.
Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.
- Acquisition Editor: Jonathan Hassell
- Development Editor: Michele Cronin
- Production Editor: Elizabeth Kelly
- Copyeditor: Sharon Wilkey
- Proofreader: Piper Editorial
- Indexer: Potomac Indexing, LLC
- Interior Designer: David Futato
- Cover Designer: Karen Montgomery
- Illustrator: Rebecca Demarest
- April 2020: First Edition
Revision History for the First Edition
- 2020-04-03: First release
See http://oreilly.com/catalog/errata.csp?isbn=9781492044277 for release details.
The O'Reilly logo is a registered trademark of O'Reilly Media, Inc. Presto: The Definitive Guide, the cover image, and related trade dress are trademarks of O'Reilly Media, Inc.
The views expressed in this work are those of the authors, and do not represent the publisher's views. While the publisher and the authors have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.
978-1-492-04427-7
[LSI]
Foreword
What a tremendous ride it has been so far! Looking back at the time when we started the Presto project at Facebook in 2012, we certainly thought that we were going to create something useful. We always planned to have a successful open source project and community, and we released Presto in 2013 under the Apache License.
How far Presto has come since then, however, is beyond what we imagined. We are proud of the project community's accomplishments, but, more importantly, we are very humbled by all the positive feedback and help we have received.
Presto has grown tremendously and provided a lot of value to its large community of users. You can find fellow Presto community members across the globe, and developers in Brazil, Canada, China, Germany, India, Israel, Japan, Poland, Singapore, the United States, the United Kingdom, and other countries.
Launching the Presto Software Foundation in early 2019 was another major milestone. The not-for-profit organization is dedicated to the advancement of the Presto open source distributed SQL engine. The foundation is committed to ensuring that the project remains open, collaborative, and independent for decades to come.
Now, about one year after the launch of the foundation, we can look back at an accelerated rate of impressive contributions from a larger community.
We are pleased that Matt, Manfred, and Martin created this book about Presto with the help of O'Reilly. It provides a great introduction to Presto and teaches you everything you need to know to start using it successfully.
Enjoy the journey into the depths of Presto and the related world of business intelligence, reporting, dashboard creation, data warehousing, data mining, machine learning, and beyond.
Of course, make sure to dive into the additional resources and help we offer on the Presto website at https://prestosql.io, the community chat, the source repository, and beyond.
Welcome to the Presto community!
Dain Sundstrom and David Phillips
Creators of the Presto Projects and Founders of the Presto Software Foundation
Preface
About the Book
Presto: The Definitive Guide is the first and foremost book about the Presto distributed query engine. The book is aimed at beginners and existing users of Presto alike. Ideally, you have some understanding of databases and SQL, but if not, you can divert from reading and look things up while working your way through this book. No matter your level of expertise, we are sure that you'll learn something new from this book.
The first part of the book introduces you to Presto and then helps you get up and running quickly so you can start learning how to use it. This includes installation and first use of the command-line interface as well as many client- and web-based applications, such as SQL database management or dashboard and reporting tools, using the JDBC driver.
The second part of the book advances your knowledge and includes details about the Presto architecture, cluster deployment, many connectors to data sources, and a lot of information about the main power of Presto: querying any data source with SQL.
The third part of the book rounds out the content with further aspects you need to know when running and using a production Presto deployment. This includes Web UI usage, security configuration, and some discussion of real-world uses of Presto in other organizations.
Conventions Used in This Book
The following typographical conventions are used in this book:
Italic
Indicates new terms, URLs, email addresses, filenames, and file extensions.
Constant width
Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.
Constant width bold
Shows commands or other text that should be typed literally by the user.
Constant width italic
Shows text that should be replaced with user-supplied values or by values determined by context.
Tip
This element signifies a tip or suggestion.