Black Hat GraphQL
Attacking Next Generation APIs
by Nick Aleks and Dolev Farhi
BLACK HAT GRAPHQL. Copyright © 2023 by Nick Aleks and Dolev Farhi.
All rights reserved. No part of this work may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage or retrieval system, without the prior written permission of the copyright owner and the publisher.
First printing
27 26 25 24 23 1 2 3 4 5
ISBN-13: 978-1-7185-0284-0 (print)
ISBN-13: 978-1-7185-0285-7 (ebook)
Publisher: William Pollock
Managing Editor: Jill Franklin
Production Manager: Sabrina Plomitallo-González
Production Editor: Jennifer Kepler
Developmental Editor: Frances Saux
Cover Illustrator: Rick Reese
Interior Design: Octopod Studios
Technical Reviewer: Corey Ball
Copyeditor: Sharon Wilkey
Compositor: Maureen Forys, Happenstance Type-O-Rama
Proofreader: James Fraleigh
For information on distribution, bulk sales, corporate sales, or translations, please contact No Starch Press, Inc. directly at info@nostarch.com or:
No Starch Press, Inc.
245 8th Street, San Francisco, CA 94103
phone: 1.415.863.9900
www.nostarch.com
Library of Congress Control Number: 2022046393
No Starch Press and the No Starch Press logo are registered trademarks of No Starch Press, Inc. Other product and company names mentioned herein may be the trademarks of their respective owners. Rather than use a trademark symbol with every occurrence of a trademarked name, we are using the names only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark.
The information in this book is distributed on an "As Is" basis, without warranty. While every precaution has been taken in the preparation of this work, neither the authors nor No Starch Press, Inc. shall have any liability to any person or entity with respect to any loss or damage caused or alleged to be caused directly or indirectly by the information contained in it.
About the Authors
Nick Aleks is a leader in Toronto's cybersecurity community and a distinguished and patented security engineer, speaker, and researcher. He is currently the senior director of security at Wealthsimple; leads his own security firm, ASEC.IO; and is a senior advisory board member for HackStudent, George Brown College, and the University of Guelph's Master of Cybersecurity and Threat Intelligence program. A founder of DEFCON Toronto (DC416), he specializes in offensive security and penetration testing and has over 10 years of experience hacking everything from websites to safes, locks, cars, drones, and even smart buildings.
Dolev Farhi is a security engineer and author with extensive experience leading security engineering teams in the fintech and cybersecurity industries. Currently, he is a distinguished security engineer at Palo Alto Networks, building defenses for the largest cybersecurity company in the world. He has worked for several fintech and security firms and provided training for official Linux certification tracks. He is also one of the founders of DEFCON Toronto (DC416), a popular Toronto-based hacker group. In his spare time, he enjoys researching vulnerabilities in IoT devices, building open source offensive security tools, participating in and building CTF challenges, and contributing exploits to Exploit-DB.
About the Technical Reviewer
Corey Ball is the author of Hacking APIs (No Starch Press, 2022) and senior manager of penetration testing at Moss Adams. He has over 12 years of experience working in IT and cybersecurity across several industries. He is the creator of the APIsec University, a free resource where anyone can learn about API security. In addition to a bachelor's degree in English and philosophy from Sacramento State University, he holds the OSCP, CCISO, CISSP, and several other certifications.
Foreword
Today, building software and systems is a lot like assembling an IKEA kitchen, on your front lawn. People are taking parsers, utilities, and other components originally intended for use with trusted data by a person on their own command line, and exposing them to the internet. With each new query language and interpreter/parser combination (GraphQL being one of the more recent), the old becomes new again.
Vulnerability classes like denial of service (DoS), injection, information disclosure, and authentication/authorization bypasses have persisted in pretty much every data format and language parsed with regular expressions over the course of my career. Some of this is because inherent weaknesses exist in the underlying technology that aren't well understood by developers of new languages. But it's more than a technology problem that makes these classes of vulnerabilities hard to solve. It's an ecosystem problem.
In most cases, because of the inherent design of the components being exposed to the internet, layering security controls on top of them is challenging to do without losing functionality or efficiency. Take regular expressions themselves: the ability to self-reference and back-reference is what makes them so powerful, but that same ability also creates an inherent DoS risk. To parse a statement, a regular expression can back-reference or self-reference as many times as necessary. Yet for an attacker, "necessary" might mean "until you pay me to stop."
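As a rough illustration of the DoS risk described above, the following Python sketch times how long a backtracking regular expression engine takes to reject a slightly malformed input. The pattern and the helper name time_failed_match are hypothetical and chosen only for illustration. The pattern relies on nested quantifiers rather than the back-references the foreword mentions; it is a common stand-in for the same "retry as many times as necessary" behavior, not an example taken from this book.

```python
import re
import time

# Hypothetical pattern for illustration: the nested quantifier lets the engine
# split a run of "a" characters into groups in exponentially many ways.
PATTERN = re.compile(r"^(a+)+$")


def time_failed_match(n: int) -> float:
    """Return the seconds spent rejecting n 'a' characters followed by one 'b'."""
    text = "a" * n + "b"  # the trailing 'b' guarantees the match must fail
    start = time.perf_counter()
    result = PATTERN.match(text)
    elapsed = time.perf_counter() - start
    assert result is None  # the engine only says "no" after trying every split
    return elapsed


if __name__ == "__main__":
    # Keep n small: the work grows very quickly as the input gets longer.
    for n in (10, 14, 18, 20):
        print(f"n={n:2d}  {time_failed_match(n):.4f}s")
```

On a backtracking engine such as the one in Python's re module, each extra character roughly doubles the work for this pattern, so an attacker who controls the input length also controls how long the server stays busy.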
Developers can reasonably assume that command line users working on their own systems will submit well-formulated requests, designed to end in computationally reasonable times. After all, who would DoS themselves, except by accident? But that foundational assumption doesn't hold true on the internet. Even for those incredibly rare people who consider and understand how online threats invalidate the fundamental design assumptions of the component they're reusing, compensating for a design decision is tricky. More commonly, people don't even know there's a problem to consider.
Then you have the fact that usability is a thing. Most of our internet-facing technology is supposed to be forgiving in the case of errors so that our lowest-common-denominator internet users can handle it. It should be autocorrecting so that errors are handled gracefully. And, at the same time, that technology needs to be secure against the most technically savvy, bored, or determined attackers. No effective self-correcting and communicative system can also keep a person from inferring that data is correct or has been corrected. A shrewd user with no prior knowledge of the system can often infer the data it contains by making a short series of educated guesses and abusing the communicative aspects of the technology. This ability to infer and then confirm is the source of many subtle information disclosure risks.
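The infer-and-confirm pattern described above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the field list, the lookup function, and the wording of the error message are hypothetical, and the "did you mean" hint is produced with the standard library's difflib.get_close_matches rather than by any real API server.

```python
import difflib

# Hypothetical field names that the service never publishes to its users.
KNOWN_FIELDS = ["username", "password", "email", "apiToken"]


def lookup(field: str) -> str:
    """Answer a field request the way a forgiving, self-correcting API might."""
    if field in KNOWN_FIELDS:
        return "ok"
    # Helpful for honest users, but each hint confirms a guess for an attacker.
    hints = difflib.get_close_matches(field, KNOWN_FIELDS, n=1)
    if hints:
        return f'Unknown field "{field}". Did you mean "{hints[0]}"?'
    return f'Unknown field "{field}".'


if __name__ == "__main__":
    # A short series of educated guesses, with no prior knowledge of the fields.
    for guess in ("passwd", "user", "token", "zzz"):
        print(lookup(guess))
```

The design tension is visible in the single hints line: removing it closes the leak but also removes the error correction that less experienced users depend on.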
In a broader sense, many of the specifications for these data formats and languages are insecure as a consequence of the design process. Standards for things like PDFs and images often include a mishmash of requirements dictated by the biggest vendors at the time that the standard was made. The core specification contains what the vendors could agree on, while optional items accommodate each vendor's peculiar features and design decisions. The patchwork created by committees with vested interests doesn't exactly inspire the group to think about security. And as data becomes the new currency, committees are almost deliberately adding privacy and security risks to standards so that companies can continue to perform data collection (and profit accordingly).