Anthony Virtuoso - Serverless Analytics with Amazon Athena: Query structured, unstructured, or semi-structured data in seconds without setting up any infrastructure

Year: 2021. Publisher: Packt Publishing.

Get more from your data with Amazon Athena's ease of use, interactive performance, and pay-per-query pricing

Key Features
  • Explore the promising capabilities of Amazon Athena and Athena's Query Federation SDK
  • Use Athena to prepare data for common machine learning activities
  • Cover security considerations and best practices for setting up connectivity between your application and Athena
Book Description

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using SQL, without needing to manage any infrastructure.

This book begins with an overview of the serverless analytics experience offered by Athena and teaches you how to build and tune an S3 data lake using Athena, including how to structure your tables using open-source file formats like Parquet. You'll learn how to build, secure, and connect to a data lake with Athena and Lake Formation. Next, you'll cover key tasks such as ad hoc data analysis, working with ETL pipelines, monitoring and alerting on KPI breaches using CloudWatch Metrics, running customizable connectors with AWS Lambda, and more. Moving on, you'll work through easy integrations, troubleshooting and tuning common Athena issues, and the most common reasons for query failure. You will also review tips to help diagnose and correct failing queries in your pursuit of operational excellence. Finally, you'll explore advanced concepts such as Athena Query Federation and Athena ML to generate powerful insights without needing to touch a single server.

By the end of this book, you'll be able to build and use a data lake with Amazon Athena to add data-driven features to your app and perform the kind of ad hoc data analysis that often precedes many of today's ML modeling exercises.

What you will learn
  • Secure and manage the cost of querying your data
  • Use Athena ML and User Defined Functions (UDFs) to add advanced features to your reports
  • Write your own Athena Connector to integrate with a custom data source
  • Discover your datasets on S3 using AWS Glue Crawlers
  • Integrate Amazon Athena into your applications
  • Set up Identity and Access Management (IAM) policies to limit access to tables and databases in Glue Data Catalog
  • Add an Amazon SageMaker Notebook to your Athena queries
  • Get to grips with using Athena for ETL pipelines
Who this book is for

Business intelligence (BI) analysts, application developers, and system administrators who are looking to generate insights from an ever-growing sea of data while controlling costs and limiting operational burden will find this book helpful. Basic SQL knowledge is expected to make the most of this book.

Table of Contents
  1. Your First Query
  2. Introduction to Amazon Athena
  3. Key Features, Query Types, and Functions
  4. Metastores, Data Sources, and Data Lakes
  5. Securing Your Data
  6. AWS Glue and AWS Lake Formation
  7. Ad Hoc Analytics
  8. Querying Unstructured and Semi-Structured Data
  9. Serverless ETL Pipelines
  10. Building Applications with Amazon Athena
  11. Operational Excellence - Maintenance, Optimization, and Troubleshooting
  12. Athena Query Federation
  13. Athena UDFs and ML
  14. Lake Formation Advanced Topics

Serverless Analytics with Amazon Athena

Query structured, unstructured, or semi-structured data in seconds without setting up any infrastructure

Anthony Virtuoso

Mert Turkay Hocanin

Aaron Wishnick

BIRMINGHAM - MUMBAI

Serverless Analytics with Amazon Athena

Copyright © 2021 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Group Product Manager: Kunal Parikh

Publishing Product Manager: Devika Battike

Senior Editor: David Sugarman

Content Development Editor: Joseph Sunil

Technical Editor: Rahul Limbachiya

Copy Editor: Safis Editing

Project Coordinator: Aparna Nair

Proofreader: Safis Editing

Indexer: Tejal Soni

Production Designer: Shankar Kalbhor

First published: November 2021

Production reference: 1131021

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-80056-234-9

www.packt.com

To my wife, Cristina, thank you for the support and understanding as I spent late nights and early mornings working on this book. I also appreciate all the laughs we had over my terrible spelling. For my sons, Luca and Massimo, who worked on their own pop-up books alongside me; I'll be first in line for an advance copy of your books.

Anthony Virtuoso

I dedicate this book to my wife, Subrina, who has been incredibly supportive, and our son, Tristan, who was born while I was writing this book. Without both of you and the encouragement and love you gave me, this book would not have been possible. I also want to thank my parents, siblings, and everyone else who helped make this possible.

Mert Turkay Hocanin

Foreword

Creating a data strategy is a top priority for leading organizations. That's because with any major initiative, from creating new experiences to building new revenue streams, leaders must be able to quickly gather insights and get to the truth. Data-driven organizations seek the truth by treating data like an organizational asset, no longer the property of individual departments. They set up processes to collect and store valuable data. Their data is democratized, meaning it's available to the right people and systems that need it. And their data is used to build new and innovative products that use data and machine learning (ML) to deliver new customer experiences.

AWS offers the broadest and deepest set of services for analytics and ML, and Amazon Athena is a key pillar of our offerings. Amazon Athena is a serverless analytics service that enables customers to use standard SQL to analyze all the data in their Amazon S3 data lakes, their data warehouses, and their transactional databases, as well as data that lives on-premises, in SaaS applications, and in other clouds. In other words, with Athena, you can query all your data from a single place using a language familiar to most analysts, using any business intelligence or ML tools you'd like. It's really all about having all your data at your fingertips.

I am incredibly lucky to have worked on creating and launching virtually all of the analytics offerings from AWS over the past decade. I was part of the team that created the original vision for Athena and launched the service in 2016. We created Athena because customers wanted a way to query all their data, both the structured data from databases as well as the semi-structured and unstructured data in their data lakes and other data sources, without having to manage infrastructure or give up SQL or the standard tools they were already using. We launched Athena at re:Invent 2016 and have been iterating on and improving the service ever since.

Mert, Aaron, and Anthony were founding members of the Amazon Athena team and have played pivotal roles in defining, building, and evolving the service. They are deeply passionate engineers who love helping customers succeed with Athena and with analytics overall. At AWS, the vast majority of our roadmap is driven by working closely with our customers, understanding their requests and priorities and bringing them into our services. Mert, Aaron, and Anthony are customer-obsessed, always looking for ways to help customers get more from Athena, and they have an innate ability to teach and bring people along. I'm so grateful they chose to write this book to share their expertise with all of us.

This book, like Amazon Athena, is designed to get you up and running with queries with minimal upfront setup and work. You'll progress from running simple queries to building sophisticated, automated pipelines to work with near-real-time event data, queries to external data sources, custom functions, and more, all while learning from Mert, Aaron, and Anthony's experience working with real-world customer scenarios.

I highly recommend this book to any new or existing customers looking to transform their business with data and with Amazon Athena.

Rahul Pathak, VP, AWS Analytics

Contributors
About the authors

Anthony Virtuoso works as a principal engineer at Amazon and holds multiple patents in distributed systems, software-defined networks, and security. In his 8 years at Amazon, he has helped launch several Amazon web services, the most recent of which was Amazon Managed Blockchain. As one of the original authors of Athena Query Federation, he can often be found lurking on the Athena Federation GitHub repository answering questions and shipping bug fixes. When not at work, Anthony obsesses over a different set of customers, namely his wife and two little boys, aged 2 and 5. His kids enjoy doing science experiments with their dad, such as 3D printing toys, building with LEGO, or searching the local pond for tardigrades.

Mert Turkay Hocanin is a principal big data architect at Amazon Web Services within the AWS Glue and AWS Lake Formation services and has previously worked for several other services, including Amazon Athena, Amazon EMR, and Amazon Managed Blockchain. During his time at AWS, he has worked with several Fortune 500 companies on some of the largest data lakes in the world and was involved with the launching of three Amazon web services. Prior to being a big data architect, he was a senior software developer within Amazon's retail systems organization, building one of the earliest data lakes in the company in 2013. When he is not helping customers build data lakes, he enjoys spending time with his wife, Subrina, and son, Tristan, and exploring New York City.
