Adi Polak - Scaling Machine Learning with Spark: Distributed ML with MLlib, TensorFlow, and PyTorch

  • Book:
    Scaling Machine Learning with Spark: Distributed ML with MLlib, TensorFlow, and PyTorch
  • Author:
    Adi Polak
  • Publisher:
    O'Reilly Media
  • Year:
    2023

Scaling Machine Learning with Spark: Distributed ML with MLlib, TensorFlow, and PyTorch: summary, description and annotation

Learn how to build end-to-end scalable machine learning solutions with Apache Spark. With this practical guide, author Adi Polak introduces data and ML practitioners to creative solutions that supersede today's traditional methods. You'll learn a more holistic approach that takes you beyond specific requirements and organizational goals, allowing data and ML practitioners to collaborate and understand each other better.

Scaling Machine Learning with Spark examines several technologies for building end-to-end distributed ML workflows based on the Apache Spark ecosystem with Spark MLlib, MLflow, TensorFlow, and PyTorch. If you're a data scientist who works with machine learning, this book shows you when and why to use each technology.

You will:
  • Explore machine learning, including distributed computing concepts and terminology
  • Manage the ML lifecycle with MLflow
  • Ingest data and perform basic preprocessing with Spark
  • Explore feature engineering, and use Spark to extract features
  • Train a model with MLlib and build a pipeline to reproduce it
  • Build a data system to combine the power of Spark with deep learning
  • Get a step-by-step example of working with distributed TensorFlow
  • Use PyTorch to scale machine learning and its internal architecture

Praise for Scaling Machine Learning with Spark

If there is one book the Spark community has been craving for the last decade, it's this. Writing about the combination of Spark and AI requires broad knowledge, a deep technical skillset, and the ability to break down complex concepts so they're easy to understand. Adi delivers all of this and more while covering big data, AI, and everything in between.

Andy Petrella, founder at Kensu and author of Fundamentals of Data Observability (O'Reilly)

Scaling Machine Learning with Spark is a wealth of knowledge for data and ML practitioners, providing a holistic and creative approach to building end-to-end scalable machine learning solutions. The author's expertise and knowledge, combined with a focus on collaboration and understanding, makes this book a must-read for anyone in the industry.

Noah Gift, Duke executive in residence

Adi's book is without any doubt a good reference and resource to have beside you when working with Spark and distributed ML. You will learn best practices she has to share along with her experience working in the industry for many years. Worth the investment and time reading it.

Laura Uzcategui, machine learning engineer at TalentBait

This book is an amazing synthesis of knowledge and experience. I consider it essential reading for both novice and veteran machine learning engineers. Readers will deepen their understanding of general principles for machine learning in distributed systems while simultaneously engaging with the technical details required to integrate and scale the most widely used tools of the trade including Spark, PyTorch, Tensorflow.

Matthew Housley, CTO and coauthor of Fundamentals of Data Engineering (O'Reilly)

Adi's done a wonderful job at creating a very readable, practical, and insanely detailed deep dive into machine learning with Spark.

Joe Reis, coauthor of Fundamentals of Data Engineering (O'Reilly) and recovering data scientist

Scaling Machine Learning with Spark

by Adi Polak

Copyright 2023 Adi Polak. All rights reserved.

Printed in the United States of America.

Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (https://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

  • Acquisitions Editor: Nicole Butterfield
  • Development Editor: Jill Leonard
  • Production Editor: Jonathon Owen
  • Copyeditor: Rachel Head
  • Proofreader: Piper Editorial Consulting, LLC
  • Indexer: Potomac Indexing, LLC
  • Interior Designer: David Futato
  • Cover Designer: Karen Montgomery
  • Illustrator: Kate Dullea
  • March 2023: First Edition
Revision History for the First Edition
  • 2023-03-02: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781098106829 for release details.

The O'Reilly logo is a registered trademark of O'Reilly Media, Inc. Scaling Machine Learning with Spark, the cover image, and related trade dress are trademarks of O'Reilly Media, Inc.

The views expressed in this work are those of the author and do not represent the publisher's views. While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

978-1-098-10682-9

[LSI]

Preface

Welcome to Scaling Machine Learning with Spark: Distributed ML with MLlib, TensorFlow, and PyTorch. This book aims to guide you in your journey as you learn more about machine learning (ML) systems. Apache Spark is currently the most popular framework for large-scale data processing. It has numerous APIs implemented in Python, Java, and Scala and is used by many powerhouse companies, including Netflix, Microsoft, and Apple. PyTorch and TensorFlow are among the most popular frameworks for machine learning. Combining these tools, which are already in use in many organizations today, allows you to take full advantage of their strengths.

Before we get started, though, perhaps you are wondering why I decided to write this book. Good question. There are two reasons. The first is to support the machine learning ecosystem and community by sharing the knowledge, experience, and expertise I have accumulated over the last decade working as a machine learning algorithm researcher, designing and implementing algorithms to run on large-scale data. I have spent most of my career working as a data infrastructure engineer, building infrastructure for large-scale analytics with all sorts of formatting, types, schemas, etc., and integrating knowledge collected from customers, community members, and colleagues who have shared their experience while brainstorming and developing solutions. Our industry can use such knowledge to propel itself forward at a faster rate, by leveraging the expertise of others. While not all of this book's content will be applicable to everyone, much of it will open up new approaches for a wide array of practitioners.

This brings me to my second reason for writing this book: I want to provide a holistic approach to building end-to-end scalable machine learning solutions that extends beyond the traditional approach. Today, many solutions are customized to the specific requirements of the organization and specific business goals. This will most likely continue to be the industry norm for many years to come. In this book, I aim to challenge the status quo and inspire more creative solutions while explaining the pros and cons of multiple approaches and tools, enabling you to leverage whichever tools are used in your organization and get the best of all worlds. My overall goal is to make it simpler for data and machine learning practitioners to collaborate and understand each other better.

Who Should Read This Book?

This book is designed for machine learning practitioners with previous industry experience who want to learn about Apache Spark's MLlib and increase their understanding of the overall system and flow. It will be particularly relevant to data scientists and machine learning engineers, but MLOps engineers, software engineers, and anyone interested in learning about or building distributed machine learning models and building pipelines with MLlib, distributed PyTorch, and TensorFlow will also find value. Technologists who understand high-level concepts of working with machine learning and want to dip their feet into the technical side as well should also find the book interesting and accessible.

Do You Need Distributed Machine Learning?

As with every good thing, it depends. If you have small datasets that fit into your machine's memory, the answer is no. If at some point you will need to scale out your code and make sure you can train a model on a larger dataset that does not fit into a single machine's memory, then yes.
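
As a rough illustration of the rule of thumb above, the following Python sketch compares an estimated dataset size against the memory available on a single machine. The function names, the `overhead_factor` parameter, and its default of 3.0 are assumptions made for this example, not values from the book. In-memory representations are often larger than the on-disk size, so the multiplier is only a placeholder to tune for your own workload.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def fits_in_memory(dataset_bytes: int, memory_bytes: int,
                   overhead_factor: float = 3.0) -> bool:
    """Return True if the dataset, inflated by overhead_factor, fits in memory.

    overhead_factor is a hypothetical multiplier covering deserialization,
    intermediate copies, and model state; 3.0 is an illustrative default.
    """
    if dataset_bytes < 0 or memory_bytes <= 0 or overhead_factor <= 0:
        raise ValueError("sizes must be non-negative and factors positive")
    return dataset_bytes * overhead_factor <= memory_bytes


def needs_distributed_training(dataset_bytes: int, memory_bytes: int,
                               overhead_factor: float = 3.0) -> bool:
    """Apply the rule of thumb: distribute only when one machine is not enough."""
    return not fits_in_memory(dataset_bytes, memory_bytes, overhead_factor)


if __name__ == "__main__":
    # A 2 GiB dataset on a 16 GiB machine: single-machine training is enough.
    print(needs_distributed_training(2 * GIB, 16 * GIB))
    # A 200 GiB dataset on a 64 GiB machine: consider scaling out.
    print(needs_distributed_training(200 * GIB, 64 * GIB))
```

A check like this only answers the memory question. It says nothing about training time or the operational cost of running a cluster, which also belong in the decision.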

It is often better to use the same tools across the software development lifecycle, from the local development environment to staging and production. Take into consideration, though, that this also introduces other complexities involved in managing a distributed system, which typically will be handled by a different team in your organization. It's a good idea to have a common language to collaborate with your colleagues.
