Jenny Kim - Interactive Spark using PySpark
Year: 2016. Publisher: O'Reilly Media, Inc. Genre: Computer.
- Book: Interactive Spark using PySpark
- Author: Jenny Kim
- Publisher: O'Reilly Media, Inc.
- Genre: Computer
- Year: 2016
- Rating: 3 / 5
Interactive Spark using PySpark: summary, description and annotation
Abstract: Apache Spark is an in-memory framework that allows data scientists to explore and interact with big data much more quickly than with Hadoop. Python users can work with Spark using an interactive shell called PySpark.

Why is it important?
PySpark makes the large-scale data processing capabilities of Apache Spark accessible to data scientists who are more familiar with Python than with Scala or Java. It also allows reuse of a wide variety of Python libraries for machine learning, data visualization, numerical analysis, etc.

What you'll learn, and how you can apply it
- Compare the different components provided by Spark, and what use cases they fit.
- Learn how to use RDDs (resilient distributed datasets) with PySpark.
- Write Spark applications in Python and submit them to the cluster as Spark jobs.
- Get an introduction to the Spark computing framework.
- Apply this approach to a worked example to determine the most frequent airline delays in a specific month and year.

This lesson is for you because
- You're a data scientist, familiar with Python coding, who needs to get up and running with PySpark.
- You're a Python developer who needs to leverage the distributed computing resources available on a Hadoop cluster, without learning Java or Scala first.

Prerequisites
- Familiarity with writing Python applications
- Some familiarity with bash command-line operations
- Basic understanding of how to use simple functional programming constructs in Python, such as closures, lambdas, maps, etc.

Materials or downloads needed in advance
- Apache Spark

This lesson is taken from Data Analytics with Hadoop by Jenny Kim and Benjamin Bengfort.