Sean T. Allen - Storm Applied: Strategies for real-time event processing

Year: 2015. Publisher: Manning Publications.

Summary

Storm Applied is a practical guide to using Apache Storm for the real-world tasks associated with processing and analyzing real-time data streams. This immediately useful book starts by building a solid foundation of Storm essentials so that you learn how to think about designing Storm solutions the right way from day one. But it quickly dives into real-world case studies that will bring the novice up to speed with productionizing Storm.

Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the Technology

It's hard to make sense out of data when it's coming at you fast. Like Hadoop, Storm processes large amounts of data, but it does so reliably and in real time, guaranteeing that every message will be processed. Storm allows you to scale with your data as it grows, making it an excellent platform to solve your big data problems.

About the Book

Storm Applied is an example-driven guide to processing and analyzing real-time data streams. This immediately useful book starts by teaching you how to design Storm solutions the right way. Then, it quickly dives into real-world case studies that show you how to scale a high-throughput stream processor, ensure smooth operation within a production cluster, and more. Along the way, you'll learn to use Trident for stateful stream processing, along with other tools from the Storm ecosystem.

This book moves through the basics quickly. While prior experience with Storm is not assumed, some experience with big data and real-time systems is helpful.

What's Inside

  • Mapping real problems to Storm components
  • Performance tuning and scaling
  • Practical troubleshooting and debugging
  • Exactly-once processing with Trident

About the Authors

Sean Allen, Matthew Jankowski, and Peter Pathirana lead the development team for a high-volume, search-intensive commercial web application at TheLadders.

Table of Contents

  1. Introducing Storm
  2. Core Storm concepts
  3. Topology design
  4. Creating robust topologies
  5. Moving from local to remote topologies
  6. Tuning in Storm
  7. Resource contention
  8. Storm internals
  9. Trident

Storm Applied: Strategies for real-time event processing
Sean T. Allen, Matthew Jankowski, and Peter Pathirana

Copyright

For online information and ordering of this and other Manning books, please visit www.manning.com. The publisher offers discounts on this book when ordered in quantity. For more information, please contact

Special Sales Department
Manning Publications Co.
20 Baldwin Road
PO Box 761
Shelter Island, NY 11964
Email: orders@manning.com

© 2015 by Manning Publications Co. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.

Recognizing the importance of preserving what has been written, it is Manning's policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine.

Manning Publications Co.
20 Baldwin Road
PO Box 761
Shelter Island, NY 11964

Development editor: Dan Maharry
Technical development editor: Aaron Colcord
Copyeditor: Elizabeth Welch
Proofreader: Melody Dolab
Technical proofreader: Michael Rose
Typesetter: Dennis Dalinnik
Cover designer: Marija Tudor

ISBN: 9781617291890

Printed in the United States of America

1 2 3 4 5 6 7 8 9 10 EBM 20 19 18 17 16 15

Foreword

Backend rewrites are always hard.

That's how ours began, with a simple statement from my brilliant and trusted colleague, Keith Bourgoin. We had been working on the original web analytics backend behind Parse.ly for over a year. We called it PTrack.

Parse.ly uses Python, so we built our systems atop comfortable distributed computing tools that were handy in that community, such as multiprocessing and celery. Despite our mastery of these, it seemed like every three months, we'd double the amount of traffic we had to handle and hit some other limitation of those systems. There had to be a better way.

So, we started the much-feared backend rewrite. This new scheme to process our data would use small Python processes that communicated via ZeroMQ. We jokingly called it PTrack3000, referring to the Python3000 name given to the future version of Python by the language's creator, when it was still a far-off pipe dream.

By using ZeroMQ, we thought we could squeeze more messages per second out of each process and keep the system operationally simple. But what this setup gained in operational ease and performance, it lost in data reliability.

Then, something magical happened. BackType, a startup whose progress we had tracked in the popular press,[1] was acquired by Twitter. One of the first orders of business upon being acquired was to publicly release its stream processing framework, Storm, to the world.

[1] This article, "Secrets of BackType's Data Engineers" (2011), was passed around my team for a while before Storm was released: http://readwrite.com/2011/01/12/secrets-of-backtypes-data-engineers.

My colleague Keith studied the documentation and code in detail, and realized: Storm was exactly what we needed!

It even used ZeroMQ internally (at the time) and layered on other tooling for easy parallel processing, hassle-free operations, and an extremely clever data reliability model. Though it was written in Java, it included some documentation and examples for making other languages, like Python, play nicely with the framework. So, with much glee, PTrack9000! (exclamation point required) was born: a new Parse.ly analytics backend powered by Storm.

Nathan Marz, Storm's original creator, spent some time cultivating the community via conferences, blog posts, and user forums.[2] But in those early days of the project, you had to scrape tiny morsels of Storm knowledge from the vast web.

[2] Nathan Marz wrote about his early efforts at evangelizing the project in the blog post "History of Apache Storm and lessons learned" (2014): http://nathanmarz.com/blog/history-of-apache-storm-and-lessons-learned.html.

Oh, how I wish Storm Applied, the book you're currently reading, had already been written in 2011. Although Storm's documentation on its design rationale was very strong, there were no practical guides on making use of Storm (especially in a production setting) when we adopted it. Frustratingly, despite a surge of popularity over the next three years, there were still no good books on the subject through the end of 2014!

No one had put in the significant effort required to detail how Storm components worked, how Storm code should be written, how to tune topology performance, and how to operate these clusters in the real world. That is, until now. Sean, Matthew, and Peter decided to write Storm Applied by leveraging their hard-earned production experience at TheLadders, and it shows. This will, no doubt, become the definitive practitioner's guide for Storm users everywhere.

Through their clear prose, illuminating diagrams, and practical code examples, you'll gain as much Storm knowledge in a few short days as it took my team several years to acquire. You will save yourself many stressful firefights, head-scratching moments, and painful code re-architectures.

I'm convinced that with the newfound understanding provided by this book, the next time a colleague turns to you and says, "Backend rewrites are always hard," you'll be able to respond with confidence: "Not this time."

Happy hacking!

ANDREW MONTALENTI

COFOUNDER & CTO, PARSE.LY [3]

[3] Parse.ly's web analytics system for digital storytellers is powered by Storm: http://parse.ly.

CREATOR OF STREAMPARSE, A PYTHON PACKAGE FOR STORM [4]

[4] To use Storm with Python, you can find the streamparse project on GitHub: https://github.com/Parsely/streamparse.

Preface

At TheLadders, we've been using Storm since it was introduced to the world (version 0.5.x). In those early days, we implemented solutions with Storm that supported noncritical business processes. Our Storm cluster ran uninterrupted for a long time and just worked. Little attention was paid to this cluster, as it never really had any problems. It wasn't until we started identifying more business cases where Storm was a good fit that we started to experience problems. Contention for resources in production, not having a great understanding of how things were working under the covers, sub-optimal performance, and a lack of visibility into the overall health of the system were all issues we struggled with.

This prompted us to focus a lot of time and effort on learning much of what we present in this book. We started with gaining a solid understanding of the fundamentals of Storm, which included reading (and rereading many times) the existing Storm documentation, while also digging into the source code. We then identified some best practices for how we liked to design solutions using Storm. We added better monitoring, which enabled us to troubleshoot and tune our solutions in a much more efficient manner.
