Richard Nuckolls - Azure Data Engineering: Real-time, streaming, and batch analytics MEAP V08

  • Book:
    Azure Data Engineering: Real-time, streaming, and batch analytics MEAP V08
  • Author:
    Richard Nuckolls
  • Publisher:
    Manning Publications Co.
  • Genre:
    Computer
  • Year:
    2020

MEAP VERSION 8
About this MEAP

You can download the most up-to-date version of your electronic books from your Manning Account at .

Manning Publications Co. We welcome reader comments about anything in the manuscript - other than typos and other simple mistakes. These will be cleaned up during production of the book by copyeditors and proofreaders.

https://livebook.manning.com/#!/book/azure-data-engineering/discussion

Welcome

Thanks for purchasing the MEAP for Azure Data Engineering: Real-time, streaming, and batch analytics.

I've spent my career working with Microsoft technologies. Chances are you have too. When my company decided to build a new analytics system, Microsoft Azure was a natural fit. But all the "Big Data" systems use Apache this and open-source that. After years of building PaaS apps in Azure, endless configuration and scattered single-use tools look like work.

Luckily, Azure offers a set of PaaS services for building high-capacity analytics systems with the easy integration and familiar tool sets we want. This book covers these building-block capabilities: storage, ingestion, stream processing, batch processing, querying, and automation. The chapters cover specific services, including Azure Storage, Data Lake, Event Hubs, Data Lake Analytics, Stream Analytics, SQL Data Warehouse, and Data Factory. Examples in each chapter add pieces to build a working analytics system and teach the fundamentals of using each technology. Since these are Microsoft technologies, you'll get to choose between GUI and command-line.

To get the most benefit from this book, you'll need established skill in writing SQL queries and managing SQL databases. You'll want some basic coding skills too, so PowerShell scripts and short C# methods won't be unmanageable. Finally, you'll need a subscription to Azure to follow along with the examples. But don't worry, the examples won't cost more than a few cups of coffee.

Please leave questions and comments in Manning's liveBook Discussion Forum. Tell me when you've lost the thread. After working with these services so long, it's easy to forget how the breadth of options challenges the newcomer. You can help me introduce a new group of data engineers to Azure!

Richard Nuckolls

Brief Table of Contents

Distributed SQL with Azure SQL Data Warehouse

Data movement in Azure SQL Data Warehouse


What is data engineering

This chapter covers:

  • What is data engineering?

  • What do data engineers do?

  • How does Microsoft define data engineering?

  • What tools does Azure provide for data engineering?

Data collection is on the rise. More and more systems are generating more and more data every day. According to Marz,

More than 30,000 gigabytes of data are generated every second, and the rate of data creation is only accelerating.

-- Nathan Marz, Big Data (2)

Increased connectivity has led to increased sophistication and user interaction in software systems. New deployments of connected "smart" electronics also rely on increased connectivity. In response, businesses have increased the amount of data being collected, stored, and aggregated by their products. This has led to an enormous increase in compute and storage infrastructure. The scale of data collection and processing requires a change in strategy for processing the data. Businesses are challenged to find experienced engineers and programmers to develop the systems and processes to handle this data. The new role of data engineer has evolved to fill this need. The collection, preparation, and querying of this mountain of data using Azure services is the subject of this book. The reader will be able to build working data analytics systems in Azure after completing the book.

According to Gartner,

Big Data is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery, and process optimization.

-- Mark A. Beyer, The Importance of Big Data

1.1 What is data engineering?

Collecting the data seems like a simple activity. Take reporting website traffic as an example. A single user, visiting a site in a web browser, requests a page. A simple site could respond with an HTML file, a CSS file, and an image. This exchange could represent one event, three events, or four events. What if there is a page redirect? That is another event. What if we want to log the time taken to query a database? What if we retrieve some items from cache? All of these pieces of data are commonly logged data points today.
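To make the one, three, or four event count concrete, here is a minimal sketch in Python. The `Event` type, the `log_page_load` function, and the granularity names are hypothetical, and the reading of "four events" as one request plus three file responses is an assumption rather than something the text spells out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    kind: str      # "page_view", "request", or "response"
    resource: str  # the file or page the event refers to


def log_page_load(resources: List[str], granularity: str) -> List[Event]:
    """Return the events logged for one page load at a chosen granularity.

    resources[0] is treated as the page itself (the HTML file).
    """
    page = resources[0]
    if granularity == "page":
        # One event: the whole page load counts as a single page view.
        return [Event("page_view", page)]
    if granularity == "files":
        # One event per file the server returned.
        return [Event("response", r) for r in resources]
    if granularity == "request_and_files":
        # The user's request plus one event per file returned.
        return [Event("request", page)] + [Event("response", r) for r in resources]
    raise ValueError(f"unknown granularity: {granularity}")


if __name__ == "__main__":
    files = ["index.html", "site.css", "logo.png"]
    for level in ("page", "files", "request_and_files"):
        print(level, len(log_page_load(files, level)))
```

The same page load yields a different event volume depending on what the logging design counts, which is why the choice has to be made before the pipeline is sized.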

Now add more user interaction, like a comparison page with multiple sliders. Each move of a slider logs a value. Tracking user mouse movement returns hundreds of coordinates. Consider a connected sensor with a 100 Hz sample rate. It can easily record over 8 million measurements a day. When you start to scale to thousands and tens of thousands of simultaneous events, every point in the pipeline must be optimized for speed until the data comes to rest. Once at rest, the data must remain secure.
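The sensor figure follows from simple arithmetic: 100 samples per second times 86,400 seconds in a day. A short Python sketch of the calculation is below. The function name and the fleet size of 1,000 sensors are illustrative assumptions, not figures from the text.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def measurements_per_day(sample_rate_hz: int, sensors: int = 1) -> int:
    """Number of measurements recorded in one day at a fixed sample rate."""
    return sample_rate_hz * SECONDS_PER_DAY * sensors


if __name__ == "__main__":
    # A single 100 Hz sensor, as in the text.
    print(measurements_per_day(100))                # 8640000
    # A hypothetical fleet of 1,000 such sensors.
    print(measurements_per_day(100, sensors=1000))  # 8640000000
```

One sensor already produces 8.64 million rows a day, so a modest fleet reaches billions of rows daily.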

Data engineering is the practice of building data storage and processing systems. Robert Chang, in his "A Beginner's Guide to Data Engineering," describes the work as designing, building, and maintaining data warehouses. Data engineering creates scalable systems that allow analysts and data scientists to extract meaningful information from the data.

1.2 What do data engineers do?

Most businesses have multiple sources generating data. Manufacturing companies track the output of their machines, the output of their employees, and the output of their shipping departments. Software companies track their user actions, their software bugs per release, and their developer output per day. Service companies track the number of sales calls, time to complete tasks, usage of parts stores, and cost per lead. Some of this is small scale; some of it is large scale.

Analysts and managers might operate on narrow data sets, but large enterprises increasingly want to find efficiencies across divisions, or find root causes in multi-faceted systems failures. In order to extract value from these disparate sources of data, engineers have built large-scale storage systems as a single repository of data. A software company may implement centralized error logging. A service company may integrate its CRM, billing, and finance systems. Engineers need to plan for the ingestion pipeline, the storage systems, and the reporting services across multiple groups of stakeholders.

As a first step, data consolidation often means a large relational database. Analysts review reports, CSV files, and even spreadsheets in Excel, in an attempt to get clean and consistent data. Often developers or database administrators prepare scripts to import the data into databases. In the best case, experienced database administrators define a common schema and plan partitioning and indexing. The database enters production. Data collection commences in earnest.
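An import script of the kind described might look like the sketch below. It is only an illustration under stated assumptions: SQLite (from the Python standard library) stands in for a production relational database, the `events` table, its columns, and the sample rows are hypothetical, and partitioning is left out because SQLite does not offer it.

```python
import csv
import io
import sqlite3

# Hypothetical sample of consolidated data from several sources.
SAMPLE_CSV = """event_time,source,metric,value
2020-01-01T00:00:00,machine-1,units_out,42
2020-01-01T00:01:00,machine-2,units_out,37
2020-01-01T00:02:00,crm,sales_calls,5
"""


def import_events(conn: sqlite3.Connection, csv_file) -> int:
    """Load CSV rows into a common schema and return the row count."""
    # A common schema shared by every source.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS events (
               event_time TEXT NOT NULL,
               source     TEXT NOT NULL,
               metric     TEXT NOT NULL,
               value      REAL NOT NULL
           )"""
    )
    # An index planned for the expected query pattern: by source, then time.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS ix_events_source_time "
        "ON events (source, event_time)"
    )
    rows = [
        (r["event_time"], r["source"], r["metric"], float(r["value"]))
        for r in csv.DictReader(csv_file)
    ]
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(import_events(conn, io.StringIO(SAMPLE_CSV)))
```

Defining the schema and index before the first load keeps every source in the same shape and keeps the common query path fast as volume grows.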
