
Jun Xu - Block Trace Analysis and Storage System Optimization: A Practical Approach with MATLAB/Python Tools


  • Book:
    Block Trace Analysis and Storage System Optimization: A Practical Approach with MATLAB/Python Tools
  • Author:
    Jun Xu
  • Publisher:
    Apress
  • Genre:
    Computer
  • Year:
    2019


Understand the fundamental factors of data storage system performance and master block trace analysis, an essential analytical skill, using MATLAB and Python tools. You will increase your productivity and learn the best techniques for specific tasks (such as analyzing the IO pattern quantitatively, identifying the storage system bottleneck, and designing the cache policy).

In the new era of IoT, big data, and cloud systems, better performance and higher density of storage systems have become crucial. To increase data storage density, new techniques have evolved, and hybrid and parallel access techniques, together with specially designed IO scheduling and data migration algorithms, are being deployed to develop high-performance data storage solutions. Among the various storage system performance analysis techniques, IO event trace analysis (block-level trace analysis in particular) is one of the most common approaches for system optimization and design. However, the task of completing a systematic survey is challenging and very few works on this topic exist.

Block Trace Analysis and Storage System Optimization brings together theoretical analysis (such as IO qualitative properties and quantitative metrics) and practical tools (such as trace parsing, analysis, and results reporting perspectives). The book provides content on block-level trace analysis techniques, and includes case studies to illustrate how these techniques and tools can be applied in real applications (such as SSHD, RAID, Hadoop, and Ceph systems).

What You'll Learn

  • Understand the fundamental factors of data storage system performance

  • Master an essential analytical skill using block trace via various applications

  • Distinguish how the IO pattern differs in the block level from the file level

  • Know how the sequential HDFS request becomes fragmented in final storage devices

  • Perform trace analysis tasks with a tool based on the MATLAB and Python platforms

Who This Book Is For

IT professionals interested in storage system performance optimization: network administrators, data storage managers, data storage engineers, storage network engineers, systems engineers

Jun Xu
Block Trace Analysis and Storage System Optimization: A Practical Approach with MATLAB/Python Tools
Jun Xu
Singapore, Singapore

Any source code or other supplementary material referenced by the author in this book is available to readers on GitHub via the book's product page, located at www.apress.com/9781484239278. For more detailed information, please visit www.apress.com/source-code.

ISBN 978-1-4842-3927-8 e-ISBN 978-1-4842-3928-5
https://doi.org/10.1007/978-1-4842-3928-5
Library of Congress Control Number: 2018964058
© Jun Xu 2018
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
Trademarked names, logos, and images may appear in this book. Rather than use a trademark symbol with every occurrence of a trademarked name, logo, or image we use the names, logos, and images only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.
While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.
Distributed to the book trade worldwide by Springer Science+Business Media New York, 233 Spring Street, 6th Floor, New York, NY 10013. Phone 1-800-SPRINGER, fax (201) 348-4505, e-mail orders-ny@springer-sbm.com, or visit www.springeronline.com. Apress Media, LLC is a California LLC and the sole member (owner) is Springer Science + Business Media Finance Inc (SSBM Finance Inc). SSBM Finance Inc is a Delaware corporation.

To Grace, Alexander, and Arthur.

Introduction

In the new era of IoT, big data, and cloud systems, better performance and higher density of storage systems have become more crucial in many applications.

To increase data storage density, new techniques have evolved, including shingled magnetic recording (SMR) and heat-assisted magnetic recording (HAMR) for HDDs, and 3D Phase Change Memory (PCM) and Resistive RAM (ReRAM) for SSDs. Furthermore, some hybrid and parallel access techniques, together with specially designed IO scheduling and data migration algorithms, have been deployed to develop high-performance data storage solutions.

Among the various storage system performance analysis techniques, IO event trace analysis (block-level trace analysis in particular) is one of the most common approaches for system optimization and design. However, the task of completing a systematic survey is challenging and very few works on this topic exist. Some books provide theoretical fundamentals without enough practical analysis in physical systems, and others discuss the performance of some specific storage systems without proposing a tool that can be applied widely.

To fill this gap, this book brings together IO properties and metrics, trace parsing, and result reporting perspectives, based on the MATLAB and Python platforms. It provides self-contained content on block-level trace analysis techniques, and it includes typical case studies to illustrate how these techniques and tools can be applied in real applications such as SSHD, RAID, Hadoop, and Ceph systems.

This book starts with an introduction in Chapter 1, which provides the background of data storage systems and general trace analysis. I show that the wide application of block storage devices motivates the intensive study of various block-level workload properties.

Chapter 2 gives an overview of traces, in particular block-level traces. After introducing the common workload properties, I discuss the trace metrics in two categories: the basic ones and the advanced ones.
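As a rough illustration of what basic metrics over a block-level trace can look like, the following Python sketch computes a few commonly used ones. The `Request` record, the function name, and the choice of metrics (read ratio, mean request size, strict sequentiality, IOPS) are assumptions made for this example; they are not taken from the book's own tool.

```python
from typing import Dict, List, NamedTuple


class Request(NamedTuple):
    """One block-level request (hypothetical record layout for this sketch)."""
    time: float    # arrival time in seconds
    lba: int       # starting logical block address
    blocks: int    # request length in blocks
    is_read: bool  # True for a read, False for a write


def basic_metrics(trace: List[Request]) -> Dict[str, float]:
    """Compute a few simple workload metrics from a time-ordered trace."""
    if not trace:
        raise ValueError("empty trace")
    n = len(trace)
    reads = sum(1 for r in trace if r.is_read)
    total_blocks = sum(r.blocks for r in trace)
    # A request counts as sequential only if it starts exactly where
    # the previous request ended (a deliberately strict definition).
    sequential = sum(
        1 for prev, cur in zip(trace, trace[1:])
        if cur.lba == prev.lba + prev.blocks
    )
    duration = trace[-1].time - trace[0].time
    return {
        "read_ratio": reads / n,
        "mean_blocks": total_blocks / n,
        "sequential_ratio": sequential / (n - 1) if n > 1 else 0.0,
        "iops": n / duration if duration > 0 else float("nan"),
    }


# Made-up four-request trace, for illustration only.
sample = [
    Request(0.00, 100, 8, True),
    Request(0.01, 108, 8, True),
    Request(0.02, 5000, 16, False),
    Request(0.04, 116, 8, True),
]
metrics = basic_metrics(sample)
print(metrics)
```

Looser definitions of sequentiality (for example, allowing small gaps or interleaved streams) are equally plausible; the strict one is used here only because it is the simplest to state.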

In Chapter 3, I present the ways to collect block-level traces with both hardware and software tools. In particular, I show how blktrace, the most popular tool on Linux systems, works in a simple setting.
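To give a feel for what a collected trace looks like once it reaches an analysis script, here is a minimal Python sketch that parses one text event line. It assumes the layout of blkparse's default output (device, CPU, sequence number, timestamp, PID, action, RWBS flags, then `sector + count`); the sample line is made up for illustration, and the function and field names are hypothetical.

```python
import re
from typing import Dict, Optional

# Assumed line layout:
#   "maj,min cpu seq time pid action rwbs sector + nsectors [process]"
_EVENT = re.compile(
    r"^\s*(?P<major>\d+),(?P<minor>\d+)\s+"
    r"(?P<cpu>\d+)\s+(?P<seq>\d+)\s+"
    r"(?P<time>\d+\.\d+)\s+(?P<pid>\d+)\s+"
    r"(?P<action>[A-Z])\s+(?P<rwbs>[A-Z]+)\s+"
    r"(?P<sector>\d+)\s+\+\s+(?P<nsectors>\d+)"
)


def parse_event(line: str) -> Optional[Dict[str, object]]:
    """Parse one event line; return None for lines that do not match
    (for example, summary lines or events without a sector range)."""
    m = _EVENT.match(line)
    if m is None:
        return None
    return {
        "device": (int(m["major"]), int(m["minor"])),
        "time": float(m["time"]),
        "action": m["action"],           # e.g. Q = queued, D = issued, C = completed
        "is_write": "W" in m["rwbs"],
        "sector": int(m["sector"]),
        "nsectors": int(m["nsectors"]),
    }


# Illustrative line only; not captured from a real system.
sample_line = "  8,0    3        1     0.000000000   697  Q   W 223490 + 8 [kjournald]"
event = parse_event(sample_line)
print(event)
```

In practice one would usually keep only a single action type (such as issue or completion events) before computing metrics, so that each request is counted once.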

In Chapter 4, I investigate the design of trace analyzers. I discuss the interactions of the workload with system components, algorithms, structure, and applications.

Case studies are the best way to learn the methodology and the corresponding tools. This book provides examples that show how the analysis can be applied to real storage system tuning, optimization, and design. Therefore, from Chapter 5 onward, I provide some typical examples of trace analysis and system optimization.

Chapter 5 presents the properties of traces from some benchmark tools, such as SPC and PCMark. I show how to capture the main characteristics and then formulate a synthetic trace generator. I also show how the cache is affected by the workload, and how a proper scheduling algorithm is designed.
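The dependence of cache behavior on the workload can be illustrated with a small simulation. The Python sketch below replays a sequence of block addresses through an LRU cache and reports the hit ratio. LRU is chosen here only as a familiar policy; it is an assumption of this example rather than the policy studied in the chapter, and the two access sequences are invented.

```python
from collections import OrderedDict
from typing import Iterable


def lru_hit_ratio(blocks: Iterable[int], capacity: int) -> float:
    """Replay block accesses through an LRU cache of `capacity` blocks."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    cache: "OrderedDict[int, None]" = OrderedDict()
    hits = total = 0
    for block in blocks:
        total += 1
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = None
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used
    return hits / total if total else 0.0


# Two invented workloads with the same footprint of four blocks.
looping = [0, 1, 2, 3] * 5                       # cyclic scan
skewed = [0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2]    # one hot block

print(lru_hit_ratio(looping, 3))
print(lru_hit_ratio(looping, 4))
print(lru_hit_ratio(skewed, 2))
```

The cyclic scan gets no hits at all until the cache holds the whole loop, while the skewed sequence already benefits from a two-block cache. Effects of this kind are why the access pattern, not just the footprint, matters when sizing a cache.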

Chapter 6 attempts to explain the mystery behind the SSHD's performance boost in SPC-1C under WCD (write cache disabled). I show from the trace how a new hybrid structure can help to improve system performance.

Chapter 7 discusses the traces from two RAID systems with different read and write properties. I illustrate that the parity structure has a big impact on the overall performance.

Chapter 8 first reviews the literature on Hadoop workload analysis. I then discuss the WD Hadoop cluster in a production environment. After that, I analyze the workload properties, in particular for SMR drives.

Chapter 9 analyzes Ceph system performance. Storage, CPU, network, and memory are discussed. I show that these components should be considered as a unified system in order to identify the performance bottleneck.

The tools used in the book are introduced in the appendix. I first introduce the tool based on MATLAB. Then, I show how this tool is converted into the Python platform.

Acknowledgments

A major component of this work came as a result of my 16 years of R&D experience in data analytics and storage systems at Western Digital, Temasek Labs, and Data Storage Institute. I would like to acknowledge Western Digital for allowing me to publish some of my job-related work. During the preparation of this book, I received support and advice from many friends and colleagues. Here I mention only a few: Dr. Jie Yu, Dr. Guoxiao Guo, Robin O'Neill, Grant Mackey, Dr. Jianyi Wang, David Chan, Wai-Ee Wong, Dr. Yi Li, Samuel Torrez, Shihua Feng, Jiang Dan, Terry Wu, Allen Samuels, Gregory Thelin, William Boyle, David Hamilton, John Clinton, Nils Larson, Karanvir Singh, Eric Lee, and Sang Huynh. In particular, Junpeng Niu, my PhD student and colleague, helped me with a few paragraphs in Chapter 1 on hybrid disks.

I would also like to thank the technical reviewers, Yunpeng Cai and Li Xia, for their very helpful comments. Deep appreciation also goes out to the editors, Susan McDermott, Rita Fernando, Laura Berendson, Amrita Stanley, Krishnan Sathyamurthy, and Joseph Quatela, for their hard work.
