
James Jeffers - Intel Xeon Phi Coprocessor High Performance Programming


  • Book:
    Intel Xeon Phi Coprocessor High Performance Programming
  • Author:
    James Jeffers
  • Publisher:
    Morgan Kaufmann
  • Genre:
    Computer
  • Year:
    2013
  • Rating:
    4 / 5

Intel Xeon Phi Coprocessor High Performance Programming: summary, description and annotation


Authors Jim Jeffers and James Reinders spent two years helping educate customers about the prototype and pre-production hardware before Intel introduced the first Intel Xeon Phi coprocessor. They have distilled their own experiences, coupled with insights from many expert customers, Intel Field Engineers, Application Engineers, and Technical Consulting Engineers, to create this authoritative first book on the essentials of programming for this new architecture and these new products.

This book is useful even before you ever touch a system with an Intel Xeon Phi coprocessor. To ensure that your applications run at maximum efficiency, the authors emphasize key techniques for programming any modern parallel computing system, whether based on Intel Xeon processors, Intel Xeon Phi coprocessors, or other high-performance microprocessors. Applying these techniques will generally increase your program performance on any system and better prepare you for Intel Xeon Phi coprocessors and the Intel MIC architecture.

  • A practical guide to the essentials of the Intel Xeon Phi coprocessor
  • Presents best practices for portable, high-performance computing and a familiar and proven threaded, scalar-vector programming model
  • Includes simple but informative code examples that explain the unique aspects of this new highly parallel and high-performance computational product
  • Covers wide vectors, many cores, many threads, and high-bandwidth cache/memory architecture


Intel Xeon Phi Coprocessor High-Performance Programming

Jim Jeffers

James Reinders


Table of Contents
Copyright

Acquiring Editor: Todd Green

Development Editor: Lindsay Lawrence

Project Manager: Mohanambal Natarajan

Designer: Mark Rogers

Morgan Kaufmann is an imprint of Elsevier

225 Wyman Street, Waltham, MA 02451, USA

Copyright © 2013 James R. Reinders and James L. Jeffers. Published by Elsevier Inc. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher's permissions policies, and arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency can be found at our website: www.elsevier.com/permissions.

This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

Notices

Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods or professional practices, or medical treatment may become necessary.

Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

Library of Congress Cataloging-in-Publication Data

Application submitted

British Library Cataloguing-in-Publication Data

A catalogue record for this book is available from the British Library

ISBN: 978-0-12-410414-3

For information on all Morgan Kaufmann publications visit our website at www.mkp.com

Printed in the United States of America

13 14 15 16 17 10 9 8 7 6 5 4 3 2 1


Foreword

Robert J. Harrison, Institute for Advanced Computational Science, Stony Brook University

I cannot think of a more exciting (that is to say, tumultuous) era in high performance computing since the introduction in the mid-1980s of massively parallel computers such as the Intel iPSC and nCUBE, followed by the IBM SP and Beowulf clusters. But now, instead of benefiting only high-performance applications parallelized with MPI or other distributed memory models, the revolution is happening within a node and is benefiting everyone, whether running a laptop or a multi-petaFLOP/s supercomputer. In sharp contrast to the GPGPUs that fired our collective imagination in the last decade, the Intel Xeon Phi product family (its first product version already reaching one teraFLOP/s double-precision peak speed!) brings supercomputer performance right into everyone's office while employing standard programming tools fully compatible with the desktop environment, including a full suite of numerical software and the entire GNU/Linux stack. With both architectural and software unity from multi-core Intel Architecture processors to many-core Intel Xeon Phi products, we have for the first time a holistic path for portable, high-performance computing that is based upon a familiar and proven threaded, scalar-vector programming model. And Intel's vision for Intel Many Integrated Core (Intel MIC) architecture takes us from the early petaFLOP era in 2012 into the exaFLOP era in 2020; indeed, the Intel Xeon Phi coprocessor based on Intel MIC architecture is credibly a glimpse into that future.

So what's the catch? There's actually no new news here: sequential (more specifically, single-threaded and non-vectorized) computation is dead, even on the desktop. Long dead. Pipelined functional units, multiple instruction issue, SIMD extensions, and multi-core architectures killed that years ago. But if you have one of the 99 percent of applications that are not yet both multi-threaded and vectorized, then on a multicore Intel Xeon with AVX SIMD units you could be missing a factor of up to 100x in performance, and the highly threaded Intel MIC architecture implies a factor of up to 1000x. Yes, you are reading those numbers correctly. A scalar, single-threaded application, depending on what is limiting its performance, could be leaving several orders of magnitude in single-socket performance on the table. All modern processors, whether CPU or GPGPU, require very large amounts of parallelism to attain high performance. Some good news on Intel Xeon Phi coprocessors is that, with your application running out of the box thanks to the standard programming environment (again in contrast to GPGPUs, which require recoding even to run), you can use familiar tools to analyze performance and have a robust path for incremental transformations to optimize the code, and those optimizations will carry directly over to mainstream processors. But how to optimize the code? Which algorithms, data structures, numerical representations, loop constructs, languages, compilers, and so on are a good match for Intel Xeon Phi products? And how to do all of this in a way that is not necessarily specific to the current Intel MIC architecture but instead positions you for future and even non-Intel architectures? This book generously and accessibly puts the answers to all of these questions and more into your hands.

A critical point is that early in the killer-micro revolution that replaced custom vector processors with commodity CPUs, application developers ceased to develop vectorized algorithms, because the first generations of these CPUs could indeed attain a good fraction of peak performance on operation-rich sequential code. Fast-forward two decades and this is now far from true, but the passage of time has erased from our collective memory much of the wisdom and folklore of the vectorized algorithms so successful on Cray and other vector computers. However, the success of that era should give us great confidence that a multi-threaded, scalar-vector programming model supported by a rich vector instruction set is a great match for a very broad range of algorithms and disciplines. Needless to say, there are new challenges, such as the deeper and more complex memory hierarchy of modern processors, an order of magnitude more threads, the lack of true hardware gather/scatter, and compilers still catching up with (rediscovering?) what was possible 25 years ago.

In October 2010, I gave a talk entitled "DSLs, Vectors, and Amnesia" at an excellent workshop on language tools in Houston organized by John Mellor-Crummey. "Amnesia" referred to the loss of vectorization capabilities mentioned above. I used the example of rationalizing the bizarrely large speedups claimed in the early days of using GPGPUs as a platform to identify successes and failures in mainstream programming. Inconsistent optimization of applications on the two platforms is the trivial explanation, with the GPGPU realizing a much higher fraction of its peak speed. But why was this so, and what was to be learned from this about writing code with portable performance? There were two reasons underlying the performance discrepancy. First, data-parallel programming languages such as OpenCL and NVIDIA's CUDA forced programmers to write massively data-parallel code that, with a good compiler and some tuning of vector lengths and data layout, perfectly matched the underlying hardware and realized high performance. Second, the comparison was usually made against scalar, non-threaded x86 code that we now understand to be far from optimal. The universal solution was to back-port the GPGPU code to the multi-core CPU, retuning for the different numbers of cores and vector/cache sizes; indeed, with care and some luck the same code base could serve on both platforms (and certainly so if programming with OpenCL). All of the optimizations for locality, bandwidth, concurrency, and vectorization carry straight over, and the cleaner, simplified, dependency-free code is more readily analyzed by compilers. Thus, there is every reason to expect that nearly all algorithms that work well on current GPGPUs can, with a minimal amount of restructuring, execute equally well on the Intel Xeon Phi coprocessor, and that algorithms requiring fine-grained concurrent control should be significantly easier to express on the coprocessor than on a GPGPU.

