Heterogeneous Computing with OpenCL 2.0
Third Edition
David Kaeli
Perhaad Mistry
Dana Schaa
Dong Ping Zhang
Copyright
Acquiring Editor: Todd Green
Editorial Project Manager: Charlie Kent
Project Manager: Priya Kumaraguruparan
Cover Designer: Matthew Limbert
Morgan Kaufmann is an imprint of Elsevier
225 Wyman Street, Waltham, MA 02451, USA
Copyright 2015, 2013, 2012 Advanced Micro Devices, Inc. Published by Elsevier Inc. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher's permissions policies, and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency can be found at our website: www.elsevier.com/permissions.
This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).
Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.
Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.
ISBN: 978-0-12-801414-1
British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library
Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress
For information on all MK publications visit our website at www.mkp.com
List of Figures
Fig. 1.2 Multiplying elements in arrays A and B, and storing the result in an array C.
Fig. 1.3 Task parallelism present in fast Fourier transform (FFT) application. Different input images are processed independently in the three independent tasks.
Fig. 1.4 Task-level parallelism, where multiple words can be compared concurrently. Also shown is finer-grained character-by-character parallelism present when characters within the words are compared with the search string.
Fig. 1.5 After all string comparisons in
Fig. 1.6 The relationship between parallel and concurrent programs. Parallel and concurrent programs are subsets of all programs.
Fig. 2.1 Out-of-order execution of an instruction stream of simple assembly-like instructions. Note that in this syntax, the destination register is listed first. For example, add a,b,c is a = b+c .
Fig. 2.2 VLIW execution based on the out-of-order diagram in
Fig. 2.3 SIMD execution where a single instruction is scheduled in order, but executes over multiple ALUs at the same time.
Fig. 2.4 The out-of-order schedule seen in
Fig. 2.5 Two threads scheduled in a time-slice fashion.
Fig. 2.6 Taking temporal multithreading to an extreme as is done in throughput computing: a large number of threads interleave execution to keep the device busy, whereas each individual thread takes longer to execute than the theoretical minimum.
Fig. 2.7 The AMD Puma (left) and Steamroller (right) high-level designs (not shown to any shared scale). Puma is a low-power design that follows a traditional approach to mapping functional units to cores. Steamroller combines two cores within a module, sharing its floating-point (FP) units.
Fig. 2.8 The AMD Radeon HD 6970 GPU architecture. The device is divided into two halves, where instruction control (scheduling and dispatch) is performed by the wave scheduler for each half. The 24 16-lane SIMD cores execute four-way VLIW instructions on each SIMD lane and contain private level 1 (L1) caches and local data shares (scratchpad memory).
Fig. 2.9 The Niagara 2 CPU from Sun/Oracle. The design intends to make a high level of threading efficient. Note its relative similarity to the GPU design seen in
Fig. 2.10 The AMD Radeon R9 290X architecture. The device has 44 cores in 11 clusters. Each core consists of a scalar execution unit that handles branches and basic integer operations, and four 16-lane SIMD ALUs. The clusters share instruction and scalar caches.
Fig. 2.11 The NVIDIA GeForce GTX 780 architecture. The device has 12 large cores that NVIDIA refers to as streaming multiprocessors (SMX). Each SMX has 12 SIMD units (with specialized double-precision and special function units), a single L1 cache, and a read-only data cache.
Fig. 2.12 The A10-7850K APU consists of two Steamroller-based CPU cores and eight Radeon R9 GPU cores (32 16-lane SIMD units in total). The APU includes a fast bus from the GPU to DDR3 memory, and a shared path that is optionally coherent with CPU caches.
Fig. 2.13 An Intel i7 processor with HD Graphics 4000 graphics. Although not termed APU by Intel, the concept is the same as for the devices in that category from AMD. Intel combines four Haswell x86 cores with its graphics processors, connected to a shared last-level cache (LLC) via a ring bus.
Fig. 3.1 An OpenCL platform with multiple compute devices. Each compute device contains one or more compute units. A compute unit is composed of one or more processing elements (PEs). A system could have multiple platforms present at the same time; for example, it could have both an AMD platform and an Intel platform.
Fig. 3.2 Some of the output from the CLInfo program showing the characteristics of an OpenCL platform and its devices. We see that the AMD platform has two devices (a CPU and a GPU). The output shown here can be queried using functions from the platform API.
Fig. 3.3 Vector addition algorithm showing how each element can be added independently.
Fig. 3.4 The hierarchical model used for creating an NDRange of work-items, grouped into work-groups.
Fig. 3.5 The OpenCL runtime shown denotes an OpenCL context with two compute devices (a CPU device and a GPU device). Each compute device has its own command-queues. Host-side and device-side command-queues are shown. The device-side queues are visible only from kernels executing on the compute device. The memory objects have been defined within the memory model.
Fig. 3.6 Memory regions and their scope in the OpenCL memory model.
Fig. 3.7 Mapping the OpenCL memory model to an AMD Radeon HD 7970 GPU.
Fig. 4.1 A histogram generated from an 8-bit image. Each of the 256 bins corresponds to the frequency of the corresponding pixel value.