Heterogeneous Computing with OpenCL
eBook - ePub

Heterogeneous Computing with OpenCL

Revised OpenCL 1.2 Edition

Benedict Gaster, Lee Howes, David R. Kaeli, Perhaad Mistry, Dana Schaa

  1. 308 pages
  2. English
  3. ePUB (mobile-friendly)
  4. Available on iOS and Android

About this book

Heterogeneous Computing with OpenCL, Second Edition teaches OpenCL and parallel programming for complex systems that may include a variety of device architectures: multi-core CPUs, GPUs, and fully-integrated Accelerated Processing Units (APUs) such as AMD Fusion technology. It is the first textbook that presents OpenCL programming appropriate for the classroom and is intended to support a parallel programming course. Students will come away from this text with hands-on experience and significant knowledge of the syntax and use of OpenCL to address a range of fundamental parallel algorithms.

Designed to work on multiple platforms and with wide industry support, OpenCL will help you more effectively program for a heterogeneous future. Written by leaders in the parallel computing and OpenCL communities, Heterogeneous Computing with OpenCL explores memory spaces, optimization techniques, graphics interoperability, extensions, and debugging and profiling. It includes detailed examples throughout, plus additional online exercises and other supporting materials that can be downloaded at http://www.heterogeneouscompute.org/?page_id=7

This book will appeal to software engineers, programmers, hardware engineers, and students/advanced students.

  • Explains principles and strategies to learn parallel programming with OpenCL, from understanding the four abstraction models to thoroughly testing and debugging complete applications.
  • Covers image processing, web plugins, particle simulations, video editing, performance optimization, and more.
  • Shows how OpenCL maps to an example target architecture and explains some of the tradeoffs associated with mapping to various architectures.
  • Addresses a range of fundamental programming techniques, with multiple examples and case studies that demonstrate OpenCL extensions for a variety of hardware platforms.

Information

Publisher
Morgan Kaufmann
Year
2012
ISBN
9780124055209
Edition
2

Foreword to the Revised OpenCL 1.2 Edition

I need your help. I need you to read this book and start using OpenCL. Let me explain.
The fundamental building blocks of computing have changed over the past 10 years. We have moved from the single-core processors many of us started with long ago to shared memory multicore processors, to highly scalable “many core” processors, and finally to heterogeneous platforms (e.g., a combination of a CPU and a GPU). If you have picked up this book and are thinking of reading it, you are most likely well aware of this fact. I assume you are also aware that software needs to change to keep up with the evolution of hardware.
And this is where I need your help. I’ve been working in parallel computing since 1985 and have used just about every class of parallel computer. I’ve used more parallel programming environments than I could name and have helped create more than a few. So I know how this game works. Hardware changes and programmers like us are forced to respond. Our old code breaks and we have to reengineer our software. It’s painful, but it’s a fact of life.
Money makes the world go around so hardware vendors fight for competitive advantage. This drives innovation and, over the long run, is a good thing. To build attention for “their” platforms, however, these vendors “help” the poor programmers by creating new programming models tied to their hardware. And this breeds confusion. Well-meaning but misguided people use, or are forced to use, these new programming models and the software landscape fragments. With different programming models for each platform, the joy of creating new software is replaced with tedious hours reworking our software for each and every new platform that comes along.
At certain points in the history of parallel computing, as the software landscape continues to fragment, a subset of people come together and fight back. This requires a rare combination of a powerful customer that controls a lot of money, a collection of vendors eager to please that customer, and big ideas to solve the programming challenges presented by a new class of hardware. This rare set of circumstances can take years to emerge, so when it happens, you need to jump on the opportunity. It happened for clusters and massively parallel supercomputers with MPI (1994). It happened for shared memory computers with OpenMP (1997). And more recently, this magical combination of factors has come together for heterogeneous computing to give us OpenCL.
I can’t stress enough how important this development is. If OpenCL fails to dominate the heterogeneous computing niche, it could be many years before the right set of circumstances comes together again. If we let this opportunity slip away and we fall back on our old, proprietary programming model ways, we could be sentencing our software developers to years of drudgery.
So I need your help. I need you to join the OpenCL revolution. I need you to insist on portable software frameworks for heterogeneous platforms. When possible, avoid programming models tied to a single hardware vendor’s products. Open standards help everyone. They enable more than a product line. They enable an industry, and if you are in the software business, that is a very good thing.
OpenCL, however, is an unusually complex parallel programming standard. It has to be. I am aware of no other parallel programming model that addresses such a wide array of systems: GPUs, CPUs, FPGAs, embedded processors, and combinations of these systems. OpenCL is also complicated by the goals of its creators. You see, in creating OpenCL, we decided the best way to impact the industry would be to create a programming model for the performance-oriented programmer wanting full access to the details of the system. Our reasoning was that, over time, high-level models would be created to map onto OpenCL. By creating a common low-level target for these higher level models, we’d enable a rich marketplace of ideas and programmers would win. OpenCL, therefore, doesn’t give you many abstractions to make your programming job easier. You have to do all that work yourself.
OpenCL can be challenging, which is where this book comes in. You can learn OpenCL by downloading the specification and writing code. That is a difficult way to go. It is much better to have trailblazers who have gone before you establish the context and then walk you through the key features of the standard. Programmers learn by example, and this book uses that fact by providing a progression of examples from trivial (vector addition) to complex (image analysis). This book will help you establish a firm foundation that you can build on as you master this exciting new programming model.
Read this book. Write OpenCL code. Join the revolution. Help us make the world safe for heterogeneous computing. Please … I need your help. We all do.
Tim Mattson
Principal Engineer, Intel Corp.

Foreword to the First Edition

For more than two decades, the computer industry has been inspired and motivated by the observation made by Gordon Moore (a.k.a. “Moore’s law”) that the density of transistors on a die was doubling every 18 months. This observation created the anticipation that the performance a certain application achieves on one generation of processors will be doubled within two years, when the next generation of processors is announced. Constant improvement in manufacturing and processor technologies was the main driver of this trend, since it allowed each new processor generation to shrink all of the transistor’s dimensions by the “golden factor” of 0.3 (the ideal shrink, leaving each dimension at about 0.7 of its previous size) and to reduce the power supply accordingly. Thus, each new processor generation could double the density of transistors and gain a 50% speed improvement (frequency) while consuming the same power and keeping the same power density. When better performance was required, computer architects focused on using the extra transistors to push the frequency beyond what the shrink provided and to add new architectural features that mainly aimed at improving performance for existing and new applications.
During the mid-2000s, the transistor size became so small that the “physics of small devices” started to govern the characterization of the entire chip. Thus, frequency improvement and density increase could no longer be achieved without a significant increase in power consumption and power density. A recent report by the International Technology Roadmap for Semiconductors (ITRS) supports this observation and indicates that this trend will continue for the foreseeable future and will most likely become the most significant factor affecting technology scaling and the future of computer-based systems.
To cope with the expectation of doubling performance every known period of time (no longer 2 years), two major changes happened. (1) Instead of increasing the frequency, modern processors increase the number of cores on each die. This trend forces the software to change as well: since we can no longer expect the hardware to achieve significantly better performance for a given application, we need to develop new implementations of the same application that take advantage of the multicore architecture. (2) Thermal and power considerations have become first-class citizens in the design of any future architecture. These trends encourage the community to start looking at heterogeneous solutions: systems that are assembled from different subsystems, each of them optimized to achieve different optimization points or to address different workloads. For example, many systems combine a “traditional” CPU architecture with special-purpose FPGAs or graphics processors (GPUs). Such integration can be done at different levels, e.g., at the system level, at the board level, and recently at the core level.
Developing software for homogeneous parallel and distributed systems is considered a nontrivial task, even though such development uses well-known paradigms and well-established programming languages, development methods, algorithms, debugging tools, etc. Developing software to support general-purpose heterogeneous systems is relatively new, and so it is less mature and much more difficult. As heterogeneous systems are becoming unavoidable, many of the major software and hardware manufacturers have started creating software environments to support them. AMD proposed the use of the Brook language, developed at Stanford University, to handle streaming computations, later extending the software environment to include Close to Metal (CTM) and the Compute Abstraction Layer (CAL) for accessing their low-level streaming hardware primitives in order to take advantage of their highly threaded parallel architecture. NVIDIA took a similar approach, co-designing their recent generations of GPUs and the CUDA programming environment to take advantage of the highly threaded GPU environment. Intel proposed to extend the use of multi-core programming to program their Larrabee architecture. IBM proposed the use of message-passing-based software in order to take advantage of its heterogeneous, non-coherent Cell architecture, and FPGA-based solutions integrate libraries written in VHDL with C or C++ based programs to achieve the best of two environments. Each of these programming environments offers scope for benefiting domain-specific applications, but they all failed to address the requirement for general-purpose software that can serve different hardware architectures in the way that, for example, Java code can run on very different ISA architectures.
The Open Computing Language (OpenCL) was designed to meet this important need. It was defined and managed by the nonprofit technology consortium Khronos. The language and its development environment “borrow” many of their basic concepts from very successful, hardware-specific environments such as CUDA, CAL, and CTM, and blend them to create a hardware-independent software development environment. It supports different levels of parallelism and efficiently maps to homogeneous or heterogeneous, single- or multiple-device systems consisting of CPUs, GPUs, FPGAs, and potentially other future devices. In order to support future devices, OpenCL defines a set of mechanisms that, if met, allow a device to be seamlessly included as part of the OpenCL environment. OpenCL also defines run-time support that makes it possible to manage resources and to combine different types of hardware under the same execution environment; hopefully, in the future, it will also make it possible to dynamically balance computations, power, and other resources such as the memory hierarchy in a more general manner.
This book is a textbook that aims to teach students how to program heterogeneous environments. The book starts with a very important discussion on how to program parallel systems and defines the concepts the students need to understand before starting to program any heterogeneous system. It also provides a taxonomy that can be used for understanding the different models used for parallel and distributed systems. Chapters 2–4 build the students’ understanding, step by step, of the basic structures of OpenCL (Chapter 2), including the host and the device architecture (Chapter 3). Chapter 4 puts these concepts together using a nontrivial example.
Chapters 5 and 6 extend the concepts we learned so far with a better understanding of the notions of concurrency and run-time execution in OpenCL (Chapter 5) and the dissection between the CPU and the GPU (Chapter 6). After building the basics, the book dedicates four chapters (7–10) to more sophisticated examples. These sections are vital for students to understand that OpenCL can be used for a wide range of applications that are beyond any domain-specific mode of operation. The book also demonstrates how the same program can be run on different platforms, such as NVIDIA or AMD. The book ends with three chapters dedicated to advanced topics.
There is no doubt that this is a very important book that provides students and researchers with a better understanding of the world of heterogeneous computers in general and the solutions provided by OpenCL in particular. The book is well written and fits students’ different experience levels, so it can be used either as a textbook in a course on OpenCL, or different parts of the book can be used to extend other courses; e.g., the first two chapters are well suited to a course on parallel programming, and some of the examples can be used as part of advanced courses.
Dr. Avi Mendelson
Microsoft R&D Israel, Adjunct Professor, Technion

Preface

Our Heterogeneous World

Our world is heterogeneous in nature. This kind of diversity provides a richness and detail that is difficult to describe. At the same time, it provides a level of complexity and interaction in which a wide range of different entities are optimized for specific tasks and environments.
In computing, heterogeneous computer systems also add richness by allowing the programmer to select the best architecture to execute the task at hand or to choose the right task to make optimal use of a given architecture. These two views of the flexibility of a heterogeneous system both become apparent when solving a computational problem involves a variety of different tasks. Recently, there has been an upsurge in the computer design community experimenting with building heterogeneous systems. We are seeing new systems on the market that combine a number of different classes of architectures. What has slowed this progression has been the lack of a standardized programming environment that can manage the diverse set of resources in a common framework.

OpenCL

OpenCL has been developed specifically to ease the programming burden when writing applications for heterogeneous systems. OpenCL also addresses the current trend to increase the number of cores on a given architecture. The OpenCL framework supports execution on multi-core central processing units, digital signal processors, field programmable gate arrays, graphics processing units, and heterogeneous accelerated processing units. The architectures already supported cover a wide range of approaches to extracting parallelism and efficiency from memory systems and instruction streams. Such diversity in architectures allows the designer to provide an optimized solution to his or her problem—a solution that, if designed within the OpenCL specification, can scale with the growth and breadth of available architectures. OpenCL’s standard abstractions and interfaces allow the programmer to seamlessly “stitch” together an application within which execution can occur on a rich set of heterogeneous devices from one or many manufacturers.

This Text

Until now, there has not been a single definitive text that can help programmers and software engineers leverage the power and flexibility of the OpenCL programming standard. This is our attempt to address this void. With this goal in mind, we have not attempted to create a syntax guide—there are numerous good sources in which programmers can find a complete and up-to-date description of OpenCL syntax. Instead, this text is an attempt to show a developer or student how to leverage the OpenCL framework to build interesting and useful applications. We provide a number of examples of real applications to demonstrate the power of this programming standard.
Our hope is that the reader will embrace this new programming framework and explore the full benefits of heterogeneous computing that it provides. We welcome comments on how to improve upon this text, and we hope that this text will help you build your next heterogeneous application.

Acknowledgments

We thank Manju Hegde for proposing the book project, Jay Owen for connecting the participants on this project with each other, and finally Todd Green from Morgan Kaufmann for his project management, input, and deadline pressure.
On the technical side, we thank Jay Cornwall for his thorough work editing an early version of this text, and we thank Takahiro Harada, Justin Hensley, Budirijanto Purnomo, Frank Swehosky and Dongping Zhang for their significant contributions to individual chapters, particularly the sequence of case studies that could not have been produced without their help.

About the Authors

Benedict R. Gaster is a software architect working on programming models for next-generation heterogeneous processors, particularl...
