High Performance Parallelism Pearls Volume Two

Multicore and Many-core Programming Approaches

592 pages · English · ePub
About this book

High Performance Parallelism Pearls Volume Two offers another set of examples that demonstrate how to leverage parallelism. As in Volume One, the techniques included here explain how to target processors and coprocessors with the same source code, illustrating the most effective ways to combine Intel Xeon Phi coprocessors with Xeon and other multicore processors. The book includes examples of successful programming efforts drawn from industries and domains such as biomedicine, genetics, finance, manufacturing, and imaging. Each chapter in this edited work includes a detailed explanation of the programming techniques used and shows high-performance results on both Intel Xeon Phi coprocessors and multicore processors. Learn from dozens of new examples and case studies, "success stories" that demonstrate not just the features of Xeon-powered systems but also how to leverage parallelism across these heterogeneous systems.
  • Promotes write-once, run-anywhere coding, showing how to code for high performance on multicore processors and Xeon Phi
  • Examples from multiple vertical domains illustrating real-world use of Xeon Phi coprocessors
  • Source code available for download to facilitate further exploration


Chapter 1

Introduction

James Reinders; Jim Jeffers Intel Corporation, USA

Abstract

This chapter introduces this book, written by 73 experts sharing real-world examples and techniques that led to high-performance applications on multicore and many-core systems. All chapters reference actual code and the modifications made to that code to improve performance. The codes discussed are freely available for download (http://lotsofcores.com). All the figures and diagrams from the book are freely available as well, to help facilitate the teaching of parallel programming.
Keywords
512-bit SIMD
AVX-512
Coarse-grain
Embree
MPI
MPI shared memory
OSPRay
OpenCL
OpenMP
Python
SIMD
TBB
Xeon Phi
Heterogeneous
Hybrid parallelism
In-order
Latency optimizations
Many-core
Multicore
Nested parallelism
New era in programming
Offloading
Out-of-order
Power savings
Prefetching
pyMIC
Reserved core
Stream programming
Thread-safe
Vectorization
It has become well known that programming for the Intel Xeon Phi coprocessor heightens awareness of the need for scaling, vectorization, and increasing temporal locality of reference—exactly the keys to parallel programming. Once these keys to effective parallel programming are addressed, the result is a parallel program optimized for higher performance on both Intel® Xeon Phi™ coprocessors and multicore processors. That represents a highly compelling preservation of investment when the focus is on modifying code in a portable, performance-portable manner. Unsurprisingly, that means the chapters in this book use C, C++, and Fortran with standard parallel programming models including OpenMP, MPI, TBB, and OpenCL. We see that the optimizations improve applications both on processors, such as Intel® Xeon® processors, and on Intel Xeon Phi products.
We are not supposed to have a favorite chapter, especially since 73 amazing experts contributed to this second Pearls book. They share compelling lessons in effective parallel programming through both specific application optimizations and their illustration of key techniques…and we can learn from every one of them. However, we cannot avoid feeling a little like the characters on The Big Bang Theory (a popular television show) who get excited by the mere mention of Stephen Hawking. Now, to be very clear, Stephen Hawking did not work on this book. At least, not to our knowledge.

Applications and techniques

The programming topics that receive the most discussion in this book are OpenMP and vectorization, followed closely by MPI. However, many more topics receive serious attention, including nested parallelism, latency optimizations, prefetching, Python, OpenCL, offloading, stream programming, making code thread-safe, and power savings.
This book does not have distinct sections, but you will find that the first half of the book consists of chapters that dive deeply into optimizing a single application and dealing with the work that is needed to optimize that application for parallelism. The second half of the book switches to chapters that dive into a technique or approach, and illustrate it with a number of different examples.
In all the chapters, the examples were selected for their educational content, applicability, and success. You can download the codes and try them yourself! The examples demonstrate successful approaches to parallel programming that apply to both processors and coprocessors. Not all the examples scale well enough to make an Intel Xeon Phi coprocessor run faster than a processor. This is a reality we all face in programming, and it reinforces something we should never be bashful in pointing out: a common programming model matters a great deal. The programming does not force a choice about what will run better; it focuses on parallel programming and can target either multicore or many-core products. The techniques utilized almost always apply to both processors and coprocessors. Some chapters utilize nonportable techniques and explain why. The most common use of nonportable programming you will see in this book is focused targeting of 512-bit SIMD, a feature that arrived in Intel Xeon Phi coprocessors before appearing in processors. The strong benefits of common programming emerge over and over in real-life examples, including those in this book.

SIMD and vectorization

Many chapters make code changes in their applications to utilize the SIMD capabilities of processors and coprocessors, including Chapters 2–4 and 8. There are three additional vectorization-focused chapters tackling key techniques or tools that you may find indispensable. The concept of an SIMD function is covered in Chapter 22. SIMD functions allow a program written to operate on scalar (one at a time) data to be vectorized by the appropriate use of OpenMP SIMD directives. A tool to help analyze your vectorization opportunities and give advice is the subject of Chapter 23. An increasingly popular library approach to parallel vector programming, called OpenVec, is covered in Chapter 24.
We do have a really cool chapter that begins with "The best current explanation of how our universe began is with a period of rapid exponential expansion, termed inflation. This created the large, mostly empty, universe that we observe today. The principal piece of evidence for this comes from… the Cosmic Microwave Background (CMB), a microwave frequency background radiation, thought to have been left over from the big bang…"
Who would not be excited by that?
In an attempt to avoid accusations that we have a favorite chapter… we buried "Cosmic Microwave Background Analysis: Nested Parallelism in Practice" in the middle of the book, so it is as far as possible from the cover, which features the Cosmos supercomputer used by theoretical physicists at the University of Cambridge. That same cover shows an OSPRay-rendered visualization from the Modal program that they optimize in their chapter (and yes, they do work with Dr. Hawking – but we still are not saying he actually worked on the book!).

OpenMP and nested parallelism

Many chapters make code changes in their applications to harness task- or thread-level parallelism with OpenMP. Chapter 17 drives home the meaning and value of being more "coarse-grained" in order to scale well. The challenges of making legacy code thread-safe are discussed in some detail in Chapter 5, including discussions of choices that did not work.
Two chapters advocate nested parallelism in OpenMP and use it to get significant performance gains: Chapters 10 and 18. Exploiting multilevel parallelism deserves consideration even if it was rejected in the past. OpenMP nesting is turned off by default in most implementations and is generally considered unsafe by typical users.

Table of contents

  1. Cover image
  2. Title page
  3. Table of Contents
  4. Copyright
  5. Contributors
  6. Acknowledgments
  7. Foreword
  8. Preface
  9. Chapter 1: Introduction
  10. Chapter 2: Numerical Weather Prediction Optimization
  11. Chapter 3: WRF Goddard Microphysics Scheme Optimization
  12. Chapter 4: Pairwise DNA Sequence Alignment Optimization
  13. Chapter 5: Accelerated Structural Bioinformatics for Drug Discovery
  14. Chapter 6: Amber PME Molecular Dynamics Optimization
  15. Chapter 7: Low-Latency Solutions for Financial Services Applications
  16. Chapter 8: Parallel Numerical Methods in Finance
  17. Chapter 9: Wilson Dslash Kernel From Lattice QCD Optimization
  18. Chapter 10: Cosmic Microwave Background Analysis: Nested Parallelism in Practice
  19. Chapter 11: Visual Search Optimization
  20. Chapter 12: Radio Frequency Ray Tracing
  21. Chapter 13: Exploring Use of the Reserved Core
  22. Chapter 14: High Performance Python Offloading
  23. Chapter 15: Fast Matrix Computations on Heterogeneous Streams
  24. Chapter 16: MPI-3 Shared Memory Programming Introduction
  25. Chapter 17: Coarse-Grained OpenMP for Scalable Hybrid Parallelism
  26. Chapter 18: Exploiting Multilevel Parallelism in Quantum Simulations
  27. Chapter 19: OpenCL: There and Back Again
  28. Chapter 20: OpenMP Versus OpenCL: Difference in Performance?
  29. Chapter 21: Prefetch Tuning Optimizations
  30. Chapter 22: SIMD Functions Via OpenMP
  31. Chapter 23: Vectorization Advice
  32. Chapter 24: Portable Explicit Vectorization Intrinsics
  33. Chapter 25: Power Analysis for Applications and Data Centers
  34. Author Index
  35. Subject Index
