eBook - ePub

Parallel and High Performance Computing

Name: Parallel and High Performance Computing
Author: Robert Robey, Yuliana Zamora

704 pages
English
ePUB (mobile friendly)

About This Book

Summary

Complex calculations, like training deep learning models or running large-scale simulations, can take an extremely long time. Efficient parallel programming can save hours, or even days, of computing time. Parallel and High Performance Computing shows you how to deliver faster run-times, greater scalability, and increased energy efficiency to your programs by mastering parallel techniques for multicore processor and GPU hardware.

About the technology

Write fast, powerful, energy efficient programs that scale to tackle huge volumes of data. Using parallel programming, your code spreads data processing tasks across multiple CPUs for radically better performance. With a little help, you can create software that maximizes both speed and efficiency.

About the book

Parallel and High Performance Computing offers techniques guaranteed to boost your code's effectiveness. You'll learn to evaluate hardware architectures and work with industry standard tools such as OpenMP and MPI. You'll master the data structures and algorithms best suited for high performance computing and learn techniques that save energy on handheld devices. You'll even run a massive tsunami simulation across a bank of GPUs.

What's inside

Planning a new parallel project
Understanding differences in CPU and GPU architecture
Addressing underperforming kernels and loops
Managing applications with batch scheduling

About the reader

For experienced programmers proficient with a high-performance computing language like C, C++, or Fortran.

About the author

Robert Robey works at Los Alamos National Laboratory and has been active in the field of parallel computing for over 30 years. Yuliana Zamora is currently a PhD student and Siebel Scholar at the University of Chicago, and has lectured on programming modern hardware at numerous national conferences.

Table of Contents

PART 1 INTRODUCTION TO PARALLEL COMPUTING
1 Why parallel computing?
2 Planning for parallelization
3 Performance limits and profiling
4 Data design and performance models
5 Parallel algorithms and patterns
PART 2 CPU: THE PARALLEL WORKHORSE
6 Vectorization: FLOPs for free
7 OpenMP that performs
8 MPI: The parallel backbone
PART 3 GPUS: BUILT TO ACCELERATE
9 GPU architectures and concepts
10 GPU programming model
11 Directive-based GPU programming
12 GPU languages: Getting down to basics
13 GPU profiling and tools
PART 4 HIGH PERFORMANCE COMPUTING ECOSYSTEMS
14 Affinity: Truce with the kernel
15 Batch schedulers: Bringing order to chaos
16 File operations for a parallel world
17 Tools and resources for better code

Information

Publisher: Manning
Year: 2021
ISBN: 9781638350385
Topic: Computer Science
Subtopic: Programming in C

Part 1 Introduction to parallel computing

The first part of this book covers topics of general importance to parallel computing. These topics include

Understanding the resources in a parallel computer
Estimating the performance and speedup of applications
Looking at software engineering needs particular to parallel computing
Considering choices for data structures
Selecting algorithms that perform and parallelize well

While these topics should be considered first by a parallel programmer, they will not have the same importance to all readers of this book. For the parallel application developer, all of the chapters in this part address upfront concerns for a successful project. A project needs to select the right hardware, the right type of parallelism, and the right kind of expectations. You should determine the appropriate data structures and algorithms before starting your parallelization efforts; it’s much harder to change these later.

Even if you are a parallel application developer, you may not need the full depth of material discussed. Those desiring only modest parallelism or serving a particular role on a team of developers might find a cursory understanding of the content sufficient. If you just want to explore parallel computing, we suggest reading chapter 1 and chapter 5, then skimming the others to get the terminology that is used in discussing parallel computing.

We include chapter 2 for those who may not have a software engineering background or for those who just need a refresher. If you are new to all of the details of CPU hardware, then you may need to read chapter 3 in small increments. An understanding of the current computing hardware and your application is important in extracting performance, but it doesn’t have to come all at once. Be sure to return to chapter 3 when you are ready to purchase your next computing system so you can cut through all the marketing claims to what is really important for your application.

The discussion of data design and performance modeling in chapter 4 can be challenging because fully appreciating it requires an understanding of hardware details, hardware performance, and compilers. Although it’s an important topic due to the impact the cache and compiler optimizations have on performance, it’s not necessary for writing a simple parallel program.

We encourage you to follow along with the accompanying examples for the book. You should spend some time exploring the many examples that are available in these software repositories at https://github.com/EssentialsOfParallelComputing.

The examples are organized by chapter and include detailed setup information for various hardware and operating systems. To help with portability issues, there are sample container builds for Ubuntu distributions in Docker. There are also instructions for setting up a virtual machine through VirtualBox. If you need to set up your own system, you may want to read the section on Docker and virtual machines in chapter 13. Be aware, though, that containers and virtual machines come with restricted environments that are not easy to work around.

We are still working to make the container builds and other system environment setups work properly across the many possible system configurations. Getting the system software installed correctly, especially the GPU driver and associated software, is the most challenging part of the journey. The wide variety of operating systems and hardware, including graphics processing units (GPUs), and the often overlooked quality of installation software make this a difficult task. One alternative is to use a cluster where the software is already installed. Still, it is helpful at some point to get some software installed on your laptop or desktop for a more convenient development resource. Now it is time to turn the page and enter the world of parallel computing. It is a world of nearly unlimited performance and potential.

1 Why parallel computing?

This chapter covers

What parallel computing is and why it’s growing in importance
Where parallelism exists in modern hardware
Why the amount of application parallelism is important
Software approaches to exploit parallelism

In today’s world, you’ll find many challenges requiring extensive and efficient use of computing resources. Traditionally, most applications requiring high performance have been in the scientific domain, but artificial intelligence (AI) and machine learning applications are projected to become the predominant users of large-scale computing. Some examples of these applications include

Modeling megafires to assist fire crews and to help the public
Modeling tsunamis and storm surges from hurricanes (see chapter 13 for a simple tsunami model)
Voice recognition for computer interfaces
Modeling virus spread and vaccine development
Modeling climatic conditions over decades and centuries
Image recognition for driverless car technology
Equipping emergency crews with running simulations of hazards such as flooding
Reducing power consumption for mobile devices

With the techniques covered in this book, you will be able to handle larger problems and datasets, while also running simulations ten, a hundred, or even a thousand times faster. Typical applications leave much of the compute capability of today’s computers untapped. Parallel computing is the key to unlocking the potential of your computer resources. So what is parallel computing and how can you use it to supercharge your applications?

Parallel computing is the execution of many operations at a single instant in time. Fully exploiting parallel computing does not happen automatically; it requires some effort from the programmer. First, you must identify and expose the potential for parallelism in an application. Potential parallelism, or concurrency, means that you certify that it is safe to conduct operations in any order as system resources become available. Parallel computing adds a further requirement: these operations must occur at the same time. For this to happen, you must also properly leverage the resources to execute these operations simultaneously.
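The idea that concurrent operations are safe to conduct in any order can be illustrated with a small sketch. The Python code below is not from the book; the function names `square` and `apply_in_order` and the sample data are assumptions made only for this illustration. It applies an independent operation in forward order, in a shuffled order, and through a pool of workers, and all three produce the same results.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def square(x):
    """An independent operation: the result depends only on its own input."""
    return x * x

def apply_in_order(values, order):
    """Apply square() to values, visiting the indices in the given order."""
    results = [None] * len(values)
    for i in order:
        results[i] = square(values[i])
    return results

# Hypothetical sample data for the illustration.
values = list(range(10))
forward = list(range(len(values)))
shuffled = forward[:]
random.Random(0).shuffle(shuffled)  # a fixed seed keeps the run repeatable

# Any visiting order gives the same results, so the operations are concurrent.
serial_result = apply_in_order(values, forward)
shuffled_result = apply_in_order(values, shuffled)

# Because order does not matter, the operations can be handed to a pool of workers.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel_result = list(pool.map(square, values))
```

The thread pool here shows the structure of the idea rather than a real speedup: in standard CPython, threads do not execute Python bytecode simultaneously, so truly parallel execution of compute-heavy work would need separate processes or a compiled language such as C, C++, or Fortran.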

Parallel computing introduces new concerns that are not present in a serial world. We need to change our thought processes to adapt to the additional complexities of parallel execution, but with practice, this becomes second nature. This book begins your discovery of how to access the power of parallel computing.

Life presents numerous examples of parallel processing, and these instances often become the basis for computing strategies. Figure 1.1 shows a supermarket checkout line, where the goal is to have customers quickly pay for the items they want to purchase. This can be done by employing multiple cashiers to process, or check out, the customers one at a time. In this case, the skilled cashiers can more quickly execute the checkout process so customers leave faster. Another strategy is to employ many self-checkout stations and allow customers to execute the process on their own. This strategy requires fewer human resources from the supermarket and can open more lanes to process customers. Customers may not be able to check themselves out as efficiently as a trained cashier, but perhaps more customers can check out quickly due to increased parallelism resulting in shorter lines.
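The trade-off between a few fast cashiers and many slower self-checkout stations can be made concrete with a back-of-the-envelope model. The Python sketch below is not from the book: the function name `time_to_clear` and every number in it are assumptions chosen only for illustration, and the model assumes identical customers spread evenly across the lanes.

```python
import math

def time_to_clear(customers, lanes, minutes_per_customer):
    """Minutes until the last customer leaves, assuming identical customers
    who are spread evenly across the available lanes."""
    rounds = math.ceil(customers / lanes)  # customers served per lane, worst case
    return rounds * minutes_per_customer

# Hypothetical numbers, chosen only for illustration:
# two trained cashiers who each need 2 minutes per customer ...
cashier_time = time_to_clear(customers=24, lanes=2, minutes_per_customer=2)

# ... versus eight self-checkout stations where each customer needs 4 minutes.
self_checkout_time = time_to_clear(customers=24, lanes=8, minutes_per_customer=4)
```

Under these assumed numbers, the cashiers clear the queue in 24 minutes and the self-checkout stations in 12 minutes, even though each self-checkout transaction is twice as slow. With different numbers the comparison can go the other way, which is why each approach has to be evaluated for its own costs and benefits.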

We solve computational problems by developing algorithms, where an algorithm is a set of steps to achieve a desired result. In the supermarket analogy, the process of checking out is the algorithm. In this case, it includes unloading items from a basket, scanning the items to obtain a price, and paying for the items. This algorithm is sequential (or serial); it must follow this order. If hundreds of customers need to execute this task, the algorithm for checking out many customers contains parallelism that can be exploited. Theoretically, there is no dependency between any two customers going through the checkout process. By using multiple checkout lines or self-checkout stations, supermarkets expose parallelism, thereby increasing the rate at which customers buy goods and leave the store. Each choice in how we implement this parallelism results in different costs and benefits.
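The checkout analogy maps onto code fairly directly: the steps for one customer stay serial, while separate customers can be handled by separate lanes. The following Python sketch is an illustration only and is not taken from the book; the function `check_out`, the basket contents, and the prices (in cents) are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def check_out(basket):
    """Serial algorithm for one customer: unload, scan, then pay, in that order."""
    unloaded = list(basket.items())               # step 1: unload the basket
    total = sum(price for _, price in unloaded)   # step 2: scan each item for its price
    return total                                  # step 3: the amount the customer pays

# Hypothetical baskets mapping item name to price in cents.
baskets = [
    {"milk": 150, "bread": 225},
    {"apples": 300},
    {"rice": 410, "beans": 190, "salt": 50},
]

# No customer depends on another, so each lane (worker) can take customers independently.
with ThreadPoolExecutor(max_workers=2) as lanes:
    receipts = list(lanes.map(check_out, baskets))
```

The parallelism sits between customers, not inside a single checkout. `ThreadPoolExecutor.map` returns results in the order of the input baskets, so `receipts` lines up with `baskets` no matter which lane finished first.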

Figure 1.1 Everyday parallelism in supermarket checkout queues. The checkout cashiers (with caps) process their queue of customers (with baskets). On the left, one cashier processes four self...