Parallel and High Performance Computing
eBook - ePub

Robert Robey, Yuliana Zamora

  1. 704 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS and Android

Book information

Summary
Complex calculations, like training deep learning models or running large-scale simulations, can take an extremely long time. Efficient parallel programming can save hours, or even days, of computing time. Parallel and High Performance Computing shows you how to deliver faster run-times, greater scalability, and increased energy efficiency to your programs by mastering parallel techniques for multicore processor and GPU hardware.

About the technology
Write fast, powerful, energy efficient programs that scale to tackle huge volumes of data. Using parallel programming, your code spreads data processing tasks across multiple CPUs for radically better performance. With a little help, you can create software that maximizes both speed and efficiency.

About the book
Parallel and High Performance Computing offers techniques guaranteed to boost your code's effectiveness. You'll learn to evaluate hardware architectures and work with industry standard tools such as OpenMP and MPI. You'll master the data structures and algorithms best suited for high performance computing and learn techniques that save energy on handheld devices. You'll even run a massive tsunami simulation across a bank of GPUs.

What's inside
  • Planning a new parallel project
  • Understanding differences in CPU and GPU architecture
  • Addressing underperforming kernels and loops
  • Managing applications with batch scheduling

About the reader
For experienced programmers proficient with a high-performance computing language like C, C++, or Fortran.

About the author
Robert Robey works at Los Alamos National Laboratory and has been active in the field of parallel computing for over 30 years. Yuliana Zamora is currently a PhD student and Siebel Scholar at the University of Chicago, and has lectured on programming modern hardware at numerous national conferences.

Table of Contents
1 Why parallel computing?
2 Planning for parallelization
3 Performance limits and profiling
4 Data design and performance models
5 Parallel algorithms and patterns
6 Vectorization: FLOPs for free
7 OpenMP that performs
8 MPI: The parallel backbone
9 GPU architectures and concepts
10 GPU programming model
11 Directive-based GPU programming
12 GPU languages: Getting down to basics
13 GPU profiling and tools
14 Affinity: Truce with the kernel
15 Batch schedulers: Bringing order to chaos
16 File operations for a parallel world
17 Tools and resources for better code


Part 1 Introduction to parallel computing

The first part of this book covers topics of general importance to parallel computing. These topics include
  • Understanding the resources in a parallel computer
  • Estimating the performance and speedup of applications
  • Looking at software engineering needs particular to parallel computing
  • Considering choices for data structures
  • Selecting algorithms that perform and parallelize well
While a parallel programmer should consider these topics first, they will not have the same importance to all readers of this book. For the parallel application developer, all of the chapters in this part address upfront concerns for a successful project. A project needs to select the right hardware, the right type of parallelism, and the right kind of expectations. You should determine the appropriate data structures and algorithms before starting your parallelization efforts; it’s much harder to change these later.
Even if you are a parallel application developer, you may not need the full depth of material discussed. Those desiring only modest parallelism or serving a particular role on a team of developers might find a cursory understanding of the content sufficient. If you just want to explore parallel computing, we suggest reading chapter 1 and chapter 5, then skimming the others to get the terminology that is used in discussing parallel computing.
We include chapter 2 for those who may not have a software engineering background or for those who just need a refresher. If you are new to all of the details of CPU hardware, then you may need to read chapter 3 in small increments. An understanding of the current computing hardware and your application is important in extracting performance, but it doesn’t have to come all at once. Be sure to return to chapter 3 when you are ready to purchase your next computing system so you can cut through all the marketing claims to what is really important for your application.
The discussion of data design and performance modeling in chapter 4 can be challenging because it requires an understanding of hardware details, their performance, and compilers to fully appreciate. Although it’s an important topic due to the impact the cache and compiler optimizations have on performance, it’s not necessary for writing a simple parallel program.
We encourage you to follow along with the accompanying examples for the book. You should spend some time exploring the many examples that are available in the book's software repositories.
The examples are organized by chapter and include detailed information for setup on various hardware and operating systems. To help with portability issues, there are sample container builds for Ubuntu distributions in Docker. There are also instructions for setting up a virtual machine through VirtualBox. If you need to set up your own system, you may want to read the section on Docker and virtual machines in chapter 13. But containers and virtual machines come with restricted environments that are not easy to work around.
We are still working to make the container builds and other system environment setups work properly across the many possible system configurations. Getting the system software installed correctly, especially the GPU driver and associated software, is the most challenging part of the journey. The wide variety of operating systems, hardware including graphics processing units (GPUs), and the often overlooked quality of installation software make this a difficult task. One alternative is to use a cluster where the software is already installed. Still, it is helpful at some point to get some software installed on your laptop or desktop for a more convenient development resource. Now it is time to turn the page and enter the world of parallel computing. It is a world of nearly unlimited performance and potential.

1 Why parallel computing?

This chapter covers
  • What parallel computing is and why it’s growing in importance
  • Where parallelism exists in modern hardware
  • Why the amount of application parallelism is important
  • Software approaches to exploit parallelism
In today’s world, you’ll find many challenges requiring extensive and efficient use of computing resources. Traditionally, most applications requiring high performance have been in the scientific domain. But artificial intelligence (AI) and machine learning applications are projected to become the predominant users of large-scale computing. Some examples of these applications include
  • Modeling megafires to assist fire crews and to help the public
  • Modeling tsunamis and storm surges from hurricanes (see chapter 13 for a simple tsunami model)
  • Voice recognition for computer interfaces
  • Modeling virus spread and vaccine development
  • Modeling climatic conditions over decades and centuries
  • Image recognition for driverless car technology
  • Equipping emergency crews with running simulations of hazards such as flooding
  • Reducing power consumption for mobile devices
With the techniques covered in this book, you will be able to handle larger problems and datasets, while also running simulations ten, a hundred, or even a thousand times faster. Typical applications leave much of the compute capability of today’s computers untapped. Parallel computing is the key to unlocking the potential of your computer resources. So what is parallel computing and how can you use it to supercharge your applications?
Parallel computing is the execution of many operations at a single instant in time. Fully exploiting parallel computing does not happen automatically. It requires some effort from the programmer. First, you must identify and expose the potential for parallelism in an application. Potential parallelism, or concurrency, means that you certify that it is safe to conduct operations in any order as the system resources become available. And, with parallel computing, there is an additional requirement: these operations must occur at the same time. For this to happen, you must also properly leverage the resources to execute these operations simultaneously.
Parallel computing introduces new concerns that are not present in a serial world. We need to change our thought processes to adapt to the additional complexities of parallel execution, but with practice, this becomes second nature. This book begins your discovery of how to access the power of parallel computing.
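As a minimal sketch of what "safe to conduct operations in any order" means (this example is not from the book), the Python snippet below treats each call to a hypothetical `square` function as an independent operation. Because no operation depends on another, running them in a shuffled order or handing them to a pool of workers gives the same results as the serial loop. All names here are illustrative assumptions. Note that threads in CPython mainly demonstrate order independence; truly simultaneous execution of compute-bound work would need processes or a compiled language.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def square(x):
    """Hypothetical independent operation: it depends only on its own input."""
    return x * x


def run_serial(values):
    """Execute the operations one after another, in input order."""
    return [square(v) for v in values]


def run_shuffled(values, seed=0):
    """Execute the operations in a random order, storing each result by index."""
    order = list(range(len(values)))
    random.Random(seed).shuffle(order)
    results = [None] * len(values)
    for i in order:
        results[i] = square(values[i])
    return results


def run_pooled(values, workers=4):
    """Hand the operations to a pool of workers; map() returns results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(square, values))


values = list(range(10))
serial = run_serial(values)
shuffled = run_shuffled(values)
pooled = run_pooled(values)

# All three execution strategies agree because the operations are independent.
print(serial == shuffled == pooled)
```

If `square` instead read or updated shared state, such as a running total, the three strategies could disagree, and the programmer could no longer certify that any order is safe.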
Life presents numerous examples of parallel processing, and these instances often become the basis for computing strategies. Figure 1.1 shows a supermarket checkout line, where the goal is to have customers quickly pay for the items they want to purchase. This can be done by employing multiple cashiers to process, or check out, the customers one at a time. In this case, the skilled cashiers can more quickly execute the checkout process so customers leave faster. Another strategy is to employ many self-checkout stations and allow customers to execute the process on their own. This strategy requires fewer human resources from the supermarket and can open more lanes to process customers. Customers may not be able to check themselves out as efficiently as a trained cashier, but perhaps more customers can check out quickly due to increased parallelism resulting in shorter lines.
We solve computational problems by developing algorithms: a set of steps to achieve a desired result. In the supermarket analogy, the process of checking out is the algorithm. In this case, it includes unloading items from a basket, scanning the items to obtain a price, and paying for the items. This algorithm is sequential (or serial); it must follow this order. If there are hundreds of customers who need to execute this task, the algorithm for checking out many customers contains parallelism that can be taken advantage of. Theoretically, there is no dependency between any two customers going through the checkout process. By using multiple checkout lines or self-checkout stations, supermarkets expose parallelism, thereby increasing the rate at which customers buy goods and leave the store. Each choice in how we implement this parallelism results in different costs and benefits.
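The trade-off in the supermarket analogy can be put into rough numbers. The sketch below is an illustrative model, not something taken from the book: each customer's checkout is a serial sequence of steps, customers are independent, and each customer joins whichever lane frees up first. The basket sizes, scan rates, and payment time are made-up assumptions chosen only to show the shape of the comparison.

```python
import heapq


def checkout_time(items, seconds_per_item, payment_seconds):
    """Serial algorithm for one customer: unload and scan each item, then pay."""
    return items * seconds_per_item + payment_seconds


def time_to_clear(baskets, lanes, seconds_per_item, payment_seconds):
    """Time until the last customer leaves, with each customer taking the first free lane."""
    free_at = [0.0] * lanes  # time at which each lane next becomes free
    heapq.heapify(free_at)
    finish = 0.0
    for items in baskets:
        start = heapq.heappop(free_at)  # earliest available lane
        end = start + checkout_time(items, seconds_per_item, payment_seconds)
        finish = max(finish, end)
        heapq.heappush(free_at, end)
    return finish


# Hypothetical workload: 12 customers with 10 items each.
baskets = [10] * 12

# Hypothetical rates: a trained cashier scans twice as fast as a customer does.
cashiers = time_to_clear(baskets, lanes=2, seconds_per_item=3, payment_seconds=30)
self_checkout = time_to_clear(baskets, lanes=6, seconds_per_item=6, payment_seconds=30)

print(cashiers, self_checkout)
```

With these assumed numbers, two fast cashiers clear the queue in 360 seconds, while six slower self-checkout stations clear it in 180 seconds. Each individual checkout is slower at a self-checkout station, yet the greater parallelism shortens the total time. Different assumptions can reverse the outcome, which is the point about costs and benefits.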

Figure 1.1 Everyday parallelism in supermarket checkout queues. The checkout cashiers (with caps) process their queue of customers (with baskets). On the left, one cashier processes four self...


Citation for Parallel and High Performance Computing

Robey, R., & Zamora, Y. (2021). Parallel and High Performance Computing. Manning Publications.
