Fundamentals of Accelerated Computing with OpenACC
6 Labs 8h 35m 115 Credits
Learn the basics of OpenACC, a high-level, directive-based programming model for GPUs. This course is for anyone with some C/C++ experience who is interested in accelerating their applications beyond the limits of CPU-only programming. In this course, you’ll learn:
• Four simple steps to accelerating your existing application with OpenACC
• How to profile and optimize your OpenACC codebase
• How to program multi-GPU systems by combining OpenACC with MPI
Upon completion, you’ll be able to build and optimize accelerated heterogeneous applications on multi-GPU clusters using a combination of OpenACC, CUDA-aware MPI, and NVIDIA profiling tools.
Learn how to accelerate your C/C++ or Fortran application using OpenACC to harness the massively parallel power of NVIDIA GPUs. OpenACC is a directive-based approach to computing in which you provide compiler hints to accelerate your code instead of writing the accelerator code yourself. In 90 minutes, you will experience a four-step process for accelerating applications using OpenACC:
• Characterize and profile your application
• Add compute directives
• Add directives to optimize data movement
• Optimize your application using kernel scheduling
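As a rough illustration of the compute-directive and data-movement steps above, here is a minimal sketch in C. The `saxpy` function is a hypothetical example, not code from the lab. A compiler without OpenACC support ignores the pragmas and runs the loop serially on the CPU.

```c
#include <assert.h>

/* Hypothetical kernel for illustration only: y = a * x + y. */
void saxpy(int n, float a, const float *restrict x, float *restrict y)
{
    assert(n >= 0);

    /* Data-movement step: x is only read on the device (copyin),
       y is copied to the device and back to the host (copy). */
    #pragma acc data copyin(x[0:n]) copy(y[0:n])
    {
        /* Compute step: ask the compiler to parallelize this loop. */
        #pragma acc parallel loop
        for (int i = 0; i < n; ++i)
            y[i] = a * x[i] + y[i];
    }
}
```

With a single loop the explicit data region changes little. It becomes useful when several loops share the same arrays, because the arrays then stay on the GPU between loops.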
If you have not already registered, please sign up for the OpenACC Lab series. It is highly recommended that you have a basic understanding of programming with OpenACC. If you do not, try the OpenACC - 2X in 4 Steps lab first! In this lab, participants will gain experience with the first two steps of the OpenACC programming cycle: Identify and Express Parallelism. Participants will profile a provided C or Fortran application using NVIDIA's nvprof profiler and use the PGI OpenACC compiler to accelerate the code. This lab is intended to be taken after lecture 2 of the OpenACC course provided by NVIDIA.
If you have not already registered, please sign up for the OpenACC Lab series. It is highly recommended that you have a basic understanding of programming with OpenACC. If you do not, try the OpenACC - 2X in 4 Steps lab first! This lab continues the work completed in the lab "Profiling and Parallelizing" by adding OpenACC data management directives and then optimizing the code using the OpenACC loop directive. Participants will use the PGI compiler and NVIDIA Visual Profiler to optimize the code. This lab is intended to be taken after completing the previous lab and after watching lecture 3 of the free OpenACC course provided by NVIDIA.
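To show what data management directives and loop-directive tuning can look like together, here is a minimal sketch using a Jacobi relaxation. The `jacobi` function, its parameters, and the choice of `collapse(2)` are assumptions made for this illustration and are not the lab's actual code. Without an OpenACC compiler the pragmas are ignored and the code runs serially.

```c
#include <assert.h>

/* Hypothetical example: Jacobi relaxation on an n x m row-major grid.
   Returns the number of sweeps performed. */
int jacobi(int n, int m, float *restrict a, float *restrict anew,
           int max_iters, float tol)
{
    assert(n >= 3 && m >= 3);
    float err = tol + 1.0f;
    int iter = 0;

    /* Data directive: keep both grids on the device for the whole
       iteration instead of copying them around every sweep. */
    #pragma acc data copy(a[0:n*m]) create(anew[0:n*m])
    while (err > tol && iter < max_iters) {
        err = 0.0f;

        /* Loop clauses: collapse the 2D nest into one parallel loop and
           combine the per-thread maximum error with a reduction. */
        #pragma acc parallel loop collapse(2) reduction(max:err)
        for (int j = 1; j < n - 1; ++j) {
            for (int i = 1; i < m - 1; ++i) {
                float v = 0.25f * (a[j*m + i - 1] + a[j*m + i + 1] +
                                   a[(j-1)*m + i] + a[(j+1)*m + i]);
                float d = v - a[j*m + i];
                if (d < 0.0f) d = -d;
                anew[j*m + i] = v;
                err = (d > err) ? d : err;
            }
        }

        /* Copy the new interior values back into a for the next sweep. */
        #pragma acc parallel loop collapse(2)
        for (int j = 1; j < n - 1; ++j)
            for (int i = 1; i < m - 1; ++i)
                a[j*m + i] = anew[j*m + i];

        ++iter;
    }
    return iter;
}
```

Whether `collapse(2)`, explicit gang/vector clauses, or tiling gives the best performance depends on the problem size and the GPU, which is why profiling after each change is part of the workflow.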
OpenACC is a high-level, directive-based programming model for GPUs that relies on compiler hints. With OpenACC, a programmer can take advantage of the benefits of GPUs with few code changes and through incremental improvements to existing code. This lab is intended for existing OpenACC programmers who want to take their skills to the next level by overlapping data copies with GPU computation using a simple technique known as pipelining. When it’s impossible to completely eliminate the need to copy data to and from GPU memory, pipelining can make these copies nearly free. In 90 minutes, you will work through a number of exercises, including:
• Using the OpenACC routine directive to allow on-device function calls
• Breaking up large work into bite-sized pieces
• Working on these pieces asynchronously from the CPU
• Overlapping GPU computation and PCIe data motion
Some OpenACC experience is required to take this lab. For an introduction to OpenACC, please see our other labs.
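The pipelining idea described above can be sketched roughly as follows. The function names, the block count, and the use of three async queues are assumptions chosen for illustration and are not taken from the lab. Without an OpenACC compiler the pragmas are ignored and the blocks are simply processed in order.

```c
#include <assert.h>

/* Hypothetical per-element function. The routine directive lets it be
   called from inside a device loop. */
#pragma acc routine seq
static float scale_shift(float v)
{
    return 2.0f * v + 1.0f;
}

/* Process n elements in num_blocks pieces so that copies for one block
   can overlap with computation on another. */
void pipelined_transform(int n, int num_blocks,
                         const float *restrict in, float *restrict out)
{
    assert(n > 0 && num_blocks > 0);
    int block = (n + num_blocks - 1) / num_blocks;  /* ceiling division */

    /* Allocate device storage only; data moves block by block below. */
    #pragma acc data create(in[0:n], out[0:n])
    {
        for (int b = 0; b < num_blocks; ++b) {
            int start = b * block;
            int len = (start + block <= n) ? block : n - start;
            if (len <= 0)
                break;
            int q = b % 3;  /* assumed: cycle through three async queues */

            /* Copy in, compute, and copy out are queued in order on
               queue q, while the host moves on to the next block. */
            #pragma acc update device(in[start:len]) async(q)
            #pragma acc parallel loop async(q)
            for (int i = start; i < start + len; ++i)
                out[i] = scale_shift(in[i]);
            #pragma acc update self(out[start:len]) async(q)
        }
        /* Block until every queue has drained before leaving the region. */
        #pragma acc wait
    }
}
```

Operations on the same queue run in order, while different queues may overlap, so one block's PCIe transfer can hide behind another block's kernel.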
Introduction to Multi GPU Programming with MPI and OpenACC
In this lab, you will learn how to program multi-GPU systems or GPU clusters using the Message Passing Interface (MPI) and OpenACC. Basic knowledge of MPI and OpenACC is a prerequisite. The topics covered by this lab are:
• Exchanging data between different GPUs using CUDA-aware MPI and OpenACC
• Handling GPU affinity in multi-GPU systems
• Overlapping communication with computation to hide communication times
• Optionally, using the NVIDIA performance analysis tools
Recommended prerequisites for this lab are C or Fortran, basic OpenACC, and basic MPI.
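One common way to handle GPU affinity is to map each MPI rank on a node to one of that node's GPUs. The sketch below shows only the mapping arithmetic, so that it compiles without MPI or an OpenACC runtime. The helper name `device_for_local_rank` and the round-robin policy are assumptions for illustration, not necessarily what the lab uses.

```c
#include <assert.h>

/* Hypothetical helper: pick a GPU index for a process.
   local_rank  - rank of this process among the processes on its node
   num_devices - number of GPUs visible on that node
   Round-robin assignment spreads ranks evenly when there are more
   ranks than GPUs. */
int device_for_local_rank(int local_rank, int num_devices)
{
    assert(local_rank >= 0 && num_devices > 0);
    return local_rank % num_devices;
}

/* In an MPI + OpenACC program the inputs would typically come from:
 *
 *   MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
 *                       MPI_INFO_NULL, &local_comm);
 *   MPI_Comm_rank(local_comm, &local_rank);
 *   num_devices = acc_get_num_devices(acc_device_nvidia);
 *
 * and the result would be applied with:
 *
 *   acc_set_device_num(device_for_local_rank(local_rank, num_devices),
 *                      acc_device_nvidia);
 */
```

Setting the device before any other OpenACC work ensures that all later data regions and kernels of that rank use the chosen GPU.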
Advanced Multi GPU Programming with MPI and OpenACC
In this self-paced, hands-on lab, you will learn how to improve a multi-GPU MPI+OpenACC program. It is a follow-up to the Introduction to Multi GPU Programming with MPI and OpenACC lab. Knowledge of how to program multiple GPUs with MPI and OpenACC is a prerequisite. The topics covered by this lab are:
• Overlapping communication with computation to hide communication times
• Handling noncontiguous halo updates with a 2D tiled domain decomposition
Recommended prerequisites: C or Fortran, basic OpenACC, and basic MPI.
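In a 2D tiled decomposition of a row-major grid, the left and right halo columns are not contiguous in memory. One way to handle this is to pack a column into a contiguous buffer before sending it and to unpack it after receiving. The sketch below illustrates that idea; the function names and signatures are hypothetical, and the MPI exchange itself is omitted. Without an OpenACC compiler the pragmas are ignored and the loops run on the CPU.

```c
#include <assert.h>

/* Hypothetical helper: gather column col of an ny x nx row-major tile
   into the contiguous buffer buf (length ny). */
void pack_column(int ny, int nx, int col,
                 const float *restrict tile, float *restrict buf)
{
    assert(ny > 0 && nx > 0 && col >= 0 && col < nx);

    /* If the arrays are already on the device through an enclosing data
       region, these clauses reuse them and no extra copy is made. */
    #pragma acc parallel loop copyin(tile[0:ny*nx]) copyout(buf[0:ny])
    for (int j = 0; j < ny; ++j)
        buf[j] = tile[j * nx + col];
}

/* Hypothetical helper: scatter a received contiguous buffer back into
   column col of the tile. */
void unpack_column(int ny, int nx, int col,
                   const float *restrict buf, float *restrict tile)
{
    assert(ny > 0 && nx > 0 && col >= 0 && col < nx);

    #pragma acc parallel loop copy(tile[0:ny*nx]) copyin(buf[0:ny])
    for (int j = 0; j < ny; ++j)
        tile[j * nx + col] = buf[j];
}
```

With CUDA-aware MPI, the packed buffer's device address can be passed to the send and receive calls from inside a `host_data use_device(buf)` region, so the halo data never has to pass through host memory.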