Practical GPU Programming: High-performance computing with CUDA, CuPy, and Python on modern GPUs - Paperback

$84.65

Practical GPU Programming: High-performance computing with CUDA, CuPy, and Python on modern GPUs - Paperback

Name: Practical GPU Programming: High-performance computing with CUDA, CuPy, and Python on modern GPUs - Paperback
Brand: Books by splitShops
SKU: 9789349174795
Price: 84.65 USD
Availability: InStock

$84.65

Report copyright infringement

by Maris Fenlor (Author)

If you're a Python pro looking to get the most out of your code with GPUs, then Practical GPU Programming is the right book for you. This book will walk you through the basics of GPU architectures, show you hands-on parallel programming techniques, and give you the know-how to confidently speed up real workloads in data processing, analytics, and engineering.

The first thing you'll do is set up the environment, install CUDA, and get a handle on using Python libraries like PyCUDA and CuPy. You'll then dive into memory management, kernel execution, and parallel patterns like reductions and histogram computations. Then, we'll dive into sorting and search techniques, but with a focus on how GPU acceleration transforms business data processing. We'll also put a strong emphasis on linear algebra to show you how to supercharge classic vector and matrix operations with cuBLAS and CuPy. Plus, with batched computations, efficient broadcasting, custom kernels, and mixed-library workflows, you can tackle both standard and advanced problems with ease.

Throughout, we evaluate numerical accuracy and performance side by side, so you can understand both the strengths and limitations of GPU-based solutions. The book covers nearly every essential skill and modern toolkit for practical GPU programming, but it's not going to turn you into a master overnight.

Key Learnings

Boost processing speed and efficiency for data-intensive tasks.
Use CuPy and PyCUDA to write and execute custom CUDA kernels.
Maximize GPU occupancy and throughput efficiency by using optimal thread block and grid configuration.
Reduce global memory bottlenecks in kernels by using shared memory and coalesced access patterns.
Perform dynamic kernel compilation to ensure tailored performance.
Use CuPy to carry out custom, high-speed elementwise GPU operations and expressions.
Implement bitonic and radix sort algorithms for large or batch integer datasets.
Execute parallel linear search kernels to detect patterns rapidly.
Scale matrix operations using Batched GEMM and high-level cuBLAS routines.

Table of Content

Introduction to GPU Fundamentals
Setting up GPU Programming Environment
Basic Data Transfers and Memory Types
Simple Parallel Patterns
Introduction to Kernel Optimization
Working with PyCUDA and CuPy Features
Practical Sorting and Search
Linear Algebra Essentials on GPU

Number of Pages: 130

Dimensions: 0.28 x 9.25 x 7.5 IN

Publication Date: February 20, 2025

Intentional design

We make things that work better and last longer. Our products solve real problems with clean design.

Quality first

We obsess over the details and strive to deliver the best products at the best prices, every time.

Customer care

We're always on your side: keeping our loyal customers happy is our top priority and number one goal.

Feature 1

Made with care and unconditionally loved by our customers, this signature bestseller exceeds all expectations.

Feature 2

Made with care and unconditionally loved by our customers, this signature bestseller exceeds all expectations.

At the heart of every product lies a unique story, driven by our passion for quality and innovation. Each item enhances your everyday life and sparks joy.