cuda_fan

GPU enthusiast & CUDA developer

Passionate about parallel computing, deep learning, and graphics programming.

Optimizing Matrix Multiplication with CUDA

Posted on July 12, 2025 • 5 min read

In this tutorial, we dive deep into shared memory, warp shuffles, and loop unrolling to squeeze more performance out of matrix multiplication on your GPU.

Getting Started with cuDNN for Deep Learning

Posted on June 28, 2025 • 4 min read

Learn how to integrate cuDNN into your PyTorch workflow and achieve significant speedups on convolutional layers.

Building a Real-Time Ray Tracer with OptiX

Posted on May 15, 2025 • 7 min read

Explore NVIDIA OptiX’s pipeline architecture to render stunning scenes at interactive frame rates.