This repository adds one CUDA kernel each day :)

mananchawla2005/gpukernels

GPU Programming Learning Journey

A collection of GPU kernels implemented one day at a time, progressing from basic to advanced concepts.

Prerequisites

  • NVIDIA GPU with CUDA support
  • CUDA Toolkit installed
  • Python 3.11+
  • PyTorch

Directory Structure

  • Day 1 - Basic Vector Addition in CUDA
  • Day 2 - Vector Addition with Python/PyTorch Bindings
  • Day 3 - RGB to Grayscale Conversion
  • Day 4 - RGB to Blurred Image Conversion
  • Day 5 - Simple Matrix Multiplication
  • Day 6 - Coalesced Matrix Multiplication
  • Day 7 - GELU Activation Function
  • Day 8 - Naive Batch Normalisation
  • Day 9 - Sigmoid Activation Function
  • Day 10 - Tanh Activation Function and Tiled Matrix Multiplication
  • Day 11 - Dynamic Tiled Matrix Multiplication
  • Day 12 - Layer Normalisation using Shared Memory
  • Day 13 - Matrix Transpose
  • Day 14 - Softmax using Shared Memory
  • Day 15 - GELU Forward and Backward Kernels
  • Day 16 - Querying GPU Properties
  • Day 17 - Custom NF4 Quantization Implementation
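
For orientation, here is a minimal sketch of the kind of kernel Day 1 covers (basic vector addition). It is not taken from the repository: the kernel name `vecAdd`, the array size, and the block size of 256 are illustrative assumptions, and the actual Day 1 code may be organised differently.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical kernel name: each thread adds one pair of elements.
__global__ void vecAdd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) {                                    // guard the final partial block
        c[i] = a[i] + b[i];
    }
}

int main() {
    const int n = 1 << 20;                  // assumed problem size
    const size_t bytes = n * sizeof(float);

    // Host buffers with simple, exactly representable test data.
    float* hA = (float*)malloc(bytes);
    float* hB = (float*)malloc(bytes);
    float* hC = (float*)malloc(bytes);
    for (int i = 0; i < n; ++i) {
        hA[i] = (float)i;
        hB[i] = 2.0f * (float)i;
    }

    // Device buffers and host-to-device copies.
    float *dA, *dB, *dC;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);

    // Launch enough blocks to cover all n elements (ceiling division).
    const int threads = 256;                // assumed block size
    const int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(dA, dB, dC, n);

    cudaError_t err = cudaGetLastError();   // catches launch configuration errors
    if (err != cudaSuccess) {
        fprintf(stderr, "launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);

    // Verify the result on the host.
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (hC[i] != hA[i] + hB[i]) ++mismatches;
    }
    printf("%s\n", mismatches == 0 ? "OK" : "MISMATCH");

    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    free(hA); free(hB); free(hC);
    return mismatches == 0 ? 0 : 1;
}
```

Assuming the sketch is saved as `vec_add.cu` (a hypothetical file name), it should build with `nvcc vec_add.cu -o vec_add`. The same index computation and bounds guard carry over to most of the element-wise kernels in the list above, such as the GELU, Sigmoid, and Tanh activations.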
