# lectures

**Repository Path**: btail-cat_-doc/lectures

## Basic Information

- **Project Name**: lectures
- **Description**: CUDA tutorial
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2024-06-28
- **Last Updated**: 2024-06-28

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Supplementary Material for Lectures

discord.gg/cudamode

The PMPP Book: [Programming Massively Parallel Processors: A Hands-on Approach](https://a.co/d/2S2fVzt) (Amazon link)

[YouTube Channel](https://www.youtube.com/@CUDAMODE)

## Lecture 1: Profiling and Integrating CUDA kernels in PyTorch

- [Video](https://youtu.be/LuhJEEJQgUM)
- Date: 2024-01-13, Speaker: [Mark Saroufim](https://twitter.com/marksaroufim)
- Notebook and slides in the [lecture_001](./lecture_001/) folder

## Lecture 2: Recap Ch. 1-3 from the PMPP book

- [Video](https://youtu.be/NQ-0D5Ti2dc)
- Date: 2024-01-20, Speaker: [Andreas Koepf](https://twitter.com/neurosp1ke)
- Slides: The PowerPoint file [lecture_002/cuda_mode_lecture2.pptx](./lecture_002/cuda_mode_lecture2.pptx) is in the `lecture_002` folder of this repository. Alternatively, the slides are available [here](https://docs.google.com/presentation/d/1deqvEHdqEC4LHUpStO6z3TT77Dt84fNAvTIAxBJgDck/edit#slide=id.g2b1444253e5_1_75) as a Google Slides presentation.

## Lecture 3: Getting Started With CUDA

- [Video](https://youtu.be/4sgKnKbR-WE)
- Date: 2024-01-27, Speaker: [Jeremy Howard](https://twitter.com/jeremyphoward)
- Notebook: See the [lecture_003](./lecture_003/) folder, or run the [Colab version](https://colab.research.google.com/drive/180uk6frvMBeT4tywhhYXmz3PJaCIA_uk?usp=sharing)

## Lecture 4: Intro to Compute and Memory Architecture

- [Video](https://youtu.be/lTmYrKwjSOU)
- Date: 2024-02-03, Speaker: [Thomas Viehmann](https://lernapparat.de/)
- Notebook and slides in the [lecture_004](./lecture_004/) folder

## Lecture 5: Going Further with CUDA for Python Programmers

- [Video](https://youtu.be/wVsR-YhaHlM)
- Date: 2024-02-10, Speaker: [Jeremy Howard](https://twitter.com/jeremyphoward)
- Notebook in the [lecture_005](./lecture_005/) folder

## Lecture 6: Optimizing PyTorch Optimizers

- [Video](https://www.youtube.com/watch?v=hIop0mWKPHc)
- Date: 2024-02-17, Speaker: [Jane Xu](https://github.com/janeyx99)
- [Slides](https://docs.google.com/presentation/d/13WLCuxXzwu5JRZo0tAfW0hbKHQMvFw4O/edit#slide=id.p1)

## Lecture 7: Advanced Quantization

- [Video](https://www.youtube.com/watch?v=1u9xUK3G4VM)
- Date: 2024-02-25, Speaker: [Charles Hernandez](https://github.com/HDCharles)
- [Slides](https://www.dropbox.com/scl/fi/hzfx1l267m8gwyhcjvfk4/Quantization-Cuda-vs-Triton.pdf?rlkey=s4j64ivi2kpp2l0uq8xjdwbab&dl=0)

## Lecture 8: CUDA Performance Checklist

- [Video](https://www.youtube.com/watch?v=SGhfUhlowB4)
- Date: 2024-03-09, Speaker: [Mark Saroufim](https://github.com/msaroufim)
- Code in the [lecture_008](./lecture_008/) folder
- [Slides](https://docs.google.com/presentation/d/1cvVpf3ChFFiY4Kf25S4e4sPY6Y5uRUO-X-A4nJ7IhFE/edit?usp=sharing)

## Lecture 9: Reductions

- [Video](https://www.youtube.com/watch?v=09wntC6BT5o)
- Date: 2024-03-09, Speaker: [Mark Saroufim](https://github.com/msaroufim)
- Code in the [lecture_009](./lecture_009/) folder
- [Slides](https://docs.google.com/presentation/d/1s8lRU8xuDn-R05p1aSP6P7T5kk9VYnDOCyN5bWKeg3U/edit?usp=drive_link)

## Lecture 10: Build a Prod Ready CUDA Library

- [Video](https://www.youtube.com/watch?v=FHsEW0HpuoU)
- Date: 2024-03-16, Speaker: [Oscar Amoros Huguet](https://github.com/morousg)
- [Slides](https://drive.google.com/drive/folders/158V8BzGj-IkdXXDAdHPNwUzDLNmr971_?usp=drive_link)

## Lecture 11: Sparsity

- [Video](https://youtu.be/mGDnOLcfE8g)
- Date: 2024-03-23, Speaker: [Jesse Cai](https://github.com/jcaip)
- [Slides](./lecture_011/sparsity.pptx)

## Lecture 12: Flash Attention

- [Video](https://www.youtube.com/watch?v=zEuwuCTEf_0)
- Date: 2024-03-30, Speaker: [Thomas Viehmann](https://lernapparat.de/)

## Lecture 13: Ring Attention

- [Video](https://www.youtube.com/watch?v=ws7angQYIxI)
- Date: 2024-04-06, Speaker: [Andreas Koepf](https://twitter.com/neurosp1ke)
- [Slides](./lecture_013/ring_attention.pptx)

## Lecture 14: Practitioner's Guide to Triton

- [Video](https://www.youtube.com/watch?v=DdTsX6DQk24)
- Date: 2024-04-13, Speaker: [Umer Adil](https://twitter.com/UmerHAdil)
- [Notebook](./lecture_014/A_Practitioners_Guide_to_Triton.ipynb)

## Lecture 15: CUTLASS

- Date: 2024-04-20, Speaker: [Eric Auld](https://github.com/ericauld)

## Lecture 16: Hands-On Profiling

- Date: 2024-04-27, Speaker: [Taylor Robie](https://www.linkedin.com/in/taylor-robie/)

## Bonus Lecture: CUDA C++ llm.cpp

- Date: 2024-04-27, Speaker: Jake Hemstad & Georgii Evtushenko
- [Slides](https://drive.google.com/drive/folders/1T-t0d_u0Xu8w_-1E5kAwmXNfF72x-HTA)

## Lecture 17: GPU Collective Communication (NCCL)

- Date: 2024-05-04, Speaker: [Dan Johnson](https://physbam.stanford.edu/~dansj/)
- Code in the [lecture_017](./lecture_017/) folder

## Lecture 18: Fused Kernels

- Date: 2024-05-11, Speaker: [Kapil Sharma](https://www.kapilsharma.dev/)
- Code in the [lecture_018](./lecture_018/) folder

## Lecture 19: Data Processing on GPUs

- Date: 2024-05-18, Speaker: [Devavret Makkar](https://github.com/devavret)

## Lecture 20: Scan Algorithm

- Date: 2024-05-25, Speaker: [Izzat El Hajj](https://ielhajj.github.io/)
- [Slides](https://docs.google.com/presentation/d/1MEMsE5LKi6ush_60hlYu3-cz4DUCFzSL/edit?usp=sharing&ouid=106222972308395582904&rtpof=true&sd=true)

## Lecture 21: Scan Algorithm Part 2

- Date: 2024-05-31, Speaker: [Izzat El Hajj](https://ielhajj.github.io/)
- [Slides](https://docs.google.com/presentation/d/1MEMsE5LKi6ush_60hlYu3-cz4DUCFzSL/edit?usp=sharing&ouid=106222972308395582904&rtpof=true&sd=true)

## Lecture 22: Hacker's Guide to Speculative Decoding in vLLM

- Date: 2024-06-01, Speaker: [Cade Daniel](https://x.com/cdnamz)
- [Slides](https://docs.google.com/presentation/d/1p1xE-EbSAnXpTSiSI0gmy_wdwxN5XaULO3AnCWWoRe4/edit#slide=id.p)

## Lecture 23: Tensor Cores

- Date: 2024-06-07, Speaker: Vijay Thakkar & Pradeep Ramani
- [Slides](https://drive.google.com/file/d/18sthk6IUOKbdtFphpm_jZNXoJenbWR8m/view)

## Lecture 24: Scan at the Speed of Light

- Date: 2024-06-08, Speaker: Jake Hemstad & Georgii Evtushenko
- Slides: TODO