Benjamin Fattori
Blog
14 Jan, 2026
Implementing a 2:4 Sparse GEMM Kernel with Tensor Cores
14 Oct, 2025
Sectors, Coalescing and Vector Loads in CUDA
26 Dec, 2024
A fast RG-LRU Kernel
31 Mar, 2023
ZeRO Optimizer Sharding with xmap and pjit