# torch-blocksparse

**Repository Path**: zongkw/torch-blocksparse

## Basic Information

- **Project Name**: torch-blocksparse
- **Description**: Block-sparse primitives for PyTorch
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-05-16
- **Last Updated**: 2024-06-09

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

WARNING: This project is now deprecated. Please use the `triton.ops.blocksparse` module in [Triton](https://github.com/ptillet/triton).

# Torch-Blocksparse

Block-sparse operations for PyTorch.

# Supported Operations

The following operations are supported:

```
Convolutions with block-sparse weights: layout has format [K//block, C//block, R, S]; padding/stride supported.
Sparse Multi-Head Attention (https://arxiv.org/abs/1904.10509)
Batched Matrix Multiplication: SPARSE = op(DENSE)  x op(DENSE)
Batched Matrix Multiplication: DENSE  = op(SPARSE) x op(DENSE)
Batched Matrix Multiplication: DENSE  = op(DENSE)  x op(SPARSE)
Softmax: SPARSE = Softmax(SPARSE)
```

where `op()` is either the identity or a transposition. Inputs may be FP32 or FP16 (the latter using tensor cores).

## Usage

```python
import torch
import torch_blocksparse

# Z: non-sparse batch dimension
# H: sparse batch dimension
# M: row dimension
# N: column dimension
# K: reduction dimension
Z, H, M, N, K = 4, 2, 256, 512, 384
# a is transposed by the kernel (trans_a=True below), so its last two
# dimensions are (K, M) rather than (M, K)
a = torch.rand((Z, H, K, M), dtype=torch.float32).cuda()
b = torch.rand((Z, H, K, N), dtype=torch.float32).cuda()

# create sparsity layout: one entry per (block x block) tile of the M x N output
block = 16
layout = torch.randint(0, 2, (H, M//block, N//block))

# create object for Sparse = trans(Dense) x Dense (sdd);
# construction has some overhead, as it pre-computes the look-up tables
# needed internally by the GPU kernels
dot = torch_blocksparse.MatMul(layout, block, 'sdd', trans_a=True, trans_b=False)
c = dot(a, b)

# create object for Sparse = softmax(Sparse)
softmax = torch_blocksparse.Softmax(layout, block)
d = softmax(c)
```
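
The layout tensor above is just a binary block mask: entry `layout[h, i, j]` says whether the `block x block` tile at block-row `i`, block-column `j` of sparse batch `h` is kept. The snippet below is a small standalone sketch in plain PyTorch (no torch-blocksparse calls) that expands such a layout into a dense element-wise mask, which can be handy for checking sparse results against a dense reference.

```python
import torch

# Same dimensions as the usage example above.
H, M, N, block = 2, 256, 512, 16
layout = torch.randint(0, 2, (H, M // block, N // block))

# Expand the [H, M//block, N//block] block mask into a dense [H, M, N]
# element-wise mask by repeating each entry over a block x block tile.
dense_mask = layout.repeat_interleave(block, dim=1) \
                   .repeat_interleave(block, dim=2)

print(dense_mask.shape)                                   # torch.Size([2, 256, 512])
print(layout.sum().item(), "non-zero blocks out of", layout.numel())
```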
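
For the other matrix-multiplication variants listed under Supported Operations, only the `'sdd'` mode string is shown explicitly above. The sketch below is a hedged, hypothetical example: it assumes the mode string `'dsd'` (DENSE = op(SPARSE) x op(DENSE)) mirrors the `'sdd'` naming and that the sparse operand is the compressed tensor produced by an `'sdd'` product or `Softmax`, as in a sparse-attention-style pipeline; it also uses FP16 inputs to exercise the tensor-core path mentioned above.

```python
import torch
import torch_blocksparse

# Hypothetical continuation of the usage example; 'dsd' is assumed to mean
# DENSE = op(SPARSE) x op(DENSE), by analogy with the documented 'sdd' mode.
Z, H, M, N, K, P = 4, 2, 256, 512, 384, 64
block = 16
layout = torch.randint(0, 2, (H, M // block, N // block))

sdd     = torch_blocksparse.MatMul(layout, block, 'sdd', trans_a=True, trans_b=False)
softmax = torch_blocksparse.Softmax(layout, block)
dsd     = torch_blocksparse.MatMul(layout, block, 'dsd', trans_a=False, trans_b=False)

# FP16 inputs; q is stored transposed because trans_a=True above.
q = torch.rand((Z, H, K, M), dtype=torch.float16, device='cuda')
k = torch.rand((Z, H, K, N), dtype=torch.float16, device='cuda')
v = torch.rand((Z, H, N, P), dtype=torch.float16, device='cuda')

w   = softmax(sdd(q, k))   # sparse weights over the M x N block layout
out = dsd(w, v)            # dense result, logically of shape (Z, H, M, P)
```

Note that the same `layout` object is shared by all three operators, so the pre-computed look-up tables refer to one consistent block pattern throughout the pipeline.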