# aiter

**Repository Path**: yunfeiliu/aiter

## Basic Information

- **Project Name**: aiter
- **Description**: https://github.com/ROCm/aiter
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-08-09
- **Last Updated**: 2026-03-24

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# aiter

![image](https://github.com/user-attachments/assets/9457804f-77cd-44b0-a088-992e4b9971c6)

AITER is AMD's centralized repository of high-performance AI operators for accelerating AI workloads. It provides a unified place for customer operator-level requests and can accommodate the needs of different customers. Developers can focus on operators, while customers integrate this op collection into their own private or public frameworks.

Summary of the features:

* C++-level API
* Python-level API
* The underlying kernels can come from Triton, CK, or assembly
* Not just inference kernels, but also training kernels and GEMM + communication kernels, allowing workarounds for any architecture limitation in any kernel-framework combination

## Installation

```bash
git clone --recursive https://github.com/ROCm/aiter.git
cd aiter
python3 setup.py develop
```

If you forgot `--recursive` during `clone`, you can run the following command after `cd aiter`:

```bash
git submodule sync && git submodule update --init --recursive
```

### FlyDSL (Optional)

AITER's FusedMoE supports [FlyDSL](https://pypi.org/project/flydsl/)-based kernels for mixed-precision MoE (e.g., A4W4). FlyDSL is optional; when it is not installed, AITER automatically falls back to CK kernels.

```bash
pip install --pre flydsl
```

Or install all optional dependencies at once:

```bash
pip install -r requirements.txt
```

### Triton-based Communication (Iris)

AITER supports GPU-initiated communication using the [Iris library](https://github.com/ROCm/iris).
This enables high-performance Triton-based communication primitives such as reduce-scatter and all-gather.

**Installation**

Install with Triton communication support:

```bash
# Install AITER with Triton communication dependencies
pip install -e .
pip install -r requirements-triton-comms.txt
```

For more details, see [docs/triton_comms.md](docs/triton_comms.md).

## Run operators supported by aiter

There are a number of op tests; you can run them like this:

`python3 op_tests/test_layernorm2d.py`

| **Ops** | **Description** |
|-----------|-------------------------------------------------------------------------------------------------------------------|
| ELEMENTWISE | Elementwise ops: `+ - * /` |
| SIGMOID | sigmoid(x) = 1 / (1 + e^-x) |
| ALLREDUCE | Reduce + Broadcast |
| KVCACHE | W_K, W_V |
| MHA | Multi-Head Attention |
| MLA | Multi-head Latent Attention with [KV-Cache layout](https://docs.flashinfer.ai/tutorials/kv_layout.html#page-table-layout) |
| PA | Paged Attention |
| FusedMoe | Mixture of Experts |
| QUANT | BF16/FP16 -> FP8/INT4 |
| RMSNORM | Root mean square normalization |
| LAYERNORM | x = (x - μ) / (σ² + ε)^0.5 |
| ROPE | Rotary Position Embedding |
| GEMM | D = αAB + βC |
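Several of the formulas in the op table are simple enough to express as pure-Python reference implementations. The sketch below shows the math only; it is not AITER's code, whose kernels are optimized Triton/CK/assembly implementations.

```python
import math

# Pure-Python reference implementations of three formulas from the op table.
# Illustrative only -- not AITER's actual kernels or API.

def sigmoid(x):
    # SIGMOID: sigmoid(x) = 1 / (1 + e^-x)
    return 1.0 / (1.0 + math.exp(-x))

def rmsnorm(xs, eps=1e-6):
    # RMSNORM: x_i / sqrt(mean(x^2) + eps)
    rms = math.sqrt(sum(x * x for x in xs) / len(xs) + eps)
    return [x / rms for x in xs]

def layernorm(xs, eps=1e-6):
    # LAYERNORM: (x - mu) / sqrt(sigma^2 + eps)
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return [(x - mu) / math.sqrt(var + eps) for x in xs]
```

The op tests under `op_tests/` compare AITER's optimized kernels against references like these (typically in PyTorch) for numerical correctness.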
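The optional FlyDSL dependency described in the installation section follows a common try-import fallback pattern. A minimal sketch of that general pattern is below; the names `HAS_FLYDSL` and `select_moe_backend` are hypothetical and do not reflect AITER's actual internals.

```python
# Sketch of an optional-dependency fallback, as described for FlyDSL above.
# All names here are illustrative assumptions, not AITER's real API.
try:
    import flydsl  # optional: enables FlyDSL-based mixed-precision MoE kernels
    HAS_FLYDSL = True
except ImportError:
    HAS_FLYDSL = False  # fall back to CK kernels

def select_moe_backend() -> str:
    """Pick a MoE kernel backend based on optional-dependency availability."""
    return "flydsl" if HAS_FLYDSL else "ck"
```

This pattern keeps the base install lightweight: the import cost and failure mode are confined to one module-level check, and the rest of the code only consults the resulting flag.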