# TransAct

This repository is the official implementation of the ACL 2024 paper [Pruning Large Language Models to Intra-module Low-rank Architecture with Transitional Activations](https://aclanthology.org/2024.findings-acl.582/).

| ![transact.png](assets/transact.png) | ![latency.png](assets/latency.png)  |
| :----------------------------------: | :---------------------------------: |
|         Pruning architecture         | Latency on a Xiaomi 14 mobile phone |

## Training and Evaluation

1. Prepare the environment following `transact.dockerfile`.
2. Create symlinks to your data, models, outputs, and HuggingFace cache (mainly for large [datasets](https://github.com/huggingface/datasets)):

   ```sh
   ln -s /path/to/data data
   ln -s /path/to/models models
   ln -s /path/to/outputs outputs
   ln -s /path/to/hf-cache hf-cache
   ```

3. Tweak the training configs `train_config.yaml` and `deepspeed.json` (an illustrative sketch is given in the appendix below).
4. Run the training script `run_trainer.sh`, for example:

   ```sh
   bash run_trainer.sh -m all \
       -a 768 -f 1536 \
       -n 128 -k 8 -p acts \
       -l 4096 -t 50 \
       -g 64 -b 4 \
       -d togethercomputer/RedPajama-Data-1T \
       -x llama -y llama2 -z 7B
   ```

   Run `bash run_trainer.sh -h` for help.

5. Run the evaluation script `eval.sh`, for example:

   ```sh
   bash eval.sh -m all \
       -a 768 -f 1536 \
       -n 128 -k 8 -p acts \
       -l 4096 -t 50 \
       -d togethercomputer/RedPajama-Data-1T \
       -x llama -y llama2 -z 7B
   ```

   Run `bash eval.sh -h` for help.

## Citations

Please cite the paper if you find this repository useful.

```bibtex
@inproceedings{shen-etal-2024-pruning,
    title = "Pruning Large Language Models to Intra-module Low-rank Architecture with Transitional Activations",
    author = "Shen, Bowen and Lin, Zheng and Zha, Daren and Liu, Wei and Luan, Jian and Wang, Bin and Wang, Weiping",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2024",
    year = "2024",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.findings-acl.582",
    doi = "10.18653/v1/2024.findings-acl.582",
    pages = "9781--9793",
}
```
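
## Appendix: Example `deepspeed.json` Sketch

The repository ships its own `train_config.yaml` and `deepspeed.json`, and the snippet below is not copied from either of them. It is a minimal sketch of the kind of standard DeepSpeed fields you would typically adjust in step 3, assuming that the `-b 4` and `-g 64` flags in the example run map to the per-GPU micro-batch size and gradient accumulation steps, and that your GPUs support bf16.

```sh
# Hypothetical config for illustration only -- the repo's actual
# deepspeed.json may use different fields and values.
cat > deepspeed.json <<'EOF'
{
  "train_micro_batch_size_per_gpu": 4,
  "gradient_accumulation_steps": 64,
  "gradient_clipping": 1.0,
  "bf16": { "enabled": true },
  "zero_optimization": { "stage": 2 }
}
EOF
```

All of these keys are standard DeepSpeed options. Keeping the micro-batch small and raising `gradient_accumulation_steps` trades step latency for memory headroom, which is usually the safer default when continuing training of a 7B-parameter model.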