# 🚀 LoRA-S: An Efficient Low Rank Adaptation Scheme via Sylvester Equation

This repo contains the source code for the experiments in our paper *LoRA-S: An Efficient Low Rank Adaptation scheme via Sylvester equation*.

*Jinyang ZHENG, Tong WU*

📄 Paper: [https://openreview.net/pdf?id=Guo2XGgxZA](https://openreview.net/pdf?id=Guo2XGgxZA)

Numerous studies on low-rank adaptation (LoRA) have emerged in recent years, with the aim of accelerating the convergence of the LoRA framework. In this paper, we leverage the horizontal lift theory from differential geometry to establish the general iteration scheme on the quotient manifold \( \mathbb{R}_{*}^{m \times r} \times \mathbb{R}_{*}^{n \times r} / \sim \). By endowing the LoRA framework with Riemannian quotient geometries, our theory not only guarantees efficient feature learning but also bridges the LoRA algorithms and the pre-training algorithms for large models.
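
As a rough illustration of why a quotient manifold appears here, the sketch below assumes the equivalence relation \( \sim \) is the usual one for fixed-rank factorizations, \( (A, B) \sim (AM, BM^{-\top}) \) for any invertible \( r \times r \) matrix \( M \). The README does not spell out the relation, so treat this as an assumption; the names `A`, `B`, and `M` are ours and do not come from the repository code.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 6, 5, 2

# LoRA-style factors: the weight update is the rank-r product A @ B.T.
A = rng.standard_normal((m, r))
B = rng.standard_normal((n, r))

# A fixed invertible r x r matrix (determinant 5.5), chosen only for illustration.
M = np.array([[2.0, 1.0],
              [0.5, 3.0]])

# Assumed equivalence: (A, B) ~ (A M, B M^{-T}).
A2 = A @ M
B2 = B @ np.linalg.inv(M).T

# Both factor pairs represent the same weight update.
delta_w = A @ B.T
delta_w2 = A2 @ B2.T
same_update = np.allclose(delta_w, delta_w2)
```

Because many factor pairs give the same update, an optimizer that treats `A` and `B` as plain Euclidean variables can behave differently on equivalent pairs. Working on the quotient is one way to make the iteration depend only on the update itself.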

Furthermore, we theoretically analyze the role of the weight decay matrix \( \epsilon_{\text{decay}} I \) in efficient feature learning and then replace it with the Sylvester matrix \( K \), showing that the theory helps remove an important hyperparameter while producing accurate and computationally efficient optimizers. Based on the general scheme, we propose two efficient LoRA optimizers with runtime analysis, **Adam-Sylvester (AdamS)** and **LRACS**, and conduct experiments on transformer-based networks. The results demonstrate evident improvements over existing optimizers.
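
For readers unfamiliar with the term, a Sylvester equation has the form \( PX + XQ = C \). The sketch below is not the LoRA-S update: the README does not state which Sylvester equation defines \( K \), so the equation solved here is purely illustrative. It also assumes that \( \epsilon_{\text{decay}} I \) enters as a damping term added to an \( r \times r \) Gram matrix, as in damped preconditioners of the form \( (B^\top B + \epsilon I)^{-1} \). The helper `solve_sylvester_dense` is a hypothetical name, not a function from this repository.

```python
import numpy as np

def solve_sylvester_dense(P, Q, C):
    """Solve P @ X + X @ Q = C for X by Kronecker vectorization.

    P is (p, p), Q is (q, q), C and X are (p, q). Only sensible for small
    blocks (e.g. r x r), because the linear system has size (p*q) x (p*q).
    """
    p, q = C.shape
    # Column-major vec: vec(P X) = (I_q kron P) vec(X), vec(X Q) = (Q^T kron I_p) vec(X).
    lhs = np.kron(np.eye(q), P) + np.kron(Q.T, np.eye(p))
    x = np.linalg.solve(lhs, C.reshape(-1, order="F"))
    return x.reshape((p, q), order="F")

rng = np.random.default_rng(0)
n, r = 8, 3
B = rng.standard_normal((n, r))
gram = B.T @ B  # r x r Gram matrix, positive definite when B has full column rank

# Assumed form of a damped preconditioner; eps is a hyperparameter that must be tuned.
eps = 1e-6
damped_inverse = np.linalg.inv(gram + eps * np.eye(r))

# Illustrative Sylvester equation (NOT the paper's): gram @ X + X @ gram = C.
C = rng.standard_normal((r, r))
X = solve_sylvester_dense(gram, gram, C)
residual = np.linalg.norm(gram @ X + X @ gram - C)
```

For larger matrices, `scipy.linalg.solve_sylvester(a, b, q)` solves the same form \( AX + XB = Q \) without building the Kronecker system.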

## 📚 Repository Overview

In this project, we experiment with GPT-2 fine-tuning and Mix-of-Show fine-tuning.

### 1. 🤖 GPT-2 Fine-Tuning

See [GPT-2/](GPT2) for experiment code.

### 2. 🎨 Mix-of-Show Fine-Tuning

See [Mix-of-Show/](Mix-of-Show) for experiment code.

## 🔁 Reproducibility

See the Parameter Reference in each section for the parameter choices used in each experiment.

## 📬 Contact

Please contact us or post an issue if you have any questions.

* Jinyang ZHENG ( jzhengbp@connect.ust.hk )

## 🙏 References and Acknowledgements

This work has been heavily influenced by recent developments in low-rank matrix optimization research and parameter-efficient fine-tuning (PEFT) research. We cite several important references here; a more complete reference list is presented in our [paper](https://openreview.net/pdf?id=Guo2XGgxZA).

Moreover, our experimental code is mainly built on the following repositories:

- [LoRA (Hu et al., 2021)](https://arxiv.org/abs/2106.09685)
- [Mix-of-Show (Gu et al., 2023)](https://arxiv.org/abs/2305.18292)
- [Riemannian_Preconditioned_LoRA (Zhang et al., 2024)](https://github.com/pilancilab/Riemannian_Preconditioned_LoRA)
- [LoRA-Rite (Yen et al., 2025)](https://github.com/gkevinyen5418/LoRA-RITE)

## 📝 Citation

```BibTeX
@inproceedings{
zheng2026loras,
title={Lo{RA}-S: An Efficient Low Rank Adaptation scheme via Sylvester equation},
author={Jinyang ZHENG and Tong Wu},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=Guo2XGgxZA}
}
```