# so-vits-svc-fork

**Repository Path**: geekwish/so-vits-svc-fork

## Basic Information

- **Project Name**: so-vits-svc-fork
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2024-01-19
- **Last Updated**: 2024-05-30

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# SoftVC VITS Singing Voice Conversion


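A fork of [`so-vits-svc4.0(V1)`](https://github.com/svc-develop-team/so-vits-svc) that supports real-time inference and a graphical inference interface, and is compatible with its models.

## New features

- **Real-time voice conversion** (enhanced in v1.1.0)
- Integrates [`QuickVC`](https://github.com/quickvc/QuickVC-VoiceConversion)
- Fixes the misuse of [`ContentVec`](https://github.com/auspicious3000/contentvec) in the original repository[^c]
- More accurate pitch estimation using CREPE
- GUI and unified CLI
- Twice the previous training speed
- Ready to use after installing with `pip`; installing `fairseq` is not required
- Automatically downloads the pretrained models and the HuBERT model
- Code fully formatted with black, isort, autoflake, etc.

[^c]: [#206](https://github.com/34j/so-vits-svc-fork/issues/206)

## Installation

### One-click installation with a bat file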

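### This bat file is a Chinese localization of the English version, with some localization work and optimizations applied to the original. If the installation runs into problems, try installing the original version instead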

### Manual installation

Create a virtual environment

Windows:

```shell
py -3.10 -m venv venv
venv\Scripts\activate
```

Linux/MacOS:

```shell
python3.10 -m venv venv
source venv/bin/activate
```

Anaconda:

```shell
conda create -n so-vits-svc-fork python=3.10 pip
conda activate so-vits-svc-fork
```

If Python is installed in Program Files, installing without first creating a virtual environment may raise a `PermissionError`.
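To confirm that the virtual environment is actually active before installing, a small check can help. The following is a minimal sketch, not part of so-vits-svc-fork: the function names `in_virtualenv` and `is_python_310` are hypothetical. It relies on the fact that, inside a `venv`, `sys.prefix` differs from `sys.base_prefix`. An Anaconda environment is not a `venv`, so this check does not detect it.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv/virtualenv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


def is_python_310(version_info=sys.version_info) -> bool:
    """Return True for Python 3.10.x, the version used in the commands above."""
    return tuple(version_info[:2]) == (3, 10)


if __name__ == "__main__":
    print("inside virtual environment:", in_virtualenv())
    print("python 3.10:", is_python_310())
```

Run it with the same `python` that will be used for `pip install`; if the first line prints `False` on a Program Files installation, activate the environment first.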

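### Installation

Install with pip (or with a package manager that uses pip):

```shell
python -m pip install -U pip setuptools wheel
pip install -U torch torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install -U so-vits-svc-fork
```

- If no GPU is available or you are using MacOS, skip `pip install -U torch torchaudio --index-url https://download.pytorch.org/whl/cu118`. MPS is probably supported already.
- If you are using an AMD GPU on Linux, replace `--index-url https://download.pytorch.org/whl/cu118` with `--index-url https://download.pytorch.org/whl/rocm5.4.2`. AMD GPUs are not supported on Windows (#120).

### Update

Update often to get the latest features and bug fixes:

```shell
pip install -U so-vits-svc-fork
```

## Usage

### Inference

#### GUI

![GUI](https://raw.githubusercontent.com/34j/so-vits-svc-fork/main/docs/_static/gui.png)

Run the GUI with the following command:

```shell
svcg
```

#### CLI

- Real-time conversion (input from microphone)

```shell
svc vc
```

- Conversion from a file

```shell
svc infer source.wav
```

[Pretrained models](https://huggingface.co/models?search=so-vits-svc-4.0) are available on HuggingFace.

#### Notes

- If you are using WSL, note that WSL needs additional setup to handle audio, and the GUI will not work properly if it cannot find an audio device.
- In real-time voice conversion, if the input contains noise, the HuBERT model will run inference on the noise as well. Consider using a real-time noise reduction application such as [RTX Voice](https://www.nvidia.com/en-us/geforce/guides/nvidia-rtx-voice-setup-guide/) to address this.

### Training

#### Preprocessing

- If the dataset contains BGM, remove it with software such as [Ultimate Vocal Remover](https://ultimatevocalremover.com/). `3_HP-Vocal-UVR.pth` or `UVR-MDX-NET Main` is recommended. [^1]
- If the dataset consists of long audio files with a single singer, use `svc pre-split` to split it into multiple files (uses `librosa`).
- If the dataset consists of long audio files with multiple singers, use `svc pre-sd` to split it into multiple files (uses `pyannote.audio`). Manual classification may be needed to improve accuracy. If the singers' voices are varied, set `--min-speakers` to a value larger than the actual number of speakers.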
  If the dependency is reported as missing, install `pyannote.audio` with `pip install pyannote-audio`.

[^1]: https://ytpmv.info/how-to-use-uvr/

#### Cloud

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/34j/so-vits-svc-fork/blob/main/notebooks/so-vits-svc-fork-4.0.ipynb)
[![Open In Paperspace](https://img.shields.io/badge/Open%20in-Paperspace-blue?style=flat-square&logo=paperspace)](https://console.paperspace.com/github/34j/so-vits-svc-fork-paperspace/blob/main/so-vits-svc-fork-4.0-paperspace.ipynb)
[Paperspace Referral](https://www.paperspace.com/?r=9VJN74I)[^p]

If you do not have access to a GPU with more than 10 GB of VRAM, the free plan of Google Colab is recommended for light users, and the Pro/Growth Plan of Paperspace for heavy users. If you do have a high-end GPU, there is no need to use a cloud service.

[^p]: If you register a referral code and then add a payment method, you may save about $5 on your first month's monthly billing. Note that both referral rewards are Paperspace credits and not cash. It was a tough decision, but the link was inserted because debugging and training the initial model requires a large amount of computing power and the developer is a student.

#### Local

Arrange the dataset as `dataset_raw/{speaker_id}/**/{wav_file}.{any_format}` (subfolders and non-ASCII filenames are supported), then run:

```shell
svc pre-resample
svc pre-config
svc pre-hubert
svc train -t
```

#### Notes

- Each file in the dataset should be shorter than 10 s; otherwise the GPU will run out of VRAM.
- It is recommended to increase `batch_size` in `config.json` before running the `train` command so that it matches the available VRAM. Setting `batch_size` to `auto-{init_batch_size}-{max_n_trials}` (or simply `auto`) increases `batch_size` automatically until an out-of-memory error occurs (automatic adjustment may occasionally fail, however).
- To use `CREPE` for f0 estimation, replace `svc pre-hubert` with `svc pre-hubert -fm crepe`.
- To use `ContentVec` correctly, replace `svc pre-config` with `svc pre-config -t so-vits-svc-4.0v1`. Training may take slightly longer because some weights are reset when the generator weights are reused.
- To use the `MS-iSTFT Decoder`, replace `svc pre-config` with `svc pre-config -t quickvc`.
- Silence removal and volume normalization are performed automatically (as in the original repository), so doing them manually is not required.
- If you have trained on a large, freely licensed public dataset, consider releasing the result as a base model.
- For more details (such as parameters), see the [Wiki](https://github.com/34j/so-vits-svc-fork/wiki) or [Discussions](https://github.com/34j/so-vits-svc-fork/discussions).
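The folder layout and the 10 s per-file limit described above can be checked before preprocessing. The following is a minimal sketch under stated assumptions, not part of the `svc` CLI: the names `wav_duration`, `find_long_wavs`, and `demo` are hypothetical, and because it uses only the standard-library `wave` module, it inspects only uncompressed PCM files with a lowercase `.wav` extension (other formats allowed by `{any_format}` are ignored).

```python
import tempfile
import wave
from pathlib import Path

MAX_SECONDS = 10.0  # per-file limit taken from the notes above


def wav_duration(path: Path) -> float:
    """Length of a PCM WAV file in seconds (frame count / sample rate)."""
    with wave.open(str(path), "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def find_long_wavs(root: Path, max_seconds: float = MAX_SECONDS) -> dict:
    """Map speaker_id -> [(relative_path, seconds)] for files over the limit.

    Assumes the layout root/{speaker_id}/**/{name}.wav.
    """
    long_files = {}
    for speaker_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for wav_path in sorted(speaker_dir.rglob("*.wav")):
            seconds = wav_duration(wav_path)
            if seconds > max_seconds:
                rel = wav_path.relative_to(root).as_posix()
                long_files.setdefault(speaker_dir.name, []).append((rel, seconds))
    return long_files


def _write_silence(path: Path, seconds: int, rate: int = 8000) -> None:
    """Write a mono 16-bit silent WAV file (used only by the demo)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with wave.open(str(path), "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(rate)
        wf.writeframes(b"\x00\x00" * (rate * seconds))


def demo() -> dict:
    """Build a throwaway dataset_raw tree and report the over-long files."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "dataset_raw"
        _write_silence(root / "alice" / "short.wav", 1)
        _write_silence(root / "alice" / "album" / "long.wav", 11)
        _write_silence(root / "bob" / "ok.wav", 2)
        return find_long_wavs(root)


if __name__ == "__main__":
    print(demo())
```

On a real dataset, call `find_long_wavs(Path("dataset_raw"))` and split any reported files (for example with `svc pre-split`) before running the preprocessing commands.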
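### Help

For more commands, run `svc -h` or `svc <subcommand> -h`.

```shell
> svc -h
Usage: svc [OPTIONS] COMMAND [ARGS]...

  so-vits-svc allows any folder structure for training data.
  However, the following folder structure is recommended.
      Training: dataset_raw/{speaker_name}/**/{wav_name}.{any_format}
      Inference: configs/44k/config.json, logs/44k/G_XXXX.pth
  If the folder structure is followed, you do not need to specify the model path,
  config path, etc.; the latest model is loaded automatically.
  To train a model, run pre-resample, pre-config, pre-hubert, train.
  To infer a model, run infer.

Options:
  -h, --help  Show this message and exit.

Commands:
  clean          Clean up files, only useful if you are using the default file structure
  infer          Inference
  onnx           Export model to onnx
  pre-config     Preprocessing part 2: config
  pre-hubert     Preprocessing part 3: if the HuBERT model is not found, it will...
  pre-resample   Preprocessing part 1: resample
  pre-sd         Speech diarization using pyannote.audio
  pre-split      Split audio files into multiple files
  train          Train the model. If D_0.pth or G_0.pth is not found, automatically download from the hub.
  train-cluster  Train k-means clustering model
  vc             Real-time inference from microphone
```

#### Additional links

[Video tutorial](https://www.youtube.com/watch?v=tZn0lcGO5OQ)

## Contributors ✨

Thanks goes to these wonderful people ([emoji key](https://allcontributors.org/docs/en/emoji-key)):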

- 34j 💻 🤔 📖 💡 🚇 🚧 👀 ⚠️ ✅ 📣 🐛
- GarrettConway 💻 🐛 📖 👀
- BlueAmulet 🤔 💬 💻 🚧
- ThrowawayAccount01 🐛
- 緋 📖 🐛
- Lordmau5 🐛 💻 🤔 🚧 💬 📓
- DL909 🐛
- Satisfy256 🐛
- Pierluigi Zagaria 📓
- ruckusmattster 🐛
- Desuka-art 🐛
- heyfixit 📖
- Nerdy Rodent 📹
- 谢宇 📖
- ColdCawfee 🐛
- sbersier 🤔 📓 🐛
- Meldoner 🐛
- mmodeusher 🐛
- AlonDan 🐛
- Likkkez 🐛
- Duct Tape Games 🐛
- Xianglong He 🐛
- 75aosu 🐛
- tonyco82 🐛
- yxlllc 🤔 💻
- outhipped 🐛
- escoolioinglesias 🐛 📓 📹
- Blacksingh 🐛
- Mgs. M. Thoyib Antarnusa 🐛
- Exosfeer 🐛 💻
- guranon 🐛 🤔 💻
- Alexander Koumis 💻

This project follows the [all-contributors](https://github.com/all-contributors/all-contributors) specification. Contributions of any kind welcome!