# kokoro-test

**Repository Path**: hiaspx/kokoro-test

## Basic Information

- **Project Name**: kokoro-test
- **Description**: No description available
- **Primary Language**: Python
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-01-16
- **Last Updated**: 2026-01-16

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

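# Kokoro-Air TTS Service

A text-to-speech (TTS) API service based on the Kokoro-82M model, built with FastAPI.

## Features

- 🚀 High-performance REST API built on FastAPI
- 🎯 Multiple voice packs supported
- ⚡ GPU acceleration (CUDA)
- 📦 ONNX inference (optional)
- 🔧 Configurable speech speed and audio parameters
- 📝 Full logging

## Quick Start

### 0. Install system dependencies (first-time setup)

If `python3-pip` or `python3-venv` is missing from your system, install them first: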

```bash
# Ubuntu/Debian systems
sudo apt update
sudo apt install -y python3-pip python3-venv
```

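### 1. Install Python dependencies

**Tip**: The startup scripts already default to the Tsinghua University mirror to speed up downloads. To configure a global pip mirror, run: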
```bash
./setup_pip_mirror.sh
```

#### Option 1: Use a virtual environment (recommended)

```bash
# Create a virtual environment
python3 -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies (using a mainland China mirror)
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
```

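#### Option 2: Use the system Python (if a virtual environment cannot be created)

```bash
# Install directly into the user directory (using a mainland China mirror)
python3 -m pip install --user -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
```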

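#### Option 3: Configure a global pip mirror (recommended)

```bash
# Run the configuration script
./setup_pip_mirror.sh

# Or configure manually
mkdir -p ~/.pip
cat > ~/.pip/pip.conf << EOF
[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple
trusted-host = pypi.tuna.tsinghua.edu.cn
EOF
```

Once configured, every pip command uses the mirror automatically, so you no longer need to pass `-i` each time.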

### 2. Configure the environment

Copy `.env.example` to `.env` and adjust the settings:

```bash
cp .env.example .env
```

Edit the `.env` file to set the model path and other options.

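### 3. Download the model

#### Option 1: Automatic download (recommended)

The model is downloaded automatically on first startup. If the automatic download fails, use the download script:

```bash
# Use the download script (supports mirror sites)
python3 download_model.py
```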

#### Option 2: Manual download

If your network connection is unreliable, download the model manually:

```bash
# Using huggingface_hub
python -c "from huggingface_hub import snapshot_download; snapshot_download(repo_id='hexgrad/Kokoro-82M', local_dir='models/kokoro')"

# Or using git lfs
git lfs install
git clone https://huggingface.co/hexgrad/Kokoro-82M models/kokoro
```

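#### Option 3: Use a mirror site

If you are in mainland China, you can use a Hugging Face mirror:

```bash
# Set the mirror environment variable
export HF_ENDPOINT=https://hf-mirror.com

# Then run the download script
python3 download_model.py
```

**Note**: If the model download fails, the service still starts, but TTS is unavailable. Make sure the model files are in the `models/kokoro` directory.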
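
Because the service starts even without a model, it can help to verify the directory before launching. The following is a minimal sketch, assuming the model counts as present when `models/kokoro` exists and holds at least one file. The helper name `model_dir_ready` is hypothetical and not part of this repository.

```python
from pathlib import Path


def model_dir_ready(model_dir: str = "models/kokoro") -> bool:
    """Return True if model_dir exists and contains at least one file.

    Hypothetical helper: it only checks that files are present, not that
    they form a valid or complete Kokoro-82M download.
    """
    path = Path(model_dir)
    if not path.is_dir():
        return False
    # rglob("*") walks subdirectories too; any regular file counts.
    return any(entry.is_file() for entry in path.rglob("*"))


if __name__ == "__main__":
    if model_dir_ready():
        print("Model files found.")
    else:
        print("Model files missing; run download_model.py first.")
```

An empty directory, or one that contains only empty subdirectories, is reported as not ready.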

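### 4. Start the service

#### Option 1: Use the startup script (creates the virtual environment automatically)

```bash
chmod +x start.sh
./start.sh
```

#### Option 2: Use the simplified startup script (system Python, no virtual environment required)

```bash
chmod +x start_simple.sh
./start_simple.sh
```

#### Option 3: Run directly

```bash
# If you are using the virtual environment
source venv/bin/activate
python main.py

# Or with the system Python
python3 main.py
```

**Note**: An error such as "No module named 'uvicorn'" means the dependencies are not installed. Complete step 1 first.

The service listens on `http://0.0.0.0:8000`, that is, on all network interfaces. From the same machine, reach it at `http://localhost:8000`.

### 5. Open the API documentation

Once the service is running, visit:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc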

## API Usage Examples

### 1. Health check

```bash
curl http://localhost:8000/health
```
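
A script that launches the service usually has to wait for it to come up before sending requests. The sketch below polls the health endpoint using only the standard library. It assumes `/health` answers with HTTP 200 once the service is ready (the response body is not documented here), and the names `http_probe` and `wait_until_healthy` are hypothetical.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str = "http://localhost:8000/health") -> bool:
    """Return True if a GET request to url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_healthy(
    probe: Callable[[], bool] = http_probe,
    attempts: int = 30,
    delay: float = 1.0,
) -> bool:
    """Call probe up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Passing the probe as a parameter keeps the retry loop independent of the network, so it can be exercised with a stub.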

### 2. List available voices

```bash
curl http://localhost:8000/voices
```

### 3. Text-to-speech (returns an audio file)

```bash
curl -X POST "http://localhost:8000/synthesize" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, this is a Kokoro TTS test",
    "voice": "default",
    "speed": 1.0
  }' \
  --output output.wav
```

### 4. Text-to-speech (returns base64)

```bash
curl -X POST "http://localhost:8000/synthesize_base64" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, this is a Kokoro TTS test",
    "voice": "default",
    "speed": 1.0
  }'
```
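
The response format of `/synthesize_base64` is not shown here, so the following is only a sketch of how a client might decode it. It assumes the endpoint returns a JSON object that carries the audio in a field named `audio_base64`. That field name is a guess, so check the Swagger UI for the actual schema. `extract_audio` is a hypothetical helper.

```python
import base64
import json


def extract_audio(response_body: str, field: str = "audio_base64") -> bytes:
    """Decode the base64 audio carried in a JSON response body.

    `field` is an assumed key name; check the /docs page for the real one.
    """
    payload = json.loads(response_body)
    if field not in payload:
        raise KeyError(f"field {field!r} not in response; keys: {sorted(payload)}")
    # validate=True rejects characters outside the base64 alphabet.
    return base64.b64decode(payload[field], validate=True)
```

The returned bytes can then be written to a file opened in binary mode, as in the Python client example.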

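### Python client example

```python
import requests

# Synthesize speech
response = requests.post(
    "http://localhost:8000/synthesize",
    json={
        "text": "Hello, this is a Kokoro TTS test",
        "voice": "default",
        "speed": 1.0
    },
    timeout=60,
)
# Fail on an HTTP error instead of writing an error body to the .wav file
response.raise_for_status()

# Save the audio file
with open("output.wav", "wb") as f:
    f.write(response.content)
```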

## Configuration

### Environment variables

| Variable | Description | Default |
|----------|-------------|---------|
| `HOST` | Server host | `0.0.0.0` |
| `PORT` | Server port | `8000` |
| `DEBUG` | Debug mode | `false` |
| `MODEL_NAME` | Hugging Face model name | `hexgrad/Kokoro-82M` |
| `MODEL_PATH` | Local model path | `models/kokoro` |
| `DEVICE` | Compute device (`auto`/`cpu`/`cuda`) | `auto` |
| `SAMPLE_RATE` | Audio sample rate (Hz) | `24000` |
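
To illustrate how these variables and their defaults fit together, here is a minimal sketch of a settings loader. It is not the project's `config.py`. The `Settings` class and `load_settings` function are hypothetical, and the accepted spellings of a true `DEBUG` value are an assumption. The sketch reads the process environment only and does not parse the `.env` file itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    debug: bool
    model_name: str
    model_path: str
    device: str
    sample_rate: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from env, falling back to the defaults in the table."""
    device = env.get("DEVICE", "auto").lower()
    if device not in {"auto", "cpu", "cuda"}:
        raise ValueError(f"DEVICE must be auto, cpu, or cuda, got {device!r}")
    return Settings(
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "8000")),
        # Assumption: "true", "1", or "yes" (any case) enables debug mode.
        debug=env.get("DEBUG", "false").lower() in {"true", "1", "yes"},
        model_name=env.get("MODEL_NAME", "hexgrad/Kokoro-82M"),
        model_path=env.get("MODEL_PATH", "models/kokoro"),
        device=device,
        sample_rate=int(env.get("SAMPLE_RATE", "24000")),
    )
```

Calling `load_settings({})` yields the defaults from the table, which makes it easy to see what an unset variable resolves to.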

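## Project Structure

```
kokoro-air-tts/
├── main.py              # FastAPI application entry point
├── model_loader.py      # Model loading and inference
├── config.py            # Configuration management
├── requirements.txt     # Python dependencies
├── start.sh             # Startup script (uses a virtual environment)
├── start_simple.sh      # Simplified startup script (system Python)
├── setup_pip_mirror.sh  # pip mirror configuration script
├── pip.conf             # Sample pip configuration file
├── test_api.py          # API test script
├── .env.example         # Sample environment variables
├── models/              # Model files
├── outputs/             # Generated audio
└── logs/                # Log files
```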

## Dependencies

Main dependencies:
- `torch`: PyTorch deep learning framework
- `transformers`: Hugging Face model library
- `fastapi`: web framework
- `uvicorn`: ASGI server
- `soundfile`: audio file I/O
- `onnxruntime`: ONNX inference (optional)

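## Performance Tuning

1. **GPU acceleration**: Make sure a CUDA-enabled build of PyTorch is installed.
2. **ONNX inference**: Using an ONNX model can improve inference speed.
3. **Batching**: For high request volumes, consider implementing batch processing.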
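
For the GPU point, the sketch below shows one way a `DEVICE` value of `auto`, `cpu`, or `cuda` could be turned into a concrete device. It is an illustration and not the logic in `model_loader.py`. The function names are hypothetical, and raising an error when `cuda` is requested but unavailable is a design assumption (silently falling back to CPU is the other common choice).

```python
def resolve_device(requested: str = "auto", cuda_available: bool = False) -> str:
    """Map a DEVICE setting to a concrete device name ("cpu" or "cuda")."""
    requested = requested.lower()
    if requested == "auto":
        return "cuda" if cuda_available else "cpu"
    if requested == "cuda" and not cuda_available:
        raise RuntimeError("DEVICE=cuda was requested but CUDA is not available")
    if requested not in {"cpu", "cuda"}:
        raise ValueError(f"unknown device: {requested!r}")
    return requested


def detect_cuda() -> bool:
    """Return torch.cuda.is_available(), or False if torch is not installed."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()
```

A typical call would be `resolve_device("auto", detect_cuda())`. Keeping the CUDA check separate lets the mapping be tested on machines without PyTorch.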

## Troubleshooting

### Virtual environment creation fails

**Error**: `The virtual environment was not created successfully because ensurepip is not available`

**Solution**:
```bash
# Install python3-venv
sudo apt install -y python3-venv

# Or use the simplified startup script (no virtual environment required)
./start_simple.sh
```

### pip not found

**Error**: `pip: command not found` or `No module named pip`

**Solution**:
```bash
# Install pip
sudo apt install -y python3-pip

# Or install the dependencies in user mode
python3 -m pip install --user -r requirements.txt
```

### Module not found

**Error**: `ModuleNotFoundError: No module named 'uvicorn'`

**Solution**:
```bash
# Make sure the dependencies are installed (using a mainland China mirror for speed)
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

# Or install in user mode
python3 -m pip install --user -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
```

### Slow pip downloads

**Solution**: Use a mirror in mainland China.

1. **Temporary** (single install):
   ```bash
   pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
   ```

2. **Permanent** (recommended):
   ```bash
   # Use the configuration script
   ./setup_pip_mirror.sh

   # Or configure manually
   mkdir -p ~/.pip
   cp pip.conf ~/.pip/pip.conf
   ```

3. **Common mirrors**:
   - Tsinghua University: `https://pypi.tuna.tsinghua.edu.cn/simple`
   - Alibaba Cloud: `https://mirrors.aliyun.com/pypi/simple/`
   - USTC: `https://pypi.mirrors.ustc.edu.cn/simple/`
   - Douban: `https://pypi.douban.com/simple/`

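### Model fails to load

- Check that the model path is correct
- Make sure there is enough disk space
- Check the network connection (if downloading from Hugging Face)

### GPU unavailable

- Check that CUDA is installed correctly
- Verify that PyTorch supports CUDA: `python -c "import torch; print(torch.cuda.is_available())"`

### Out of memory

- Use CPU mode: set `DEVICE=cpu`
- Use a quantized ONNX model
- Lower the `max_text_length` limit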
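
If a lower `max_text_length` means longer inputs no longer fit in one request, a client can split the text before sending it. The sketch below is one way to do that and is not part of this repository. `split_text` is a hypothetical helper, the default of 200 characters is arbitrary, and how the service enforces its limit is not described here.

```python
import re
from typing import List


def split_text(text: str, max_length: int = 200) -> List[str]:
    """Split text into chunks of at most max_length characters.

    Chunks break after sentence-ending punctuation where possible; a single
    sentence longer than max_length is cut at the hard limit.
    """
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    # Split after ASCII punctuation followed by whitespace, or directly
    # after CJK sentence-ending punctuation.
    pattern = r"(?<=[.!?])\s+|(?<=[。！？])"
    sentences = [s for s in re.split(pattern, text.strip()) if s]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # Hard-cut any sentence that cannot fit in one chunk.
        while len(sentence) > max_length:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_length])
            sentence = sentence[max_length:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_length:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can be sent to `/synthesize` separately and the resulting audio concatenated on the client side. Sentences packed into the same chunk are joined with a single space.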

## License

This project is licensed under Apache 2.0. The Kokoro-82M model is also licensed under Apache 2.0.

## Contributing

Issues and pull requests are welcome!

## References

- [Kokoro-82M on Hugging Face](https://huggingface.co/hexgrad/Kokoro-82M)
- [FastAPI documentation](https://fastapi.tiangolo.com/)
- [PyTorch documentation](https://pytorch.org/docs/)