# local-multilingual-voice-translator

**Repository Path**: mirrors_lepy/local-multilingual-voice-translator

## Basic Information

- **Project Name**: local-multilingual-voice-translator
- **Description**: This project is a real-time, multilingual voice translator that leverages the power of local AI models for speech-to-text, translation, and text-to-speech. It is designed to be a powerful and flexible tool for anyone who needs to communicate across language barriers.
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-10-21
- **Last Updated**: 2025-11-02

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Real-Time Multilingual Voice Translator

This project is a real-time, multilingual voice translator that leverages the power of local AI models for speech-to-text, translation, and text-to-speech. It is designed to be a powerful and flexible tool for anyone who needs to communicate across language barriers.

## Demo

https://www.youtube.com/watch?v=2XjPpDcFOgQ

## Key Features

-   **Real-Time Translation**: Speak in any supported language and hear translations with natural voice synthesis.
-   **Multilingual Support**: Supports over 23 languages for both translation and voice synthesis.
-   **Local AI-Powered**: Utilizes local models, ensuring privacy and offline functionality.
-   **High-Quality Voice Synthesis**: Powered by Chatterbox TTS for natural-sounding voice output.
-   **Accurate Speech-to-Text**: Integrates with Distil-Whisper FastRTC for precise transcriptions.
-   **Web Interface**: User-friendly web interface built with Gradio and FastAPI.

## Supported Languages

| Code | Language   |
|:-----|:-----------|
| ar   | Arabic     |
| da   | Danish     |
| de   | German     |
| el   | Greek      |
| en   | English    |
| es   | Spanish    |
| fi   | Finnish    |
| fr   | French     |
| he   | Hebrew     |
| hi   | Hindi      |
| it   | Italian    |
| ja   | Japanese   |
| ko   | Korean     |
| ms   | Malay      |
| nl   | Dutch      |
| no   | Norwegian  |
| pl   | Polish     |
| pt   | Portuguese |
| ru   | Russian    |
| sv   | Swedish    |
| sw   | Swahili    |
| tr   | Turkish    |
| zh   | Chinese    |

## Installation Guide

Follow these steps precisely to ensure all dependencies are installed in the correct order.

1.  **Clone the Repository**
    ```bash
    git clone https://github.com/dwain-barnes/multilingual-voice-translator-realtime.git
    cd multilingual-voice-translator-realtime
    ```

2.  **Create and Activate a Conda Environment**
    
    We recommend using Anaconda to manage the environment, as shown in the setup video.
    ```bash
    # Create a new environment named 'translator' with Python 3.11
    conda create -n translator python=3.11 -y

    # Activate the new environment
    conda activate translator
    ```

3.  **Install Dependencies in Order**

    Run each of the following commands one by one. This specific order is crucial for the application to work correctly.
    ```bash
    # 1. Install core numerical and scientific libraries
    pip install numpy scipy

    # 2. Install the specific PyTorch version required by Chatterbox (with CUDA 11.8)
    # If you don't have an NVIDIA GPU, you can try removing "+cu118" and the --index-url
    pip install torch==2.0.0+cu118 torchaudio==2.0.0+cu118 --index-url https://download.pytorch.org/whl/cu118

    # 3. Install other requirements for TTS and audio processing
    pip install librosa transformers diffusers safetensors requests httpx pyaudio

    # 4. Install Chatterbox TTS
    pip install chatterbox-tts

    # 5. Install dependencies for the web server, real-time communication, and STT
    pip install python-dotenv
    pip install "fastrtc[vad,stt,tts]"
    pip install distil-whisper-fastrtc
    pip install openai
    pip install uvicorn
    ```

4.  **Set Up Local AI Models**
    -   **LM Studio**: Download and install LM Studio from [https://lmstudio.ai/](https://lmstudio.ai/).
        -   After installing, search for and download the **`Hunyuan-MT-7B-GGUF`** model from the in-app model browser.
        -   Once downloaded, navigate to the local server tab (`<->`) and load the model.
        -   Start the server.
    -   **Chatterbox & Whisper Models**: The required models for text-to-speech and speech-to-text will be downloaded automatically the first time you run the application.

5.  **Create Environment File**

    Create a file named `.env` in the root directory (`local-multilingual-voice-translator`) and add the following content:
    ```    LM_STUDIO_BASE_URL=http://localhost:1234/v1
    LM_STUDIO_API_KEY=lm-studio
    WHISPER_MODEL=distil-whisper/distil-large-v3
    ```

## How to Run

1.  Ensure your **LM Studio server is running** with the **`Hunyuan-MT-7B-GGUF`** model loaded.
2.  Make sure your `translator` conda environment is active in your terminal.
3.  Run the application from your terminal:
    ```bash
    python translator.py
    ```
4.  Open your web browser and navigate to **`http://127.0.0.1:7860`**.
5.  Select your source and target languages, and start translating!

## Contributing

Contributions are welcome! Please feel free to submit a pull request or open an issue if you have any feedback or suggestions.

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for more details.