# CosyVoice

**Repository Path**: Zdevote/cosy-voice

## Basic Information

- **Project Name**: CosyVoice
- **Description**: Voice cloning and synthesis based on CosyVoice
- **Primary Language**: JavaScript
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-04-30
- **Last Updated**: 2026-04-30

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# CosyVoice Demo

This is a small, no-dependency demo for Alibaba Cloud Bailian CosyVoice voice cloning and voice design.

It includes:

- Voice cloning form with consent confirmation and audio URL validation.
- Voice design form that returns `voice_id` and preview audio when the API provides it.
- Query, list, and delete helpers for custom voices.
- HTTP speech synthesis that returns a browser-playable audio URL.

## Run

```bash
cp .env.example .env
# edit .env and set DASHSCOPE_API_KEY
npm run dev
```

Open:

```text
http://localhost:5177
```

## Notes

- The demo keeps `DASHSCOPE_API_KEY` on the local server and never sends it to the browser.
- Sample audio should be a clear human voice, normally 10-20 seconds and at most 60 seconds long, in WAV, MP3, or M4A format, no larger than 10 MB, with a sample rate of at least 16 kHz.
- `cosyvoice-v3.5-plus` and `cosyvoice-v3.5-flash` are limited to the China mainland Beijing region according to the current Alibaba Cloud documentation.
- CosyVoice speech synthesis can use the non-realtime HTTP API for a simple demo. The optional WebSocket path remains available by sending `transport: "websocket"`, but the UI defaults to HTTP because it is easier to debug for single-click preview.
- Add a product-level authorization flow before any production use. Voice cloning must only be used with explicit permission from the voice owner.
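
The sample-audio limits in the notes above can be checked before a clone request is sent. The following is a minimal sketch, not the demo's actual validation code: the function name `checkSampleAudio` and the shape of the `meta` object are hypothetical, and it assumes the caller has already measured file size, duration, and sample rate. It also assumes "10 MB" means 10 × 1024 × 1024 bytes.

```javascript
// Hard limits taken from the notes above. The 10-20 second range is only a
// recommendation, so it is not enforced here.
const ALLOWED_EXTENSIONS = ["wav", "mp3", "m4a"];
const MAX_BYTES = 10 * 1024 * 1024; // assumption: "10 MB" read as 10 MiB
const MAX_SECONDS = 60;
const MIN_SAMPLE_RATE_HZ = 16000;

// Hypothetical helper: returns a list of human-readable problems.
// An empty list means the metadata passes every hard limit.
function checkSampleAudio(meta) {
  const problems = [];

  // Take the text after the last dot as the extension, case-insensitively.
  const ext = String(meta.fileName || "").split(".").pop().toLowerCase();
  if (!ALLOWED_EXTENSIONS.includes(ext)) {
    problems.push("format must be WAV, MP3, or M4A");
  }
  // The negated comparisons also reject missing or non-numeric values.
  if (!(meta.sizeBytes > 0 && meta.sizeBytes <= MAX_BYTES)) {
    problems.push("file must be no larger than 10 MB");
  }
  if (!(meta.durationSeconds > 0 && meta.durationSeconds <= MAX_SECONDS)) {
    problems.push("audio must be at most 60 seconds long");
  }
  if (!(meta.sampleRateHz >= MIN_SAMPLE_RATE_HZ)) {
    problems.push("sample rate must be at least 16 kHz");
  }
  return problems;
}
```

A caller could show the returned messages next to the cloning form and block submission while the list is non-empty. Extension checks say nothing about the actual encoding, so the upstream API may still reject a file that passes here.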
## API Surface

```text
GET    /api/health
POST   /api/cosyvoice/clone
POST   /api/cosyvoice/design
GET    /api/cosyvoice/:voiceId
GET    /api/cosyvoice?prefix=demo
DELETE /api/cosyvoice/:voiceId
POST   /api/cosyvoice/synthesize-preview
```

The REST proxy uses the official voice customization endpoint:

```text
POST /api/v1/services/audio/tts/customization
```

The upstream request body uses `model: "voice-enrollment"` and passes the selected CosyVoice synthesis model as `input.target_model`.

Do not send `X-DashScope-Async: enable` to this endpoint. Some accounts and customization APIs reject asynchronous calls with `current user api does not support asynchronous calls`.

The synthesis proxy uses:

```text
POST /api/v1/services/audio/tts/SpeechSynthesizer
```

If the API returns `[cosyvoice]Engine return error code: 418`, first verify that the selected `model` exactly matches the `target_model` used when the custom `voice_id` was created. Custom voice IDs commonly start with that model name, for example `cosyvoice-v3-plus-...`.
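
The request shape and the 418 check described above can be sketched as follows. This is an illustration under stated assumptions, not the proxy's actual code: the helper names `buildEnrollmentRequest` and `voiceIdMatchesModel` are hypothetical, the `Authorization: Bearer` header is assumed, and any `input` fields other than `target_model` are left to the caller through a hypothetical `extraInput` parameter because this README does not list them.

```javascript
const CUSTOMIZATION_PATH = "/api/v1/services/audio/tts/customization";

// Hypothetical helper: assembles the upstream customization request.
// Only `model` and `input.target_model` are confirmed by the text above.
function buildEnrollmentRequest({ apiKey, targetModel, extraInput = {} }) {
  return {
    method: "POST",
    path: CUSTOMIZATION_PATH,
    headers: {
      Authorization: `Bearer ${apiKey}`, // assumption: bearer-token auth
      "Content-Type": "application/json",
      // Deliberately no "X-DashScope-Async" header: this endpoint may
      // reject asynchronous calls.
    },
    body: JSON.stringify({
      model: "voice-enrollment",
      // target_model is written last so extraInput cannot override it.
      input: { ...extraInput, target_model: targetModel },
    }),
  };
}

// Hypothetical pre-flight hint for error 418. Voice IDs only *commonly*
// start with the model name, so a failed match is a warning, not proof,
// and a match cannot tell apart models whose names share a prefix.
function voiceIdMatchesModel(voiceId, model) {
  return typeof voiceId === "string" && voiceId.startsWith(`${model}-`);
}
```

Calling `voiceIdMatchesModel(voiceId, model)` before a synthesis request gives a cheap early warning when a voice created for one model is about to be used with another. The authoritative check remains comparing `model` against the `target_model` recorded when the voice was created.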