# funAsr_demo

**Repository Path**: yaohailu_admin/fun-asr_demo

## Basic Information

- **Project Name**: funAsr_demo
- **Description**: A demo project showing how to connect to the FunASR model over WebSocket (ws)
- **Primary Language**: Java
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2025-05-29
- **Last Updated**: 2025-05-29

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# FunASR Deployment Notes
## I. Running with Docker
### 1. Pull the image
```shell
docker pull registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.4.6
```

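### 2. Start and enter the container
+ On Linux and other systems, replace the host side of the volume mount with the matching local path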
```shell
docker run -p 10095:10095 -it --privileged=true -v D:/big_model/funAsrOffline:/workspace/models registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.4.6
```

### 3. Start the server

+ Start
```shell
cd FunASR/runtime
nohup bash run_server.sh \
  --download-model-dir /workspace/models \
  --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
  --model-dir damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx  \
  --punc-dir damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx \
  --lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst \
  --itn-dir thuduj12/fst_itn_zh \
  --certfile 0 \
  --hotword /workspace/models/hotwords.txt > log.txt 2>&1 &
```

### 4. View the logs
```shell
tail -f /workspace/FunASR/runtime/log.txt
```

## II. Socket protocol
+ 2.1 Initial message example
  + In my own testing, `hotwords` had to be sent as a string, not as an object
  + Parameter descriptions
```
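`mode`: `offline`, meaning the inference mode is offline file transcription
`wav_name`: name of the audio file to transcribe
`wav_format`: extension of the audio/video file; options include pcm, mp3, mp4, etc.
`is_speaking`: False marks the end of a segment, for example a VAD cut point or the end of a wav
`audio_fs`: audio sample rate; required when the input audio is pcm data
`hotwords`: if hotwords are used, send the hotword data to the server as a string, in the format "{"阿里巴巴":20,"通义实验室":30}"
`itn`: whether to use ITN; defaults to True
`svs_lang`: language for the SenseVoiceSmall model; defaults to "auto"
`svs_itn`: whether punctuation and ITN are enabled for the SenseVoiceSmall model; defaults to True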
```
```json
{"audio_fs":16000,"chunk_interval":10,"chunk_size":[5,10,5],"hotwords":"{\"阿里巴巴\":20,\"通义实验室\":30}","is_speaking":true,"itn":false,"mode":"offline","wav_format":"pcm","wav_name":"vad_example.wav"}

```
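The note above about `hotwords` being a string means the hotword map is serialized to JSON first and then embedded, with its quotes escaped, as a string value in the outer message. The following is a minimal sketch of that double encoding using only the JDK. The class name `InitMessage`, its method names, and the ASCII hotwords are hypothetical placeholders for illustration; they are not taken from the repository.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class InitMessage {
    // Escape backslashes and double quotes so the value can sit inside a JSON string literal.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Serialize the hotword map to a JSON object such as {"Alibaba":20}.
    static String hotwordsJson(Map<String, Integer> hotwords) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Integer> e : hotwords.entrySet()) {
            if (!first) sb.append(',');
            sb.append('"').append(escape(e.getKey())).append("\":").append(e.getValue());
            first = false;
        }
        return sb.append('}').toString();
    }

    // Build the first message of an offline session; hotwords go in as an escaped string.
    static String build(String wavName, String wavFormat, int audioFs, Map<String, Integer> hotwords) {
        return "{\"mode\":\"offline\""
            + ",\"wav_name\":\"" + escape(wavName) + "\""
            + ",\"wav_format\":\"" + escape(wavFormat) + "\""
            + ",\"audio_fs\":" + audioFs
            + ",\"is_speaking\":true"
            + ",\"itn\":false"
            + ",\"hotwords\":\"" + escape(hotwordsJson(hotwords)) + "\"}";
    }

    public static void main(String[] args) {
        Map<String, Integer> hotwords = new LinkedHashMap<>();
        hotwords.put("Alibaba", 20);      // placeholder hotword and weight
        hotwords.put("Tongyi Lab", 30);   // placeholder hotword and weight
        System.out.println(build("vad_example.wav", "pcm", 16000, hotwords));
    }
}
```

In a real project a JSON library would normally handle the escaping; the manual version here only makes the nesting visible.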

+ 2.2 Send the binary audio file (in chunks)
+ 2.3 Send the end-of-audio marker
```json
  {"is_speaking": false}
```
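Steps 2.1 to 2.3 can be sketched with the JDK's built-in `java.net.http.WebSocket` client (Java 11+): send the initial JSON as text, send the audio as binary chunks, then send the end marker and wait for the single result. This is a sketch under assumptions, not the repository's implementation: the class name `OfflineSession`, the chunk size, and the 60-second wait are hypothetical, and handshake details such as subprotocols or TLS are not covered.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.WebSocket;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class OfflineSession {
    // Split the audio bytes into fixed-size views; the last chunk may be shorter.
    static List<ByteBuffer> chunks(byte[] audio, int chunkBytes) {
        if (chunkBytes <= 0) throw new IllegalArgumentException("chunkBytes must be positive");
        List<ByteBuffer> out = new ArrayList<>();
        for (int off = 0; off < audio.length; off += chunkBytes) {
            int len = Math.min(chunkBytes, audio.length - off);
            out.add(ByteBuffer.wrap(audio, off, len));
        }
        return out;
    }

    // Run one offline transcription: init message, audio chunks, end marker, then one reply.
    static void transcribe(URI server, String initJson, byte[] audio, int chunkBytes) throws Exception {
        CountDownLatch done = new CountDownLatch(1);
        WebSocket.Listener listener = new WebSocket.Listener() {
            private final StringBuilder buf = new StringBuilder();

            @Override
            public CompletionStage<?> onText(WebSocket ws, CharSequence data, boolean last) {
                buf.append(data);            // a message may arrive in several parts
                if (last) {
                    System.out.println(buf); // the recognition result as JSON text
                    buf.setLength(0);
                    done.countDown();
                }
                ws.request(1);               // ask for the next message
                return null;
            }
        };
        WebSocket ws = HttpClient.newHttpClient().newWebSocketBuilder()
                .buildAsync(server, listener).join();
        // Each send is joined before the next one so only one send is pending at a time.
        ws.sendText(initJson, true).join();
        for (ByteBuffer chunk : chunks(audio, chunkBytes)) {
            ws.sendBinary(chunk, true).join();
        }
        ws.sendText("{\"is_speaking\": false}", true).join();
        done.await(60, TimeUnit.SECONDS);    // hypothetical upper bound on the wait
        ws.sendClose(WebSocket.NORMAL_CLOSURE, "done").join();
    }
}
```

A call might look like `OfflineSession.transcribe(URI.create("ws://localhost:10095"), initJson, pcmBytes, 1920)`, where the address assumes the port mapping from the Docker step and the chunk size is an arbitrary illustrative value.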
+ 2.4 Response parameters
  + Parameter descriptions
```
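`mode`: `offline`, meaning the inference mode is offline file transcription
`wav_name`: name of the audio file being transcribed
`text`: the recognized text output by speech recognition
`is_final`: indicates that recognition has finished; in offline mode this field is always False, and the server returns the recognition result only once over the websocket
`timestamp`: returned only if the AM is a timestamp model; the timestamps, in the format "[[100,200], [200,500]]" (ms)
`stamp_sents`: returned only if the AM is a timestamp model; sentence-level timestamps, in the format [{"text_seg":"正 是 因 为","punc":",","start":430,"end":1130,"ts_list":[[430,670],[670,810],[810,1030],[1030,1130]]}]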
```

```json
{"mode": "offline", "wav_name": "wav_name", "text": "asr outputs", "is_final": false, "timestamp": "[[100,200], [200,500]]", "stamp_sents": []}

```
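Because `timestamp` arrives as a string rather than a nested JSON array, it needs a second parsing step after the response itself is decoded. The sketch below extracts the `[start, end]` pairs with a regular expression, assuming the string follows the "[[100,200], [200,500]]" format described above. The class name `Timestamps` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class Timestamps {
    // Matches one "[start,end]" pair of non-negative integers, allowing spaces.
    private static final Pattern PAIR =
            Pattern.compile("\\[\\s*(\\d+)\\s*,\\s*(\\d+)\\s*\\]");

    // Parse the timestamp string into {startMs, endMs} pairs, in order of appearance.
    static List<int[]> parse(String timestamp) {
        List<int[]> out = new ArrayList<>();
        Matcher m = PAIR.matcher(timestamp);
        while (m.find()) {
            out.add(new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) });
        }
        return out;
    }

    public static void main(String[] args) {
        for (int[] span : parse("[[100,200], [200,500]]")) {
            System.out.println(span[0] + " ms to " + span[1] + " ms");
        }
    }
}
```

If the project already depends on a JSON library, parsing the string value a second time with that library would be an equally reasonable choice.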