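This sample is provided as a reference for learning the Ascend software stack and is not intended for commercial use.
Function: converts speech (Chinese only) to text using a Wenet model.
Sample input: an audio file (must be in WAV format).
Sample output: a Chinese text string.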
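Because the sample only accepts WAV input, it can help to inspect a file before running inference. The following is a minimal sketch using Python's standard `wave` module; the helper name `describe_wav` is hypothetical and not part of this project, and the sketch makes no claim about which sample rate or channel count the model expects.

```python
import wave


def describe_wav(path):
    """Return basic header fields of a WAV file.

    wave.open raises wave.Error if the file is not a readable WAV file.
    """
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_width_bytes": wav_file.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }
```

Calling `describe_wav("sample.wav")` on a valid file returns a small dictionary; a file in another format (for example MP3) raises `wave.Error` instead, which is a quick signal that it needs converting first.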
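This project was tested on the Atlas 200I DK A2 developer board. Purchase an Atlas 200I DK A2 yourself and, following the official documentation, use the Ascend base image to create the boot SD card with the one-click card-making procedure. All of the project's dependencies are included in the base environment of the base image, which is activated automatically when you log in to the DK over SSH.
Obtain the source package.
You can download the source in either of the following two ways; choose one of them.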
# In the development environment, run the following commands as the root user to download the source repository.
cd ${HOME}
git clone https://gitee.com/ascend/ascend_community_projects.git
# Switch to the 310B branch
cd ascend_community_projects
git checkout 310B
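# 1. In the upper-left corner of the ascend_community_projects repository, switch to the 310B branch, then click [Clone/Download] in the upper-right corner and select [Download ZIP].
# 2. Upload the ZIP package to any directory in the development environment.
# 3. In the development environment, run the following commands to extract the ZIP package.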
cd ${HOME}
# If the ZIP file you downloaded has a different name, use that name here.
unzip ascend-samples-master.zip
Obtain the original network model required by this application and convert it.
cd ${HOME}/ascend_community_projects/SpeechRecognition
wget https://ascend-repo.obs.cn-east-2.myhuaweicloud.com/Atlas%20200I%20DK%20A2/DevKit/samples/23.0.RC1/base-samples/notebook-demo-datasets/10-speech-recognition/offline_encoder_sim.onnx
atc --model=offline_encoder_sim.onnx --framework=5 --output=offline_encoder --input_format=ND --input_shape="speech:1,1478,80;speech_lengths:1" --log=error --soc_version=Ascend310B1
# The following warnings appear after conversion; they do not affect the model's inference results.
# W11001: Op [/encoder/encoders.0/self_attn/Cast] does not hit the high-priority operator information library, which might result in compromised performance.
# W11001: Op [/Cast] does not hit the high-priority operator information library, which might result in compromised performance.
# W11001: Op [PartitionedCall_/ReduceSum_ReduceSum_670] does not hit the high-priority operator information library, which might result in compromised performance.
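The conversion parameters have the following meanings:
--model: path to the original model file to convert.
--framework: framework of the original model; 5 means ONNX.
--output: path and name of the converted offline model; the .om extension is appended automatically.
--input_format: format of the input data; ND means a tensor of any dimensionality.
--input_shape: shapes of the model inputs, written as name:dims and separated by semicolons.
--log: log level for the conversion; set to error here.
--soc_version: version of the Ascend AI processor the model will run on.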
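The `--input_shape` value in the command above packs two named inputs into one string. As a rough illustration of how that string breaks down, here is a small sketch that parses it into a dictionary; `parse_input_shape` is a hypothetical helper written for this explanation and is not part of ATC or of this project.

```python
def parse_input_shape(spec):
    """Parse an ATC --input_shape string into {input_name: [dim, ...]}."""
    shapes = {}
    for entry in spec.split(";"):
        # Split on the last colon so the dims always come after the input name.
        name, _, dims = entry.rpartition(":")
        shapes[name.strip()] = [int(d) for d in dims.split(",")]
    return shapes


shapes = parse_input_shape("speech:1,1478,80;speech_lengths:1")
print(shapes)  # {'speech': [1, 1478, 80], 'speech_lengths': [1]}
```

The model is therefore converted with a fixed (static) shape. The `speech` input is presumably a batch of 1 with 1478 feature frames of 80 values each, and `speech_lengths` a single length value; that reading is an assumption based on the input names and is not stated in the source.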
Obtain the configuration file and the sample audio file required to run the model.
cd ${HOME}/ascend_community_projects/SpeechRecognition
wget https://ascend-repo.obs.cn-east-2.myhuaweicloud.com/Atlas%20200I%20DK%20A2/DevKit/samples/23.0.RC1/base-samples/notebooks/10-speech-recognition/vocab.txt
wget https://ascend-repo.obs.cn-east-2.myhuaweicloud.com/Atlas%20200I%20DK%20A2/DevKit/samples/23.0.RC1/base-samples/notebooks/10-speech-recognition/sample.wav
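Before running the sample, it may be worth confirming that the converted model and the downloaded files are all in place. The sketch below checks for them with `pathlib`. The file names come from the steps above (`offline_encoder.om` from the `--output` option, plus `vocab.txt` and `sample.wav`), but the helper `missing_files` is hypothetical, and the paths `main.py` actually reads are an assumption.

```python
from pathlib import Path

# File names taken from the preceding steps; main.py may expect other paths.
REQUIRED_FILES = ("offline_encoder.om", "vocab.txt", "sample.wav")


def missing_files(directory, names=REQUIRED_FILES):
    """Return the entries of `names` that are not regular files in `directory`."""
    base = Path(directory)
    return [name for name in names if not (base / name).is_file()]
```

For example, `missing_files(Path.home() / "ascend_community_projects" / "SpeechRecognition")` returns an empty list when everything is present, and otherwise lists what still needs to be downloaded or converted.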
cd ${HOME}/ascend_community_projects/SpeechRecognition
python main.py
After the run completes, the speech recognition result is printed in the command-line output, as shown below:
[INFO] acl init success
[INFO] open device 0 success
[INFO] load model offline_encoder.om success
[INFO] create model description success
2023-06-13 01:52:22.283867: E external/org_tensorflow/tensorflow/core/framework/node_def_util.cc:675] NodeDef mentions attribute input_para_type_list which is not in the op definition: Op<name=Sum; signature=input:T, reduction_indices:Tidx -> output:T; attr=keep_dims:bool,default=false; attr=T:type,allowed=[DT_FLOAT, DT_DOUBLE, DT_INT32, DT_UINT8, DT_INT16, 6034766930529145842, DT_UINT16, DT_COMPLEX128, DT_HALF, DT_UINT32, DT_UINT64]; attr=Tidx:type,default=DT_INT32,allowed=[DT_INT32, DT_INT64]> This may be expected if your graph generating binary is newer than this binary. Unknown attributes will be ignored. NodeDef: {{node PartitionedCall_/ReduceSum_ReduceSum_670}}
智能语音作为智能时代人机交互的关键接口各行各业爆发式的场景需求驱动行业发展进入黄金期
[INFO] unload model success, model Id is 1
[INFO] end to destroy context
[INFO] end to reset device is 0
[INFO] end to finalize acl
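In the output above, the recognized text is interleaved with `[INFO]` status lines and a timestamped framework message. If only the text is needed, a simple filter can separate it out. The following is a heuristic sketch that assumes the output format shown here; `extract_transcript` is a hypothetical helper, not something provided by `main.py`.

```python
import re

# Matches lines that start with a tag such as [INFO] or with a
# date stamp such as "2023-06-13 ", as seen in the sample output.
_LOG_LINE = re.compile(r"^(\[[A-Z]+\]|\d{4}-\d{2}-\d{2} )")


def extract_transcript(output):
    """Return the non-empty lines of `output` that do not look like log lines."""
    lines = (line.strip() for line in output.splitlines())
    return [line for line in lines if line and not _LOG_LINE.match(line)]
```

Passing the captured standard output of the run to `extract_transcript` would return a list containing just the transcript line. The pattern is deliberately narrow, so log lines in any other format would need to be added to it.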