{"release":{"tag":{"name":"v0.7.0","path":"/omniai/omniinfer/tags/v0.7.0","tree_path":"/omniai/omniinfer/tree/v0.7.0","message":"# v0.7.0\r\n\r\n## Key Features\r\n\r\n* Omni Cache supports MLA/GQA\r\n* Chunk prefill in mixed deployment runs in graph mode\r\n* SGLang support\r\n\r\n\r\n## Other Optimizations\r\n* On 2P8-1D32@A3 with an average of 3.5K+1K, DeepSeek R1 reaches QPM 186 with TTFT\u003C2s and TPOT\u003C20ms\r\n* On 2P2-1D4@A3 with 2K+2K, openPangu-72B reaches a peak single-card decode throughput of 1560 TPS with TPOT\u003C30ms\r\n\r\n## Supported Models\r\n\r\n| Model| Hardware|Precision|Deployment mode |\r\n| --- | --- |--- |--- |\r\n| openPangu-Ultra-MoE-718B| A3|INT8|PD-disaggregated |\r\n| openPangu-Ultra-MoE-718B| A2|INT8|PD-disaggregated |\r\n| openPangu-72B| A3|INT8|PD-disaggregated |\r\n| openPangu-38B| A3|INT8|Mixed |\r\n| openPangu-38B| A2|INT8|Mixed |\r\n| openPangu-7B| A3|BF16|Mixed |\r\n| openPangu-7B| A2|BF16|Mixed |\r\n| openPangu-7BVL| A3|BF16|Mixed |\r\n| DeepSeek-R1| A3|INT8|PD-disaggregated |\r\n| DeepSeek-R1| A3|W4A8C16|PD-disaggregated |\r\n| DeepSeek-R1| A3|BF16|PD-disaggregated |\r\n| DeepSeek-R1| A2|INT8|PD-disaggregated |\r\n| DeepSeek-V3.1| A3|INT8|PD-disaggregated |\r\n| DeepSeek-V3.2| A3|INT8|PD-disaggregated |\r\n| DeepSeek-OCR| A2|BF16|Mixed |\r\n| Qwen2.5-7B |A3|INT8|Mixed (TP\u003E=1 DP=1) |\r\n| Qwen2.5-7B |A2|INT8|Mixed (TP\u003E=1 DP=1) |\r\n| QwQ |A3|BF16|PD-disaggregated |\r\n| QwQ |A2|BF16|PD-disaggregated |\r\n| Qwen3-235B| A3|INT8|PD-disaggregated |\r\n| Qwen3-235B| A2|BF16|PD-disaggregated |\r\n| Qwen3-32B |A3|BF16|PD-disaggregated |\r\n| Qwen3-32B |A3|INT8|PD-disaggregated |\r\n| Qwen3-30B| A3|BF16|PD-disaggregated |\r\n| Kimi-K2| A3|W4A8C16|PD-disaggregated |\r\n| Kimi-K2 Thinking| A3|W4A8C16|PD-disaggregated |\r\n| Longcat-flash| A3|BF16|PD-disaggregated |\r\n| Ling-1T| A3|BF16|PD-disaggregated |\r\n| GPT-OSS120B| A3|INT8|PD-disaggregated |\r\n| GPT-OSS120B| A2|INT8|PD-disaggregated |\r\n| GPT-OSS20B| A3|INT8|PD-disaggregated |\r\n| GPT-OSS20B| A2|INT8|PD-disaggregated |\r\n\r\n\r\n## Installation Packages\r\n| Hardware| Architecture|Image|Tar package |\r\n| --- | --- |--- |--- |\r\n| A3| arm|docker pull 
swr.cn-east-4.myhuaweicloud.com/omni/omniinfer-a3-arm:release_v0.7.0-vllm|[omni_infer-a3-arm:v0.7.0_vllm](https://bucket-omni-infer-wuhu.obs.myhuaweicloud.com:443/DockerImage/Omni-Infer/Release/v0.7.0/ARM/omniinfer-a3-arm-release_v0.7.0-vllm.tar?AccessKeyId=HPUABVUGOTP2OPODPEKP\u0026Expires=1796120839\u0026Signature=wf8%2BL3H4RwLR09kA%2Bf3Er9qU0E4%3D) |\r\n| A3| x86|docker pull swr.cn-east-4.myhuaweicloud.com/omni/omniinfer-a3-x86:release_v0.7.0-vllm|[omni_infer-a3-x86:v0.7.0_vllm](https://bucket-omni-infer-wuhu.obs.myhuaweicloud.com:443/DockerImage/Omni-Infer/Release/v0.7.0/X86/omniinfer-a3-x86-release_v0.7.0-vllm.tar?AccessKeyId=HPUABVUGOTP2OPODPEKP\u0026Expires=1796198148\u0026Signature=x5F3J19Xll%2BXaemlHVf6JH5K/HI%3D)|\r\n| A2| arm|docker pull swr.cn-east-4.myhuaweicloud.com/omni/omniinfer-a2-arm:release_v0.7.0-vllm|[omni_infer-a2-arm:v0.7.0_vllm](https://bucket-omni-infer-wuhu.obs.myhuaweicloud.com:443/DockerImage/Omni-Infer/Release/v0.7.0/ARM/omniinfer-a2-arm-release_v0.7.0-vllm.tar?AccessKeyId=HPUABVUGOTP2OPODPEKP\u0026Expires=1796120800\u0026Signature=Tz3CCP1unm2hemkIB/afynKWsXw%3D)|\r\n| A2| x86|docker pull swr.cn-east-4.myhuaweicloud.com/omni/omniinfer-a2-x86:release_v0.7.0-vllm|[omni_infer-a2-x86:v0.7.0_vllm](https://bucket-omni-infer-wuhu.obs.myhuaweicloud.com:443/DockerImage/Omni-Infer/Release/v0.7.0/X86/omniinfer-a2-x86-release_v0.7.0-vllm.tar?AccessKeyId=HPUABVUGOTP2OPODPEKP\u0026Expires=1796198104\u0026Signature=Loyh98KqMPbVl3Wi4YYkUsGQ8Is%3D)|\r","commit":{"id":"b6762492432b218cea7fdbc4d6bd72f54182dd5b","short_id":"b676249","title":"!1616 bug fix - missing content in gpt-oss fc scenarios","title_markdown":"\u003Ca title=\"Pull Request: bug fix - missing content in gpt-oss fc scenarios\" class=\"gfm gfm-pull_request\" href=\"/omniai/omniinfer/pulls/1616\"\u003E!1616\u003C/a\u003Ebug fix - missing content in gpt-oss fc scenarios","description":"* bug fix - missing content in gpt-oss fc scenarios","description_markdown":"* bug fix - missing content 
in gpt-oss fc scenarios","message":"!1616 bug fix - missing content in gpt-oss fc scenarios\n* bug fix - missing content in gpt-oss fc scenarios\n","message_markdown":"\u003Ca title=\"Pull Request: bug fix - missing content in gpt-oss fc scenarios\" class=\"gfm gfm-pull_request\" href=\"/omniai/omniinfer/pulls/1616\"\u003E!1616\u003C/a\u003Ebug fix - missing content in gpt-oss fc scenarios\n* bug fix - missing content in gpt-oss fc scenarios","detail_path":"/omniai/omniinfer/commit/b6762492432b218cea7fdbc4d6bd72f54182dd5b","commits_path":"/omniai/omniinfer/commits/b6762492432b218cea7fdbc4d6bd72f54182dd5b","tree_path":"/omniai/omniinfer/tree/b6762492432b218cea7fdbc4d6bd72f54182dd5b","author":{"name":"ascend_msj","email":"mengsujia@huawei.com","username":"mmmlalala","user_path":"/mmmlalala","enterprise_user_path":"/omniai/dashboard/members/mmmlalala","image_path":"no_portrait.png#ascend_msj-mmmlalala","is_gitee_user":true,"is_enterprise_user":true,"widget_url":""},"committer":{"name":"liujianxin","email":"liujianxin@huawei.com","username":"octol","user_path":"/octol","enterprise_user_path":"/omniai/dashboard/members/octol","image_path":"no_portrait.png#liujianxin-octol","is_gitee_user":true,"is_enterprise_user":true,"widget_url":""},"authored_date":"2025-12-10T11:14:35+00:00","committed_date":"2025-12-10T11:14:35+00:00","signature":null,"build_state":null},"archive_path":"/omniai/omniinfer/repository/archive/v0.7.0","signature":null},"operating":{"edit":false,"download":true,"destroy":false,"enterprise_forbid_zip":false},"release":{"title":"Omni_infer v0.7.0 Release Note","path":"/omniai/omniinfer/releases/tag/v0.7.0","tag_path":"/omniai/omniinfer/tree/v0.7.0","project_id":41288219,"created_at":"2025-12-10T19:55:29+08:00","is_prerelease":false,"description":"# v0.7.0\r\n\r\n## Key Features\r\n\r\n* Omni Cache supports MLA/GQA\r\n* Chunk prefill in mixed deployment runs in graph mode\r\n* SGLang support\r\n\r\n\r\n## Other Optimizations\r\n* On 2P8-1D32@A3 with an average of 3.5K+1K, DeepSeek R1 reaches QPM 186 with TTFT\u003C2s and TPOT\u003C20ms\r\n* 
On 2P2-1D4@A3 with 2K+2K, openPangu-72B reaches a peak single-card decode throughput of 1560 TPS with TPOT\u003C30ms\r\n\r\n## Supported Models\r\n\r\n| Model| Hardware|Precision|Deployment mode |\r\n| --- | --- |--- |--- |\r\n| openPangu-Ultra-MoE-718B| A3|INT8|PD-disaggregated |\r\n| openPangu-Ultra-MoE-718B| A2|INT8|PD-disaggregated |\r\n| openPangu-72B| A3|INT8|PD-disaggregated |\r\n| openPangu-38B| A3|INT8|Mixed |\r\n| openPangu-38B| A2|INT8|Mixed |\r\n| openPangu-7B| A3|BF16|Mixed |\r\n| openPangu-7B| A2|BF16|Mixed |\r\n| openPangu-7BVL| A3|BF16|Mixed |\r\n| DeepSeek-R1| A3|INT8|PD-disaggregated |\r\n| DeepSeek-R1| A3|W4A8C16|PD-disaggregated |\r\n| DeepSeek-R1| A3|BF16|PD-disaggregated |\r\n| DeepSeek-R1| A2|INT8|PD-disaggregated |\r\n| DeepSeek-V3.1| A3|INT8|PD-disaggregated |\r\n| DeepSeek-V3.2| A3|INT8|PD-disaggregated |\r\n| DeepSeek-OCR| A2|BF16|Mixed |\r\n| Qwen2.5-7B |A3|INT8|Mixed (TP\u003E=1 DP=1) |\r\n| Qwen2.5-7B |A2|INT8|Mixed (TP\u003E=1 DP=1) |\r\n| QwQ |A3|BF16|PD-disaggregated |\r\n| QwQ |A2|BF16|PD-disaggregated |\r\n| Qwen3-235B| A3|INT8|PD-disaggregated |\r\n| Qwen3-235B| A2|BF16|PD-disaggregated |\r\n| Qwen3-32B |A3|BF16|PD-disaggregated |\r\n| Qwen3-32B |A3|INT8|PD-disaggregated |\r\n| Qwen3-30B| A3|BF16|PD-disaggregated |\r\n| Kimi-K2| A3|W4A8C16|PD-disaggregated |\r\n| Kimi-K2 Thinking| A3|W4A8C16|PD-disaggregated |\r\n| Longcat-flash| A3|BF16|PD-disaggregated |\r\n| Ling-1T| A3|BF16|PD-disaggregated |\r\n| GPT-OSS120B| A3|INT8|PD-disaggregated |\r\n| GPT-OSS120B| A2|INT8|PD-disaggregated |\r\n| GPT-OSS20B| A3|INT8|PD-disaggregated |\r\n| GPT-OSS20B| A2|INT8|PD-disaggregated |\r\n\r\n\r\n## Installation Packages\r\n| Hardware| Architecture|Image|Tar package |\r\n| --- | --- |--- |--- |\r\n| A3| arm|docker pull swr.cn-east-4.myhuaweicloud.com/omni/omniinfer-a3-arm:release_v0.7.0-vllm|[omni_infer-a3-arm:v0.7.0_vllm](https://bucket-omni-infer-wuhu.obs.myhuaweicloud.com:443/DockerImage/Omni-Infer/Release/v0.7.0/ARM/omniinfer-a3-arm-release_v0.7.0-vllm.tar?AccessKeyId=HPUABVUGOTP2OPODPEKP\u0026Expires=1796120839\u0026Signature=wf8%2BL3H4RwLR09kA%2Bf3Er9qU0E4%3D) |\r\n| A3| x86|docker pull 
swr.cn-east-4.myhuaweicloud.com/omni/omniinfer-a3-x86:release_v0.7.0-vllm|[omni_infer-a3-x86:v0.7.0_vllm](https://bucket-omni-infer-wuhu.obs.myhuaweicloud.com:443/DockerImage/Omni-Infer/Release/v0.7.0/X86/omniinfer-a3-x86-release_v0.7.0-vllm.tar?AccessKeyId=HPUABVUGOTP2OPODPEKP\u0026Expires=1796198148\u0026Signature=x5F3J19Xll%2BXaemlHVf6JH5K/HI%3D)|\r\n| A2| arm|docker pull swr.cn-east-4.myhuaweicloud.com/omni/omniinfer-a2-arm:release_v0.7.0-vllm|[omni_infer-a2-arm:v0.7.0_vllm](https://bucket-omni-infer-wuhu.obs.myhuaweicloud.com:443/DockerImage/Omni-Infer/Release/v0.7.0/ARM/omniinfer-a2-arm-release_v0.7.0-vllm.tar?AccessKeyId=HPUABVUGOTP2OPODPEKP\u0026Expires=1796120800\u0026Signature=Tz3CCP1unm2hemkIB/afynKWsXw%3D)|\r\n| A2| x86|docker pull swr.cn-east-4.myhuaweicloud.com/omni/omniinfer-a2-x86:release_v0.7.0-vllm|[omni_infer-a2-x86:v0.7.0_vllm](https://bucket-omni-infer-wuhu.obs.myhuaweicloud.com:443/DockerImage/Omni-Infer/Release/v0.7.0/X86/omniinfer-a2-x86-release_v0.7.0-vllm.tar?AccessKeyId=HPUABVUGOTP2OPODPEKP\u0026Expires=1796198104\u0026Signature=Loyh98KqMPbVl3Wi4YYkUsGQ8Is%3D)|\r\n","author":{"name":"liujianxin","username":"octol","path":"/octol","avatar_url":"no_portrait.png#liujianxin-octol"},"attach_files":[],"zip_download_url":"/omniai/omniinfer/releases/tag/v0.7.0.zip","tar_download_url":"/omniai/omniinfer/releases/tag/v0.7.0.tar.gz"}}}