{"release":{"tag":{"name":"v0.6.0","path":"/omniai/omniinfer/tags/v0.6.0","tree_path":"/omniai/omniinfer/tree/v0.6.0","message":"# v0.6.0\r\n\r\n## Core Features\r\n\r\n* Omni Proxy\r\n* Omni Cache supports DSA\r\n* Omni Placement supports A2\r\n\r\n\r\n## Other Optimizations\r\n* On 6P8-1D32@A3 with an average of 3.5K+1K, DeepSeek R1 reaches QPM 600 with TTFT\u003C2s and TPOT\u003C50ms\r\n* On 18P8-1D144@A3 with 2K+2K, openPangu reaches a peak per-card Decode performance of 2400 TPS with TPOT\u003C50ms\r\n\r\n## Supported Models\r\n\r\n| Model| Hardware|Precision|Deployment Mode |\r\n| --- | --- |--- |--- |\r\n| openPangu-Ultra-MoE-718B| A3|INT8|PD-disaggregated |\r\n| openPangu-Ultra-MoE-718B| A2|INT8|PD-disaggregated |\r\n| openPangu-38B| A3|INT8|Co-located |\r\n| openPangu-38B| A2|INT8|Co-located |\r\n| openPangu-7B| A3|BF16|Co-located |\r\n| openPangu-7B| A2|BF16|Co-located |\r\n| DeepSeek-R1| A3|INT8|PD-disaggregated |\r\n| DeepSeek-R1| A3|W4A8C16|PD-disaggregated |\r\n| DeepSeek-R1| A3|BF16|PD-disaggregated |\r\n| DeepSeek-R1| A2|INT8|PD-disaggregated |\r\n| DeepSeek-V3.1| A3|INT8|PD-disaggregated |\r\n| DeepSeek-V3.2| A3|INT8|PD-disaggregated |\r\n| Qwen2.5-7B |A3|INT8|Co-located (TP\u003E=1 DP=1) |\r\n| Qwen2.5-7B |A2|INT8|Co-located (TP\u003E=1 DP=1) |\r\n| QwQ |A3|BF16|PD-disaggregated |\r\n| QwQ |A2|BF16|PD-disaggregated |\r\n| Qwen3-235B| A3|INT8|PD-disaggregated |\r\n| Qwen3-32B |A3|BF16|PD-disaggregated |\r\n| Qwen3-30B| A3|BF16|PD-disaggregated |\r\n| Kimi-K2| A3|W4A8C16|PD-disaggregated |\r\n| Longcat-flash| A3|BF16|PD-disaggregated |\r\n| Ling-1T| A3|BF16|PD-disaggregated |\r\n\r\n\r\n## Installation Packages\r\n| Hardware| Architecture|Image|Tar Package |\r\n| --- | --- |--- |--- |\r\n| A3| arm|docker pull swr.cn-east-4.myhuaweicloud.com/omni/omni_infer-a3-arm:poc_v0.6.0-20251111-vllm|[omni_infer-a3-arm:v0.6.0_vllm](https://bucket-omni-infer-wuhu.obs.myhuaweicloud.com:443/DockerImage/Omni-Infer/Release/v0.6.0/ARM/omni_infer-a3-arm-v0.6.0-20251111-vllm.tar?AccessKeyId=HPUABVUGOTP2OPODPEKP\u0026Expires=1793979714\u0026Signature=Iogz7c6bDvjL8gDcrg3y1ATIBM8%3D) |\r\n| A3| x86|docker pull 
swr.cn-east-4.myhuaweicloud.com/omni/omni_infer-a3-x86:release_v0.6.0-20251111-vllm|[omni_infer-a3-x86:v0.6.0_vllm](https://bucket-omni-infer-wuhu.obs.myhuaweicloud.com:443/DockerImage/Omni-Infer/Release/v0.6.0/X86/omni_infer-a3-x86-v0.6.0-20251111-vllm.tar?AccessKeyId=HPUABVUGOTP2OPODPEKP\u0026Expires=1793981006\u0026Signature=PmrViPlu6BuFpmJBcaZIVF21Vjw%3D)|\r\n| A2| arm|docker pull swr.cn-east-4.myhuaweicloud.com/omni/omni_infer-a2-arm:release_v0.6.0-20251111-vllm|[omni_infer-a2-arm:v0.6.0_vllm](https://bucket-omni-infer-wuhu.obs.myhuaweicloud.com:443/DockerImage/Omni-Infer/Release/v0.6.0/ARM/omni_infer-a2-arm-v0.6.0-20251111-vllm.tar?AccessKeyId=HPUABVUGOTP2OPODPEKP\u0026Expires=1793979683\u0026Signature=gGhQnZqw%2BaiuqjxrfT3P8bdj9%2Bg%3D)|\r\n| A2| x86|docker pull swr.cn-east-4.myhuaweicloud.com/omni/omni_infer-a2-x86:release_v0.6.0-20251111-vllm|[omni_infer-a2-x86:v0.6.0_vllm](https://bucket-omni-infer-wuhu.obs.myhuaweicloud.com:443/DockerImage/Omni-Infer/Release/v0.6.0/X86/omni_infer-a2-x86-v0.6.0-20251111-vllm.tar?AccessKeyId=HPUABVUGOTP2OPODPEKP\u0026Expires=1793981415\u0026Signature=RRAnzsgcp9PoWvmTK8Wpd3pvSxY%3D)|\r","commit":{"id":"ad124c3d9afb75a418a05787b46c55d9ee5e617b","short_id":"ad124c3","title":"!1206 [Fix] modify prefill/decode-lb-sdk for proxy","title_markdown":"\u003Ca title=\"Pull Request: [Fix] modify prefill/decode-lb-sdk for proxy\" class=\"gfm gfm-pull_request\" href=\"/omniai/omniinfer/pulls/1206\"\u003E!1206\u003C/a\u003E[Fix] modify prefill/decode-lb-sdk for proxy","description":"* modify prefill/decode-lb-sdk for proxy","description_markdown":"* modify prefill/decode-lb-sdk for proxy","message":"!1206 [Fix] modify prefill/decode-lb-sdk for proxy\n* modify prefill/decode-lb-sdk for proxy\n","message_markdown":"\u003Ca title=\"Pull Request: [Fix] modify prefill/decode-lb-sdk for proxy\" class=\"gfm gfm-pull_request\" href=\"/omniai/omniinfer/pulls/1206\"\u003E!1206\u003C/a\u003E\\[Fix\\] modify prefill/decode-lb-sdk for proxy\n* modify 
prefill/decode-lb-sdk for proxy","detail_path":"/omniai/omniinfer/commit/ad124c3d9afb75a418a05787b46c55d9ee5e617b","commits_path":"/omniai/omniinfer/commits/ad124c3d9afb75a418a05787b46c55d9ee5e617b","tree_path":"/omniai/omniinfer/tree/ad124c3d9afb75a418a05787b46c55d9ee5e617b","author":{"name":"陈凯","email":"chenkai243@huawei.com","username":"kai-chen-1104","user_path":"/kai-chen-1104","enterprise_user_path":"/omniai/dashboard/members/kai-chen-1104","image_path":"no_portrait.png#陈凯-kai-chen-1104","is_gitee_user":true,"is_enterprise_user":true,"widget_url":""},"committer":{"name":"liujianxin","email":"liujianxin@huawei.com","username":"octol","user_path":"/octol","enterprise_user_path":"/omniai/dashboard/members/octol","image_path":"no_portrait.png#liujianxin-octol","is_gitee_user":true,"is_enterprise_user":true,"widget_url":""},"authored_date":"2025-11-11T06:30:57+00:00","committed_date":"2025-11-11T06:30:57+00:00","signature":null,"build_state":null},"archive_path":"/omniai/omniinfer/repository/archive/v0.6.0","signature":null},"operating":{"edit":false,"download":true,"destroy":false,"enterprise_forbid_zip":false},"release":{"title":"Omni_infer v0.6.0 Release Note","path":"/omniai/omniinfer/releases/tag/v0.6.0","tag_path":"/omniai/omniinfer/tree/v0.6.0","project_id":41288219,"created_at":"2025-11-12T01:49:52+08:00","is_prerelease":false,"description":"# v0.6.0\r\n\r\n## Core Features\r\n\r\n* Omni Proxy\r\n* Omni Cache supports DSA\r\n* Omni Placement supports A2\r\n\r\n\r\n## Other Optimizations\r\n* On 7P8-1D32@A3 with an average of 3.5K+1K, DeepSeek R1 reaches QPM 600 with TTFT\u003C2s and TPOT\u003C50ms\r\n* On 18P8-1D144@A3 with 2K+2K, openPangu-718B reaches a peak per-card Decode performance of 2400 TPS with TPOT\u003C50ms\r\n\r\n## Supported Models\r\n\r\n| Model| Hardware|Precision|Deployment Mode |\r\n| --- | --- |--- |--- |\r\n| openPangu-Ultra-MoE-718B| A3|INT8|PD-disaggregated |\r\n| openPangu-Ultra-MoE-718B| A2|INT8|PD-disaggregated |\r\n| openPangu-38B| A3|INT8|Co-located |\r\n| openPangu-38B| A2|INT8|Co-located |\r\n| openPangu-7B| A3|BF16|Co-located |\r\n| openPangu-7B| A2|BF16|Co-located |\r\n| DeepSeek-R1| A3|INT8|PD-disaggregated |\r\n| DeepSeek-R1| 
A3|W4A8C16|PD-disaggregated |\r\n| DeepSeek-R1| A3|BF16|PD-disaggregated |\r\n| DeepSeek-R1| A2|INT8|PD-disaggregated |\r\n| DeepSeek-V3.1| A3|INT8|PD-disaggregated |\r\n| DeepSeek-V3.2| A3|INT8|PD-disaggregated |\r\n| Qwen2.5-7B |A3|INT8|Co-located (TP\u003E=1 DP=1) |\r\n| Qwen2.5-7B |A2|INT8|Co-located (TP\u003E=1 DP=1) |\r\n| QwQ |A3|BF16|PD-disaggregated |\r\n| QwQ |A2|BF16|PD-disaggregated |\r\n| Qwen3-235B| A3|INT8|PD-disaggregated |\r\n| Qwen3-32B |A3|BF16|PD-disaggregated |\r\n| Qwen3-30B| A3|BF16|PD-disaggregated |\r\n| Kimi-K2| A3|W4A8C16|PD-disaggregated |\r\n| Longcat-flash| A3|BF16|PD-disaggregated |\r\n| Ling-1T| A3|BF16|PD-disaggregated |\r\n\r\n\r\n## Installation Packages\r\n| Hardware| Architecture|Image|Tar Package |\r\n| --- | --- |--- |--- |\r\n| A3| arm| docker pull swr.cn-east-4.myhuaweicloud.com/omni/omni_infer-a3-arm:release_v0.6.0-20251111-vllm|[omni_infer-a3-arm:v0.6.0_vllm](https://bucket-omni-infer-wuhu.obs.myhuaweicloud.com:443/DockerImage/Omni-Infer/Release/v0.6.0/ARM/omni_infer-a3-arm-v0.6.0-20251111-vllm.tar?AccessKeyId=HPUABVUGOTP2OPODPEKP\u0026Expires=1793979714\u0026Signature=Iogz7c6bDvjL8gDcrg3y1ATIBM8%3D) |\r\n| A3| x86|docker pull swr.cn-east-4.myhuaweicloud.com/omni/omni_infer-a3-x86:release_v0.6.0-20251111-vllm|[omni_infer-a3-x86:v0.6.0_vllm](https://bucket-omni-infer-wuhu.obs.myhuaweicloud.com:443/DockerImage/Omni-Infer/Release/v0.6.0/X86/omni_infer-a3-x86-v0.6.0-20251111-vllm.tar?AccessKeyId=HPUABVUGOTP2OPODPEKP\u0026Expires=1793981006\u0026Signature=PmrViPlu6BuFpmJBcaZIVF21Vjw%3D)|\r\n| A2| arm|docker pull swr.cn-east-4.myhuaweicloud.com/omni/omni_infer-a2-arm:release_v0.6.0-20251111-vllm|[omni_infer-a2-arm:v0.6.0_vllm](https://bucket-omni-infer-wuhu.obs.myhuaweicloud.com:443/DockerImage/Omni-Infer/Release/v0.6.0/ARM/omni_infer-a2-arm-v0.6.0-20251111-vllm.tar?AccessKeyId=HPUABVUGOTP2OPODPEKP\u0026Expires=1793979683\u0026Signature=gGhQnZqw%2BaiuqjxrfT3P8bdj9%2Bg%3D)|\r\n| A2| x86|docker pull 
swr.cn-east-4.myhuaweicloud.com/omni/omni_infer-a2-x86:release_v0.6.0-20251111-vllm|[omni_infer-a2-x86:v0.6.0_vllm](https://bucket-omni-infer-wuhu.obs.myhuaweicloud.com:443/DockerImage/Omni-Infer/Release/v0.6.0/X86/omni_infer-a2-x86-v0.6.0-20251111-vllm.tar?AccessKeyId=HPUABVUGOTP2OPODPEKP\u0026Expires=1793981415\u0026Signature=RRAnzsgcp9PoWvmTK8Wpd3pvSxY%3D)|\r\n","author":{"name":"liujianxin","username":"octol","path":"/octol","avatar_url":"no_portrait.png#liujianxin-octol"},"attach_files":[],"zip_download_url":"/omniai/omniinfer/releases/tag/v0.6.0.zip","tar_download_url":"/omniai/omniinfer/releases/tag/v0.6.0.tar.gz"}}}