diff --git a/.gitee/PULL_REQUEST_TEMPLATE.en.md b/.gitee/PULL_REQUEST_TEMPLATE.en.md index c62f68d2b82f2608c02fc94acd92e66188629bc7..52764caf805f40f63f0c77efe1fe038a222dbac2 100644 --- a/.gitee/PULL_REQUEST_TEMPLATE.en.md +++ b/.gitee/PULL_REQUEST_TEMPLATE.en.md @@ -1,6 +1,6 @@ diff --git a/.gitee/PULL_REQUEST_TEMPLATE.md b/.gitee/PULL_REQUEST_TEMPLATE.md index c62f68d2b82f2608c02fc94acd92e66188629bc7..52764caf805f40f63f0c77efe1fe038a222dbac2 100644 --- a/.gitee/PULL_REQUEST_TEMPLATE.md +++ b/.gitee/PULL_REQUEST_TEMPLATE.md @@ -1,6 +1,6 @@ diff --git a/.gitee/PULL_REQUEST_TEMPLATE.zh-CN.md b/.gitee/PULL_REQUEST_TEMPLATE.zh-CN.md index 8636fdcfc96f20bd5050218ba64c4b4cbdd16669..1d55805cf51706d5575810c72c86597fcc9eefed 100644 --- a/.gitee/PULL_REQUEST_TEMPLATE.zh-CN.md +++ b/.gitee/PULL_REQUEST_TEMPLATE.zh-CN.md @@ -1,6 +1,6 @@ diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 2e4e68b249d23adba15e9b5d578d663b25d64127..d314c75bcdbea0616089aa89aa0b83c8de3521ed 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -23,14 +23,14 @@ For individual contributor, please refer to [ICLA online document](https://www.m ## Getting Started -- Fork the repository on [Github](https://github.com/mindspore-ai/mindspore) or [Gitee](https://gitee.com/mindspore/mindspore). -- Read the [README.md](README.md) and [install page](https://www.mindspore.cn/install/en) for project information and build instructions. +- Fork the repository on [Gitee](https://gitee.com/mindspore/mindspore). +- Read the [README.md](README.md) and [install page](https://www.mindspore.cn/lite/docs/en/master/use/downloads.html) for installation package and usage instructions. ## Contribution Workflow ### Code style -Please follow this style to make MindSpore easy to review, maintain and develop. +Please follow this style to make MindSpore Lite easy to review, maintain and develop. - Coding guidelines @@ -64,12 +64,9 @@ Please follow this style to make MindSpore easy to review, maintain and develop. 
If you want to download the code to the local machine, `git` is the best way: ```shell - # For GitHub - git clone https://github.com/{insert_your_forked_repo}/mindspore.git - git remote add upstream https://github.com/mindspore-ai/mindspore.git # For Gitee - git clone https://gitee.com/{insert_your_forked_repo}/mindspore.git - git remote add upstream https://gitee.com/mindspore/mindspore.git + git clone https://gitee.com/{insert_your_forked_repo}/mindspore-lite.git + git remote add upstream https://gitee.com/mindspore/mindspore-lite.git ``` - Develop code locally @@ -95,9 +92,9 @@ Please follow this style to make MindSpore easy to review, maintain and develop. git push origin {new_branch_name} ``` -- Pull a request to MindSpore repository +- Pull a request to MindSpore Lite repository - In the last step, your need to pull a compare request between your new branch and MindSpore `master` branch. After finishing the pull request, the Jenkins CI will be automatically set up for building test. Your pull request should be merged into the upstream master branch as soon as possible to reduce the risk of merging. + In the last step, your need to pull a compare request between your new branch and MindSpore Lite `master` branch. After finishing the pull request, the Jenkins CI will be automatically set up for building test. Your pull request should be merged into the upstream master branch as soon as possible to reduce the risk of merging. ### Report issues @@ -105,7 +102,7 @@ A great way to contribute to the project is to send a detailed report when you e When reporting issues, refer to this format: -- What version of env (mindspore, os, python etc) are you using? +- What version of env (MindSpore Lite, os, python etc) are you using? - Is this a BUG REPORT or FEATURE REQUEST? - What kind of issue is, add the labels to highlight it on the issue dashboard. - What happened? 
@@ -122,7 +119,7 @@ When reporting issues, refer to this format: ### Propose PRs -- Raise your idea as an *issue* on [GitHub](https://github.com/mindspore-ai/mindspore/issues) or [Gitee](https://gitee.com/mindspore/mindspore/issues) +- Raise your idea as an *issue* on [Gitee](https://gitee.com/mindspore/mindspore-lite/issues) - If it is a new feature that needs lots of design details, a design proposal should also be submitted. - After reaching consensus in the issue discussions and design proposal reviews, complete the development on the forked repo and submit a PR. - None of PRs is not permitted until it receives **2+ LGTM** from approvers. Please NOTICE that approver is NOT allowed to add *LGTM* on his own PR. diff --git a/CONTRIBUTING_CN.md b/CONTRIBUTING_CN.md index dcd1a14e23d028083329def8301404848af3ac6a..77f809719fbe2223b3644ec12cf63001de2a7c4b 100644 --- a/CONTRIBUTING_CN.md +++ b/CONTRIBUTING_CN.md @@ -24,14 +24,14 @@ ## 快速入门 -- 在[Github](https://github.com/mindspore-ai/mindspore)或[Gitee](https://gitee.com/mindspore/mindspore)上fork代码仓。 -- 参见[README_CN.md](README_CN.md)和[安装页面](https://www.mindspore.cn/install)了解项目信息和构建说明。 +- 在[Gitee](https://gitee.com/mindspore/mindspore-lite)上fork代码仓。 +- 参见[README_CN.md](README_CN.md)和[下载页面](https://www.mindspore.cn/lite/docs/zh-CN/master/use/downloads.html)了解安装包与使用说明。 ## 贡献流程 ### 代码风格 -请遵循此风格,以便MindSpore审查、维护和开发。 +请遵循此风格,以便MindSpore Lite审查、维护和开发。 - 编码指南 @@ -65,12 +65,9 @@ 如果您想将代码下载到本地计算机,最好使用git方法: ```shell - # 在GitHub上: - git clone https://github.com/{insert_your_forked_repo}/mindspore.git - git remote add upstream https://github.com/mindspore-ai/mindspore.git # 在Gitee上: - git clone https://gitee.com/{insert_your_forked_repo}/mindspore.git - git remote add upstream https://gitee.com/mindspore/mindspore.git + git clone https://gitee.com/{insert_your_forked_repo}/mindspore-lite.git + git remote add upstream https://gitee.com/mindspore/mindspore-lite.git ``` - 本地开发代码。 @@ -96,9 +93,9 @@ git push origin {新分支名称} ``` -- 
将请求拉取到MindSpore代码仓。 +- 将请求拉取到MindSpore Lite代码仓。 - 在最后一步中,您需要在新分支和MindSpore主分支之间拉取比较请求。完成拉取请求后,Jenkins CI将自动设置,进行构建测试。拉取请求应该尽快合并到上游master分支中,以降低合并的风险。 + 在最后一步中,您需要在新分支和MindSpore Lite主分支之间拉取比较请求。完成拉取请求后,Jenkins CI将自动设置,进行构建测试。拉取请求应该尽快合并到上游master分支中,以降低合并的风险。 ### 报告Issue @@ -106,7 +103,7 @@ 报告issue时,请参考以下格式: -- 说明您使用的环境版本(MindSpore、OS、Python等)。 +- 说明您使用的环境版本(MindSpore Lite、OS、Python等)。 - 说明是错误报告还是功能需求。 - 说明issue类型,添加标签可以在issue板上突出显示该issue。 - 问题是什么? @@ -123,7 +120,7 @@ ### 提交PR -- 在[GitHub](https://github.com/mindspore-ai/mindspore/issues)或[Gitee](https://gitee.com/mindspore/mindspore/issues)上通过issue提出您的想法。 +- 在[Gitee](https://gitee.com/mindspore/mindspore-lite/issues)上通过issue提出您的想法。 - 如果是需要大量设计细节的新功能,还应提交设计方案。 - 经issue讨论和设计方案评审达成共识后,在已fork的代码仓开发,并提交PR。 - 任何PR至少需要位2位审批人的LGTM标签。请注意,审批人不允许在自己的PR上添加LGTM标签。 diff --git a/NOTICE b/NOTICE index 676462131c7ed9c632b68e3bbedecee3bf0c03bc..2ff20b8b7411e64041cc2df6c2681d45d1f67793 100644 --- a/NOTICE +++ b/NOTICE @@ -1,2 +1,2 @@ -MindSpore -Copyright 2019-2020 Huawei Technologies Co., Ltd +MindSpore Lite +Copyright 2019-2025 Huawei Technologies Co., Ltd diff --git a/README.md b/README.md index 3bb5e3d0cca7dcd4790617d486ecd8d3aedfeaa1..a9101d90b33decc0a4248c5166999c8ccf1d5bf7 100644 --- a/README.md +++ b/README.md @@ -2,7 +2,7 @@ ## What Is MindSpore Lite -MindSpore lite is a high-performance, lightweight open source reasoning framework that can be used to meet the needs of AI applications on mobile devices. MindSpore Lite focuses on how to deploy AI technology more effectively on devices. It has been integrated into HMS (Huawei Mobile Services) to provide inferences for applications such as image classification, object detection and OCR. MindSpore Lite will promote the development and enrichment of the AI software/hardware application ecosystem. 
+MindSpore Lite provides lightweight AI inference acceleration capabilities for different hardware devices, enabling intelligent applications and providing end-to-end solutions for developers. It offers development friendly, efficient, and flexible deployment experiences for algorithm engineers and data scientists, helping the AI software and hardware application ecosystem thrive. In the future, MindSpore Lite will work with the MindSpore AI community to enrich the AI software and hardware application ecosystem. MindSpore Lite Architecture @@ -10,22 +10,21 @@ For more details please check out our [MindSpore Lite Architecture Guide](https: ### MindSpore Lite features -1. Cooperative work with MindSpore training - - Provides training, optimization, and deployment. +1. Terminal and Cloud one-stop inference deployment + - Provide end-to-end processes for model transformation optimization, deployment, and inference. - The unified IR realizes the device-cloud AI application integration. 2. Lightweight - Provides model compress, which could help to improve performance as well. - - Provides the ultra-lightweight reasoning solution MindSpore Micro to meet the deployment requirements in extreme environments such as smart watches and headphones. + - Provides the ultra-lightweight reasoning solution MindSpore Lite Micro to meet the deployment requirements in extreme environments such as smart watches and headphones. 3. High-performance - - The built-in high-performance kernel computing library NNACL supports multiple convolution optimization algorithms such as Slide window, im2col+gemm, winograde, etc. + - The built-in high-performance kernel computing library NNACL supports high-performance inference for dedicated chips such as CPU, NNRt, and Ascend, maximizing hardware computing power while minimizing inference latency and power consumption. - Assembly code to improve performance of kernel operators. Supports CPU, GPU, and NPU. + 4. Versatility - - Supports IOS, Android. 
- - Supports Lite OS. - - Supports mobile device, smart screen, pad, and IOT devices. - - Supports third party models such as TFLite, CAFFE and ONNX. + - Support deployment of multiple hardware such as server-side Ascend and CPU. + - Supports HarmonyOS and Android mobile operating systems. ## MindSpore Lite AI deployment procedure @@ -33,7 +32,7 @@ For more details please check out our [MindSpore Lite Architecture Guide](https: Select a new model or use an existing model for incremental training using labeled data. When designing a model for mobile device, it is necessary to consider the model size, accuracy and calculation amount. - The MindSpore team provides a series of pre-training models used for image classification, object detection. You can use these pre-trained models in your application. + The MindSpore Lite team provides a series of pre-training models used for image classification, object detection. You can use these pre-trained models in your application. The pre-trained model provided by MindSpore: [Image Classification](https://download.mindspore.cn/model_zoo/official/lite/). More models will be provided in the feature. @@ -43,7 +42,7 @@ For more details please check out our [MindSpore Lite Architecture Guide](https: If you use MindSpore or a third-party model, you need to use [MindSpore Lite Model Converter Tool](https://www.mindspore.cn/lite/docs/en/master/converter/converter_tool.html) to convert the model into MindSpore Lite model. The MindSpore Lite model converter tool provides the converter of TensorFlow Lite, Caffe, ONNX to MindSpore Lite model, fusion and quantization could be introduced during convert procedure. - MindSpore also provides a tool to convert models running on IoT devices . + MindSpore Lite also provides a tool to convert models running on IoT devices . 3. Model deployment @@ -53,16 +52,4 @@ For more details please check out our [MindSpore Lite Architecture Guide](https: Load the model and perform inference. 
[Inference](https://www.mindspore.cn/lite/docs/en/master/infer/runtime_cpp.html) is the process of running input data through the model to get output. - MindSpore provides pre-trained model that can be deployed on mobile device [example](https://www.mindspore.cn/lite/examples/en). - -## MindSpore Lite benchmark test result - -We test a couple of networks on HUAWEI Mate40 (Hisilicon Kirin9000e) mobile phone, and get the test results below for your reference. - -| NetWork | Thread Number | Average Run Time(ms) | -| ------------------- | ------------- | -------------------- | -| basic_squeezenet | 4 | 6.415 | -| inception_v3 | 4 | 36.767 | -| mobilenet_v1_10_224 | 4 | 4.936 | -| mobilenet_v2_10_224 | 4 | 3.644 | -| resnet_v2_50 | 4 | 25.071 | + MindSpore Lite provides pre-trained model that can be deployed on mobile device [example](https://www.mindspore.cn/lite/examples/en). diff --git a/README_CN.md b/README_CN.md index 15b305475a0c345a3ec07cfe14aa3bc4ac08c8fb..32fd121d49d090483df773742d9697e49aafdb3a 100644 --- a/README_CN.md +++ b/README_CN.md @@ -3,7 +3,7 @@ ## MindSpore Lite介绍 -MindSpore Lite是MindSpore推出的端云协同的、轻量化、高性能AI推理框架,用于满足越来越多的端测AI应用需求。MindSpore Lite聚焦AI技术在端侧设备上的部署和运行,已经在华为HMS和智能终端的图像分类、目标识别、人脸识别、文字识别等应用中广泛使用,未来MindSpore Lite将与MindSpore AI社区一起,致力于丰富AI软硬件应用生态。 +MindSpore Lite面向不同硬件设备提供轻量化AI推理加速能力,使能智能应用,为开发者提供端到端的解决方案,为算法工程师和数据科学家提供开发友好、运行高效、部署灵活的体验,帮助人工智能软硬件应用生态繁荣发展,未来MindSpore Lite将与MindSpore AI社区一起,致力于丰富AI软硬件应用生态。 MindSpore Lite Architecture @@ -11,27 +11,25 @@ MindSpore Lite是MindSpore推出的端云协同的、轻量化、高性能AI推 ## MindSpore Lite技术特点 -1. 端云协同提供一站式训练和推理 +1. 端云一站式推理部署 - - 提供模型训练、模型转换优化、部署和推理端到端流程。 + - 提供模型转换优化、部署和推理端到端流程。 - 统一的IR实现端云AI应用一体化。 2. 超轻量 - 支持模型量化压缩,模型更小跑得更快。 - - 提供超轻量的推理解决方案MindSpore Micro,满足智能手表、耳机等极限环境下的部署要求。 + - 提供超轻量的推理解决方案MindSpore Lite Micro,满足智能手表、耳机等极限环境下的部署要求。 -3. 高性能 +3. 
高性能推理 - - 自带的高性能内核计算库NNACL,支持Sliding Windows、Im2Col+GEMM、Winograd等多种卷积优化算法。 + - 自带的高性能内核计算库NNACL,支持CPU、NNRt、Ascend等专用芯片高性能推理,最大化发挥硬件算力,最小化推理时延和功耗。 - 汇编级优化,支持CPU、GPU、NPU异构调度,最大化发挥硬件算力,最小化推理时延和功耗。 4. 广覆盖 - - 支持iOS、Android等手机操作系统。 - - 支持LiteOS嵌入式操作系统。 - - 支持手机、大屏、平板、IoT等各种智能设备上的AI应用。 - - 支持MindSpore/TensorFlow Lite/Caffe/ONNX模型,方便用户快速部署。 + - 支持服务端Ascend、CPU等多硬件部署。 + - 支持鸿蒙、Android手机操作系统。 ## MindSpore Lite AI部署流程 @@ -39,7 +37,7 @@ MindSpore Lite是MindSpore推出的端云协同的、轻量化、高性能AI推 包括选择新模型或对已有模型,利用标注数据进行增量训练。面向端侧设计模型时,需要考虑模型大小、精度和计算量。 - MindSpore团队提供了一系列预训练模型,用于解决图像分类、目标检测等场景的学习问题。可以在您的应用程序中使用这些预训练模型对应的终端模型。 + MindSpore Lite团队提供了一系列预训练模型,用于解决图像分类、目标检测等场景的学习问题。可以在您的应用程序中使用这些预训练模型对应的终端模型。 MindSpore提供的预训练模型:[图像分类(Image Classification)](https://download.mindspore.cn/model_zoo/official/lite/)。后续MindSpore团队会增加更多的预置模型。 @@ -49,7 +47,7 @@ MindSpore Lite是MindSpore推出的端云协同的、轻量化、高性能AI推 如果您使用MindSpore或第三方训练的模型,需要使用[MindSpore Lite模型转换工具](https://www.mindspore.cn/lite/docs/zh-CN/master/converter/converter_tool.html)转换成MindSpore Lite模型格式。MindSpore Lite模型转换工具不仅提供了将TensorFlow Lite、Caffe、ONNX等模型格式转换为MindSpore Lite模型格式,还提供了算子融合、量化等功能。 - MindSpore还提供了将IoT设备上运行的模型转换成.C代码的生成工具。 + MindSpore Lite还提供了将IoT设备上运行的模型转换成.C代码的生成工具。 经过上述两个部署,您已经得到端侧可以部署的模型。 @@ -61,16 +59,4 @@ MindSpore Lite是MindSpore推出的端云协同的、轻量化、高性能AI推 主要完成模型推理工作,即加载模型,完成模型相关的所有计算。[推理](https://www.mindspore.cn/lite/docs/zh-CN/master/infer/runtime_cpp.html)是通过模型运行输入数据,获取预测的过程。 - MindSpore提供了预训练模型部署在智能终端的[样例](https://www.mindspore.cn/lite/examples)。 - -## MindSpore Lite性能参考数据 - -我们在HUAWEI Mate40(Hisilicon Kirin9000e)手机上,测试了一组端侧常见网络的性能数据,供您参考: - -| 网络 | 线程数 | 平均推理时间(毫秒) | -| ------------------- | ----- | --------------- | -| basic_squeezenet | 4 | 6.415 | -| inception_v3 | 4 | 36.767 | -| mobilenet_v1_10_224 | 4 | 4.936 | -| mobilenet_v2_10_224 | 4 | 3.644 | -| resnet_v2_50 | 4 | 25.071 | + MindSpore Lite提供了预训练模型部署在智能终端的[样例](https://www.mindspore.cn/lite/examples)。 diff --git a/RELEASE.md b/RELEASE.md index 
17442c7365ecde242865a42802ffa788590eb685..28e5541c644cc3440f91139c9dc679c39c3d0f40 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -1,6472 +1,747 @@ -# MindSpore Release Notes +# MindSpore Lite Release Notes [查看中文](./RELEASE_CN.md) -## MindSpore 2.6.0 Release Notes +## MindSpore Lite 2.7.0 Release Notes ### Major Features and Improvements -#### Dataset - -- [STABLE] The sharding sampling behavior of the [MindDataset](https://www.mindspore.cn/docs/en/master/api_python/dataset/mindspore.dataset.MindDataset.html) interface has been changed from block-based sampling (Data sharding strategy 2 in the link) to interval sampling (Data sharding strategy 1 in the link). Users can control whether to switch back to block-based sampling by setting the MS_DEV_MINDRECORD_SHARD_BLOCK environment variable. -- [STABLE] GeneratorDataset supports spawn to start multiprocessing, and supports the use of Ascend back-end data augmentation methods in multiprocessing. Users can set [mindspore.dataset.config.set_multiprocessing_start_method("spawn")](https://www.mindspore.cn/docs/en/master/api_python/dataset/mindspore.dataset.config.set_multiprocessing_start_method.html) to enable multiprocessing in spawn mode. -- [STABLE] The `shuffle` parameter in [MindDataset](https://www.mindspore.cn/docs/en/master/api_python/dataset/mindspore.dataset.MindDataset.html) supports the `Shuffle.ADAPTIVE`option, which adaptively adjusts the shuffle sample count strategy based on the number of samples to reduce training memory overhead and lower the risk of OOM. If global shuffle is desired, users can specify `Shuffle.GLOBAL`, but they must ensure sufficient machine memory. - -#### Ascend - -- [STABLE] In MindSpore's dynamic graph mode, the AscendC custom operators integrated by the [ops.Custom](https://www.mindspore.cn/docs/en/master/api_python/ops/mindspore.ops.Custom.html) primitive support multiple output types, and `ops.Custom` supports type inference on the C++ side. 
-- [BETA] In MindSpore's dynamic graph mode, added `CustomOpBuilder` to support online building and loading of custom operators. -- [STABLE] When using the O1 compilation option, users can control the scope of graph and computation fusion optimization. Users can enable or disable specific fusion patterns by setting the environment variable MS_DEV_GRAPH_KERNEL_FLAGS with options such as enable_fusion_pattern_only or disable_fusion_pattern. Additionally, it supports reading configuration from a file via the --path=example.json option. -- [STABLE] Support users to set the aclop operator cache information aging configuration and error message reporting mode configuration through the [mindspore.device_context.ascend.op_debug.aclinit_config](https://www.mindspore.cn/docs/en/master/api_python/device_context/mindspore.device_context.ascend.op_debug.aclinit_config.html) interface. -- [STABLE] GE backend only supports whole graph sinking and lazy inline subgraph sinking, while other scenarios are no longer supported. -- [BETA] In MindSpore's static graph O0/O1 mode, `mindpore.nn.Cell` adds the new interface `offload` and the attribute `backward_prefetch`. Users can use this interface through [Cell.offload(backward_prefetch)](https://www.mindspore.cn/docs/en/master/api_python/nn/mindspore.nn.Cell.html#mindspore.nn.Cell.offload) to offload activations within a specific `Cell` class from the device side to the host side during the forward training phase, and prefetch activations from the host side to the device side during the backward training phase. - -#### Parallel - -- [STABLE] Parallel pdb debugging, dynamic and static graph mode are supported. dynamic graph mode is recommended. -- [STABLE] New API [mindspore.communication.get_comm_name](https://www.mindspore.cn/docs/en/master/api_python/communication/mindspore.communication.get_comm_name.html), which allows users to query the name of the underlying communicator of the HCCL collection communication library. 
-- [STABLE] Added [AutoParallel](https://www.mindspore.cn/docs/en/master/api_python/parallel/mindspore.parallel.auto_parallel.AutoParallel.html) API to support parallel configuration of individual networks, solving the problem of excessive scope of parallel configuration. -- [STABLE] SeqPipe now supports two new scheduling methods, seqvpp and seqsmartvpp, significantly reducing the memory cost in scenarios where SeqPipe is combined with VPP. -- [STABLE] Static graph now supports zero2/zero3 level memory optimization, reducing the memory cost for models that require pure data parallel (DP) training. -- [STABLE] Static graph now supports 1b1f compute and communication overlapping in pipeline parallelism conditions, enhancing the performance of pipeline parallelism. -- [STABLE] Static graphs support grad model parallel communication overlap with dw computation under tensor model parallelism and expert model parallelism, improving model training performance. -- [STABLE] Static graph auto-parallel strategy propagation mode is updated to prioritize the layout propagation to improve the accuracy. -- [STABLE] Static graph auto-parallel support using [mindspore.parallel.shard](https://www.mindspore.cn/docs/en/master/api_python/parallel/mindspore.parallel.shard.html) interface to configure strategies for mint operators, optimized for multi-input operators. -- [STABLE] For LLM reinforcement learning, now we support DP/MP/PP for training and inferenceing phase. -- [STABLE] MindSpore supports users to query whether the distributed module is available and whether the communication module is initialized. 
Users can query whether the distributed module is available through the [mint.distributed.is_available](https://www.mindspore.cn/docs/en/master/api_python/mint/mindspore.mint.distributed.is_available.html) interface, and query whether the communication module is initialized through the [mint.distributed.is_initialized](https://www.mindspore.cn/docs/en/master/api_python/mint/mindspore.mint.distributed.is_initialized.html) interface. -- [STABLE] MindSpore static graph mode supports the `AlltoAllV` forward and reverse operators. Users can use this operator through the [ops.AlltoAllV](https://www.mindspore.cn/docs/en/master/api_python/ops/mindspore.ops.AlltoAllV.html) interface. -- [STABLE] Support CPU operators [mindspore.mint.distributed.allreduce](https://www.mindspore.cn/docs/en/master/api_python/mint/mindspore.mint.distributed.all_reduce.html#mindspore.mint.distributed.all_reduce), [mindspore.mint.distributed.barrier](https://www.mindspore.cn/docs/en/master/api_python/mint/mindspore.mint.distributed.barrier.html#mindspore.mint.distributed.barrier), [mindspore.mint.distributed.send](https://www.mindspore.cn/docs/en/master/api_python/mint/mindspore.mint.distributed.send.html#mindspore.mint.distributed.send), and [mindspore.mint.distributed.recv](https://www.mindspore.cn/docs/en/master/api_python/mint/mindspore.mint.distributed.recv.html#mindspore.mint.distributed.recv), and the users can use the corresponding aggregate communication operator functions through these interfaces. - -#### Inference - -- [STABLE] Support full-precision inference with BFloat16 and quantized inference with W8A8 for DeepSeek-V3/R1. Add or optimize 12 fusion operators including RmsNormQuant, MatMul+Sigmoid+Add, and Transpose+BatchMatMul+Transpose to enhance the inference performance of DeepSeek-V3/R1. -- [BETA] Support deploying inference services of DeepSeek-V3/R1 using MindIE and MindSpore Transformers large model development suite. 
-- [STABLE] Optimize the process of loading safetensors and realize on-demand initialization of GE, which reduces both memory usage and startup time when deploying inference services using MindIE and MindSpore Transformers large model suite. -- [BETA] Support deploying inference services of DeepSeek-V3/R1 and Qwen2.5 using the [vLLM-MindSpore](https://gitee.com/mindspore/vllm-mindspore) plugin and vLLM v0.6.6.post1. - -#### Profiler - -- [STABLE] The MindSpore framework supports obtaining communication domain parallel strategy information, which can be visualized to improve performance troubleshooting efficiency in cluster scenarios. -- [STABLE] MindSpore Profiler dynamic profiling supports lightweight instrumentation, allowing users to dynamically enable lightweight tracing and view performance data in real time. -- [STABLE] MindSpore Profiler's lightweight instrumentation capability has been enhanced, supporting key phases such as dataloader and save checkpoint with lightweight tracing information. -- [STABLE] Profiler supports viewing memory_access related aicore metric information. -- [STABLE] MindSpore Profiler supports [mindspore.profiler.profile](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.profiler.profile.html) and [_ExperimentalConfig](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.profiler._ExperimentalConfig.html), as well as the [tensorboard_trace_handler](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.profiler.tensorboard_trace_handler.html) parameter, improving tool usability. -- [STABLE] MindSpore Profiler dynamic profiling now supports memory data collection, allowing users to dynamically enable memory data gathering to enhance tool usability. - -#### Compiler - -- [BETA] The graph mode supports the inplace and view operator forward expression capabilities. +- [STABLE] MindSpore Lite supports configuring operator parallel inference acceleration during model conversion. 
You only need to configure the stream_label_file option during model conversion to specify the operators that need parallel inference. +- [STABLE] MindSpore Lite supports the conversion of onnx if operators in the Ascend backend. ### API Change -#### New APIs & Enhanced APIs - -- [DEMO] [mindspore.mint](https://www.mindspore.cn/docs/en/master/api_python/mindspore.mint.html) API provides more functional, nn interfaces. The mint interface is currently an experimental interface and performs better than ops in `jit_level="O0"` and pynative mode. Currently, the graph sinking mode and CPU/GPU backend are not supported, and it will be gradually improved in the future. - - | mindspore.mint | - | :------------------------------ | - | mindspore.mint.reshape | - | mindspore.mint.triangular_solve | - | mindspore.mint.index_add | - | mindspore.mint.logaddexp2 | - | mindspore.mint.diag | - - | mindspore.mint.nn | - | :----------------------------- | - | mindspore.mint.nn.Sigmoid | - | mindspore.mint.nn.Conv2d | - | mindspore.mint.nn.PixelShuffle | - - | mindspore.mint.nn.functional | - | :----------------------------------------------- | - | mindspore.mint.nn.functional.adaptive_avg_pool3d | - | mindspore.mint.nn.functional.conv2d | - | mindspore.mint.nn.functional.avg_pool3d | - | mindspore.mint.nn.functional.elu_ | - | mindspore.mint.nn.functional.pixel_shuffle | - - | others | - | ------------------------ | - | mindspore.mint.optim.SGD | - | mindspore.mint.linalg.qr | - -- [STABLE] [mindspore.mint](https://www.mindspore.cn/docs/en/master/api_python/mindspore.mint.html) API also provides some new stable interfaces. Besides, some demo interfaces are changed into stable ones. 
- - | mindspore.mint | - | :----------------------- | - | mindspore.mint.full_like | - | mindspore.mint.log2 | - | mindspore.mint.isneginf | - - | mindspore.mint.nn | - | :-------------------------- | - | mindspore.mint.nn.GLU | - | mindspore.mint.nn.KLDivLoss | - - | mindspore.mint.nn.functional | - | :---------------------------------- | - | mindspore.mint.nn.functional.glu | - | mindspore.mint.nn.functional.kl_div | - - | mindspore.Tensor | - | :------------------------ | - | mindspore.Tensor.isneginf | - | mindspore.Tensor.log2 | - -- [DEMO] [mindspore.Tensor](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.Tensor.html#mindspore.Tensor) API provides more Tensor methods. Currently, these Tensor methods are experimental interfaces and currently does not support the graph sink mode and CPU, GPU backend, and they will be gradually improved in the future. Details can be found in [API list](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.Tensor.html#mindspore.Tensor) in official website. -- [STABLE] [mindspore.ops](https://www.mindspore.cn/docs/en/master/api_python/mindspore.ops.html) provides two inference API [mindspore.ops.moe_token_permute](https://www.mindspore.cn/docs/en/master/api_python/ops/mindspore.ops.moe_token_permute.html#mindspore.ops.moe_token_permute) and [mindspore.ops.moe_token_unpermute](https://www.mindspore.cn/docs/en/master/api_python/ops/mindspore.ops.moe_token_unpermute.html#mindspore.ops.moe_token_unpermute). Currently, only Ascend backend is supported. -- [STABLE] [mindspore.mint.nn.functional.gelu](https://www.mindspore.cn/docs/en/master/api_python/mint/mindspore.mint.nn.functional.gelu.html) and [mindspore.mint.nn.GeLU](https://www.mindspore.cn/docs/en/master/api_python/mint/mindspore.mint.nn.GELU.html) now support input argument "approximate". 
-- [STABLE] Added the offline parsing interface [mindspore.profiler.profiler.analyse](https://gitee.com/link?target=https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.profiler.profiler.analyse.html). - -#### Backwards Incompatible Change - -- For [mindspore.ops.Xlogy](https://www.mindspore.cn/docs/en/master/api_python/ops/mindspore.ops.Xlogy.html), the arguments `input` and `other` no longer support non-tensor input. [(!81625) - ](https://gitee.com/mindspore/mindspore/pulls/81625) - - - - - - - - - -
2.5.0 2.6.0
-  ops.Xlogy(input [Tensor, numbers.Number, bool],
-            other [Tensor, numbers.Number, bool])
-  
-  ops.Xlogy(input [Tensor], other [Tensor])
-  
- -- `&` operator no longer supports the input Tensor with data type of uint32/uint64 on Ascend backend in PyNative mode. `^` operator no longer supports the input Tensor with data type of uint16/uint32/uint64 on Ascend backend in PyNative mode. `|` operator no longer supports the input Tensor with data type of uint16/uint32/uint64 on Ascend backend in PyNative mode at the scene of `tensor | scalar`. [(!82054)](https://gitee.com/mindspore/mindspore/pulls/82054) -- `%` operator no longer supports the input Tensor with data type of uint16/uint32/uint64 on CPU and GPU backend. [(!83055)](https://gitee.com/mindspore/mindspore/pulls/83055) -- [mindspore.jit](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.jit.html) interface parameter change。[(!80248)](https://gitee.com/mindspore/mindspore/pulls/80248) - - The name of parameter `fn` is changed to `function` . - - Remove parameter `mode` , `input_signature` , `hash_args` , `jit_config` and `compile_once` . - - Add parameter `capture_mode` to set how to compile to MindSpore graph. - - - - - - - - - -
2.5.0 2.6.0
-    >>> import numpy as np
-    >>> from mindspore import Tensor, jit
-    >>>
-    >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>>
-    >>> @jit(mode="PIJit")
-    ... def tensor_add_with_dec(x, y):
-    ...     z = x + y
-    ...     return z
-    ...
-    >>> out = tensor_add_with_dec(x, y)
-    
-
-    >>> import numpy as np
-    >>> from mindspore import Tensor, jit
-    >>>
-    >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>>
-    >>> @jit(capture_mode="bytecode")
-    ... def tensor_add_with_dec(x, y):
-    ...     z = x + y
-    ...     return z
-    ...
-    >>> out = tensor_add_with_dec(x, y)
-    
-
- - Add parameter `jit_level` to set the level of compilation optimization. - - - - - - - - - -
2.5.0 2.6.0
-    >>> import numpy as np
-    >>> from mindspore import Tensor, jit, JitConfig
-    >>>
-    >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>>
-    >>> @jit(jit_config=JitConfig(jit_level="O0"))
-    ... def tensor_add_with_dec(x, y):
-    ...     z = x + y
-    ...     return z
-    ...
-    >>> out = tensor_add_with_dec(x, y)
-    
-
-    >>> import numpy as np
-    >>> from mindspore import Tensor, jit
-    >>>
-    >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>>
-    >>> @jit(jit_level="O0")
-    ... def tensor_add_with_dec(x, y):
-    ...     z = x + y
-    ...     return z
-    ...
-    >>> out = tensor_add_with_dec(x, y)
-    
-
- - Add parameter `dynamic` to set whether dynamic shape compilation should be performed. - - - - - - - - - -
2.5.0 2.6.0
-    >>> import numpy as np
-    >>> from mindspore import Tensor, jit
-    >>>
-    >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>>
-    >>> @jit
-    ... def tensor_add_with_dec(x, y):
-    ...     z = x + y
-    ...     return z
-    ...
-    >>> out = tensor_add_with_dec(x, y)
-    
-
-    >>> import numpy as np
-    >>> from mindspore import Tensor, jit
-    >>>
-    >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>>
-    >>> @jit(dynamic=1)
-    ... def tensor_add_with_dec(x, y):
-    ...     z = x + y
-    ...     return z
-    ...
-    >>> out = tensor_add_with_dec(x, y)
-    
-
- - Add parameter `fullgraph` to set whether to capture the entire function into graph. - - - - - - - - - -
2.5.0 2.6.0
-    >>> import numpy as np
-    >>> from mindspore import Tensor, jit, JitConfig
-    >>>
-    >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>>
-    >>> @jit(jit_config=JitConfig(jit_syntax_level="STRICT"))
-    ... def tensor_add_with_dec(x, y):
-    ...     z = x + y
-    ...     return z
-    ...
-    >>> out = tensor_add_with_dec(x, y)
-    
-
-    >>> import numpy as np
-    >>> from mindspore import Tensor, jit
-    >>>
-    >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>>
-    >>> @jit(fullgraph=True)
-    ... def tensor_add_with_dec(x, y):
-    ...     z = x + y
-    ...     return z
-    ...
-    >>> out = tensor_add_with_dec(x, y)
-    
-
- - Add parameter `backend` to set the compilation backend to be used. - - - - - - - - - -
2.5.0 2.6.0
-    >>> import numpy as np
-    >>> from mindspore import Tensor, jit
-    >>>
-    >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>>
-    >>> @jit
-    ... def tensor_add_with_dec(x, y):
-    ...     z = x + y
-    ...     return z
-    ...
-    >>> out = tensor_add_with_dec(x, y)
-    
-
-    >>> import numpy as np
-    >>> from mindspore import Tensor, jit
-    >>>
-    >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>>
-    >>> @jit(backend="ms_backend")
-    ... def tensor_add_with_dec(x, y):
-    ...     z = x + y
-    ...     return z
-    ...
-    >>> out = tensor_add_with_dec(x, y)
-    
-
- - Add parameter `options` to set the dictionary of options to pass to the compilation backend. - - - - - - - - - -
2.5.0 2.6.0
-    >>> import numpy as np
-    >>> from mindspore import Tensor, jit, JitConfig
-    >>>
-    >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>>
-    >>> @jit(jit_config=JitConfig(infer_boost="on"))
-    ... def tensor_add_with_dec(x, y):
-    ...     z = x + y
-    ...     return z
-    ...
-    >>> out = tensor_add_with_dec(x, y)
-    
-
-    >>> import numpy as np
-    >>> from mindspore import Tensor, jit
-    >>>
-    >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-    >>>
-    >>> @jit(infer_boost="on")
-    ... def tensor_add_with_dec(x, y):
-    ...     z = x + y
-    ...     return z
-    ...
-    >>> out = tensor_add_with_dec(x, y)
-    
-
- -- The `mindspore.profiler.tensor_board_trace_handler` interface change. - - The `mindspore.profiler.tensor_board_trace_handler` interface is now renamed to [mindspore.profiler.tensorboard_trace_handler](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.profiler.tensorboard_trace_handler.html). - - - - - - - - - -
2.5.0 2.6.0
-  >>> from mindspore.profiler import tensor_board_trace_handler
-  
-
-  >>> from mindspore.profiler import tensorboard_trace_handler
-  
-
- -- The `mindspore.set_context` interface change. - - The `exception_dump` field in the `ascend_config` parameter was changed to the `"dump"` field in [device_context.ascend.op_debug.aclinit_config](https://www.mindspore.cn/docs/en/master/api_python/device_context/mindspore.device_context.ascend.op_debug.aclinit_config.html). - - - - - - - - - -
2.5.0 2.6.0
-  >>> import mindspore as ms
-  >>> ms.set_context(ascend_config = {"exception_dump": "2"})
-  
-
-  >>> import mindspore as ms
-  >>> ms.device_context.ascend.op_debug.aclinit_config({"dump": {"dump_scene": "lite_exception"}})
-  
-
- -- The printed content of `mindspore.Tensor` changes. - - Previously, printing a Tensor showed only its value; it now also shows key information such as shape and dtype. - - - - - - - - - -
2.5.0 2.6.0
-  >>> import mindspore as ms
-  >>> tensor = ms.Tensor([1,1,1], dtype=ms.float32)
-  >>> print(tensor)
-  [1. 1. 1.]
-  
-
-  >>> import mindspore as ms
-  >>> tensor = ms.Tensor([1,1,1], dtype=ms.float32)
-  >>> print(tensor)
-  Tensor(shape=[3], dtype=Float32, value= [ 1.00000000e+00,  1.00000000e+00,  1.00000000e+00])
-  
-
- -- In graph mode, Ascend backend, when jit_level is O2, the Dump interface changes. - - In the graph Ascend backend jit_level O2 scenario, the environment variables `MINDSPORE_DUMP_CONFIG` and `ENABLE_MS_GE_DUMP` have been deprecated, and the dump-related functions have been migrated to the msprobe tool. For more details, please refer to [msprobe Tool MindSpore Scene Accuracy Data Collection Guide](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md). +- [STABLE] In the acl model conversion configuration, a new stream_label_file option is added under the ascend_context option to enable multi-stream parallel inference. ### Contributors -amyMaYun,Ava,baishanyang,br_fix_save_strategy_ckpt,caifubi,caoxubo,cccc1111,ccsszz,chaijinwei,chaiyouheng,changzherui,chengbin,chopupu,chujinjin,congcong,dairenjie,DavidFFFan,DeshiChen,dingjinshan,fary86,fengqiang,fengyixing,ffmh,fuhouyu,Gallium,gaoshuanglong,gaoyong10,geyuhong,guoyq16,guoyuzhe,GuoZhibin,guozhijian,gupengcheng0401,hangq,Hanshize,haozhang,hedongdong,hhz886,HighCloud,horcham,huangbingjian,huangxiang360729,huangzhichao2023,huangzhuo,huangziling,huda,Huilan Li,hujiahui8,huoxinyou,jiangchao_j,jiangchenglin3,jiangshanfeng,jiaorui,jiaxueyu,jizewei,jjfeing,JoeyLin,jshawjc,kairui_kou,kakyo82,kisnwang,leida,lianghongrui,LiangZhibo,LiaoTao_Wave,lichen,limingqi107,LiNuohang,linux,litingyu,liubuyu,liuchuting,liuluobin,liuyanwei,LLLRT,looop5,luochao60,luojianing,luoxuewei,luoyang,lyk,maoyuanpeng1,Margaret_wangrui,mengxian,MengXiangyu,mylinchi,NaCN,panzhihui,pengqi,PingqiLi,pipecat,qiuyufeng,qiuzhongya,qiwenlun,r1chardf1d0,rachel0858,rainyhorse,Rudy_tan,shaoshengqi,shen_haochen,shenhaojing,shenwei41,shiro-zzz,shuqian0,stavewu,TAJh,tanghuikang,tangmengcheng,tongdabao,TuDouNi,VectorSL,wang_ziqi,wangjie,wangliaohui97,wangpingan,wangyibo,weiyang,wja,wudongkun,wujueying,wuweikang,wwwbby,xfan233,XianglongZeng,xiaopeng,xiaotianci,xiaoyao,xiedejin1,XinDu,xuxinglei,xuzhen,xuzixiang,yang 
guodong,yangben,yanghaoran,yangruoqi713,yangzhenzhang,yanx,Yanzhi_YI,yao_yf,yide12,yihangchen,YijieChen,yonibaehr,Youmi,yuanqi,yuchaojie,yuezenglin,Yuheng Wang,YuJianfeng,YukioZzz,ZeyuHan,zhaiyukun,Zhang QR,zhangbuxue,zhangdanyang,zhangshucheng,zhangyinxia,ZhangZGC,zhangzhen,zhengzuohe,zhouyaqiang0,zichun_ye,zlq2020,zong_shuai,ZPaC,zyli2020,舛扬,范吉斌,冯一航,胡犇,胡彬,宦晓玲,简云超,李栋,李良灿,李林杰,李寅杰3,刘思铭,刘勇琪,刘子涵,梅飞要,任新,十一雷,孙昊辰,王泓皓,王禹程,王振邦,熊攀,俞涵,虞良斌,云骑士,张栩浩,赵文璇,周一航 +熊攀,ZhangZGC,yanghaoran,李林杰,shenwei41,xiaotianci,panzhihui,guozhijian,胡彬,tangmengcheng,XianglongZeng,cccc1111,stavewu,刘思铭,r1chardf1d0,jiangshanfeng -## MindSpore Lite 2.6.0 Release Notes +## MindSpore Lite 2.3.1 Release Notes ### Major Features and Improvements -- [STABLE] MindSpore Lite supports configuring operator parallel inference acceleration during model conversion. You only need to configure the stream_label_file option during model conversion to specify the operators that need parallel inference. -- [STABLE] MindSpore Lite supports the conversion of onnx if operators in the Ascend backend. +When converting Ascend backend models, the [input_shape](https://www.mindspore.cn/lite/docs/en/r2.3.1/use/cloud_infer/converter_tool_ascend.html) parameter in the configuration file is supported to specify the input size. ### API Change -- [STABLE] In the acl model conversion configuration, a new stream_label_file option is added under the ascend_context option to enable multi-stream parallel inference. +- [ModelGroup](https://www.mindspore.cn/lite/docs/en/r2.3.1/use/cloud_infer/runtime_cpp.html) interface adds model weight sharing support to save video memory. + +- [Model.get_model_info](https://www.mindspore.cn/lite/docs/en/r2.3.1/use/converter_tool.html?highlight=get_model_info) interface adds support for obtaining the input size of the model. 
### Contributors -熊攀,ZhangZGC,yanghaoran,李林杰,shenwei41,xiaotianci,panzhihui,guozhijian,胡彬,tangmengcheng,XianglongZeng,cccc1111,stavewu,刘思铭,r1chardf1d0,jiangshanfeng +熊攀;ZhangZGC;jxl;zhangyanhui;emmmmtang;huandong1;yefeng -## MindSpore 2.5.0 Release Notes +## MindSpore Lite 2.3.0-rc2 Release Notes ### Major Features and Improvements -#### Distributed Startup Component msrun +- [STABLE] Support the configuration of FlashAttention related properties in the configuration file used by the cloud-side conversion tool. +- [STABLE] Support multi-devices memory sharing. -- [STABLE] msrun supports passing in the hostname of the node (e.g. localhost) as `-master_addr`, which improves the ease of use of msrun. -- [STABLE] msrun supports printing training logs to standard output. Users can control which ranks to print with the `-tail_worker_log` parameter. -- [STABLE] After setting `export VLOG_v=12500`, the `scheduler` log can output cluster information, which helps users to quickly count cluster data. -- [STABLE] msrun supports formatting the log file name with the `--worker_log_name` parameter to help users quickly locate the problem node. +### Contributors -For details, refer to [msrun Launching](https://www.mindspore.cn/docs/en/r2.5.0/model_train/parallel/msrun_launcher.html). +Thanks goes to these wonderful people: -#### Profiler +emmmmtang,熊攀 -- [STABLE] New interfaces [mindspore.profiler.schedule](https://www.mindspore.cn/docs/en/r2.5.0/api_python/mindspore/mindspore.profiler.schedule.html) and [mindspore.profiler.tensor_board_trace_handler](https://www.mindspore.cn/docs/en/r2.5.0/api_python/mindspore/mindspore.profiler.tensor_board_trace_handler.html) are added to support the acquisition and rendering of PyNative scenarios by step, which improves the ease of use of PyNative scenarios. -- [STABLE] Dynamic Profiling supports customized for loops to enhance the ease of use of dynamic graphical scenarios. 
-- [STABLE] Profiler initialization parameters and deliverables directory structure aligned to PTA to reduce user migration difficulty. -- [STABLE] A new lightweight interface, [mindspore.profiler.mstx](https://www.mindspore.cn/docs/en/r2.5.0/api_python/mindspore/mindspore.profiler.mstx.html), has been added to provide users with a low-overhead performance data collection method. -- [STABLE] Timeline supports displaying hardware utilization data to help users locate downclocking issues. +Contributions of any kind are welcome! -For details, refer to [Ascend Performance Tuning](https://www.mindspore.cn/docs/en/r2.5.0/model_train/optimize/profiler.html). +## MindSpore Lite 2.2.11 Release Notes -#### PyNative +### Bug Fixes -- [Beta] PyNative mode supports the inplace operator process, introduces the inplace operators. Taking the [mindspore.mint.nn.functional.relu](https://www.mindspore.cn/docs/en/r2.5.0/api_python/mint/mindspore.mint.nn.functional.relu.html) as an example, if you want to use the inplace updated version of the relu operator, you can call the [mindspore.mint.nn.functional.relu_](https://www.mindspore.cn/docs/en/r2.5.0/api_python/mint/mindspore.mint.nn.functional.relu_.html). -- [STABLE] Enable the environment variable MS_SIMULATION_LEVEL=1 to enable the PyNative dryrun, the multi-device process can be simulated without occupying the device, and the display memory usage can be viewed in logs. For details, refer to [Environment Variables](https://www.mindspore.cn/docs/en/r2.5.0/api_python/env_var_list.html?highlight=ms_simulation_level#%E5%88%86%E5%B8%83%E5%BC%8F%E5%B9%B6%E8%A1%8C)。 +- [#I8TPLY] Fixed SSD MobileNetV2 FPN network inference error on Atlas inference series products. 
-#### FrontEnd +### Contributors -- [STABLE] Added [mindspore.nn.utils.no_init_parameters](https://www.mindspore.cn/docs/en/r2.5.0/api_python/nn/mindspore.nn.utils.no_init_parameters.html) API, which supports delayed initialization of network parameters and reduces model startup time in inference scenarios. +Thanks goes to these wonderful people: -### API Change +wangtongyu6, zhuguodong, 徐永飞, 徐安越, yeyunpeng2020, moran, XinDu, gengdongjie. -#### New APIs - -- [DEMO] [mindspore.mint](https://www.mindspore.cn/docs/en/r2.5.0/api_python/mindspore.mint.html) API provides more functional, nn interfaces. The mint interface is currently an experimental interface and performs better than ops in O0/O1 and pynative mode. Currently, the O2 compilation mode (graph sinking mode) and CPU/GPU backend are not supported, and it will be gradually improved in the future. - - | mindspore.mint | | | | - | :-------------------------- | :------------------------ | :-------------------------------- | :------------------------- | - | mindspore.mint.bernoulli | mindspore.mint.bincount | mindspore.mint.clone | mindspore.mint.einsum | - | mindspore.mint.empty | mindspore.mint.empty_like | mindspore.mint.full_like | mindspore.mint.randint | - | mindspore.mint.randint_like | mindspore.mint.randn | mindspore.mint.randn_like | mindspore.mint.randperm | - | mindspore.mint.chunk | mindspore.mint.concat | mindspore.mint.count_nonzero | mindspore.mint.scatter | - | mindspore.mint.select | mindspore.mint.squeeze | mindspore.mint.swapaxes | mindspore.mint.transpose | - | mindspore.mint.triu | mindspore.mint.unbind | mindspore.mint.unique_consecutive | mindspore.mint.multinomial | - | mindspore.mint.addmv | mindspore.mint.diff | mindspore.mint.exp2 | mindspore.mint.float_power | - | mindspore.mint.fix | mindspore.mint.fmod | mindspore.mint.frac | mindspore.mint.lerp | - | mindspore.mint.log2 | mindspore.mint.log10 | mindspore.mint.logaddexp | mindspore.mint.mv | - | mindspore.mint.nansum | 
mindspore.mint.nan_to_num | mindspore.mint.polar | mindspore.mint.ravel | - | mindspore.mint.outer | mindspore.mint.softmax | mindspore.mint.t | mindspore.mint.cdist | - | mindspore.mint.amax | mindspore.mint.amin | mindspore.mint.cumprod | mindspore.mint.histc | - | mindspore.mint.logsumexp | mindspore.mint.norm | mindspore.mint.std | mindspore.mint.std_mean | - | mindspore.mint.var | mindspore.mint.var_mean | mindspore.mint.allclose | mindspore.mint.argsort | - | mindspore.mint.equal | mindspore.mint.isinf | mindspore.mint.isneginf | mindspore.mint.not_equal | - | mindspore.mint.addbmm | mindspore.mint.addmm | mindspore.mint.baddbmm | mindspore.mint.dot | - | mindspore.mint.meshgrid | mindspore.mint.mm | | | - - | mindspore.mint.nn | | - | :---------------------------------- | ---------------------------------- | - | mindspore.mint.nn.Conv3d | mindspore.mint.nn.ConstantPad1d | - | mindspore.mint.nn.ConvTranspose2d | mindspore.mint.nn.ConstantPad2d | - | mindspore.mint.nn.BatchNorm1d | mindspore.mint.nn.ConstantPad3d | - | mindspore.mint.nn.BatchNorm2d | mindspore.mint.nn.ReflectionPad1d | - | mindspore.mint.nn.BatchNorm3d | mindspore.mint.nn.ReflectionPad2d | - | mindspore.mint.nn.LayerNorm | mindspore.mint.nn.ReflectionPad3d | - | mindspore.mint.nn.SyncBatchNorm | mindspore.mint.nn.ReplicationPad1d | - | mindspore.mint.nn.ELU | mindspore.mint.nn.ZeroPad1d | - | mindspore.mint.nn.GELU | mindspore.mint.nn.ZeroPad2d | - | mindspore.mint.nn.LogSigmoid | mindspore.mint.nn.ZeroPad3d | - | mindspore.mint.nn.ReLU6 | mindspore.mint.nn.BCELoss | - | mindspore.mint.nn.SiLU | mindspore.mint.nn.CrossEntropyLoss | - | mindspore.mint.nn.Tanh | mindspore.mint.nn.NLLLoss | - | mindspore.mint.nn.Embedding | mindspore.mint.nn.SmoothL1Loss | - | mindspore.mint.nn.Dropout2d | mindspore.mint.nn.Upsample | - | mindspore.mint.nn.AdaptiveAvgPool1d | mindspore.mint.nn.MaxUnpool2d | - | mindspore.mint.nn.AdaptiveAvgPool2d | | - - | mindspore.mint.nn.functional | - | 
:----------------------------------------------- | - | mindspore.mint.nn.functional.adaptive_avg_pool1d | - | mindspore.mint.nn.functional.adaptive_avg_pool2d | - | mindspore.mint.nn.functional.avg_pool1d | - | mindspore.mint.nn.functional.max_unpool2d | - | mindspore.mint.nn.functional.logsigmoid | - | mindspore.mint.nn.functional.relu6 | - | mindspore.mint.nn.functional.relu_ | - | mindspore.mint.nn.functional.normalize | - | mindspore.mint.nn.functional.dropout2d | - | mindspore.mint.nn.functional.nll_loss | - | mindspore.mint.nn.functional.smooth_l1_loss | - | mindspore.mint.nn.functional.interpolate | - | mindspore.mint.nn.functional.conv3d | - - | mindspore.mint.distributed | | - | ------------------------------------------------- | -------------------------------------------------- | - | mindspore.mint.distributed.all_gather | mindspore.mint.distributed.get_global_rank | - | mindspore.mint.distributed.all_gather_into_tensor | mindspore.mint.distributed.get_group_rank | - | mindspore.mint.distributed.all_gather_object | mindspore.mint.distributed.get_process_group_ranks | - | mindspore.mint.distributed.all_reduce | mindspore.mint.distributed.init_process_group | - | mindspore.mint.distributed.all_to_all | mindspore.mint.distributed.irecv | - | mindspore.mint.distributed.all_to_all_single | mindspore.mint.distributed.isend | - | mindspore.mint.distributed.barrier | mindspore.mint.distributed.new_group | - | mindspore.mint.distributed.batch_isend_irecv | mindspore.mint.distributed.P2POp | - | mindspore.mint.distributed.broadcast | mindspore.mint.distributed.recv | - | mindspore.mint.distributed.broadcast_object_list | mindspore.mint.distributed.reduce | - | mindspore.mint.distributed.gather | mindspore.mint.distributed.reduce_scatter | - | mindspore.mint.distributed.gather_object | mindspore.mint.distributed.reduce_scatter_tensor | - | mindspore.mint.distributed.get_backend | mindspore.mint.distributed.scatter | - | 
mindspore.mint.distributed.scatter_object_list | mindspore.mint.distributed.send | - - | others | - | --------------------------------- | - | mindspore.mint.optim.Adam | - | mindspore.mint.linalg.matrix_norm | - | mindspore.mint.linalg.norm | - | mindspore.mint.linalg.vector_norm | - | mindspore.mint.special.exp2 | - -- [STABLE] Two inference API [mindspore.ops.incre_flash_attention](https://www.mindspore.cn/docs/en/r2.5.0/api_python/ops/mindspore.ops.incre_flash_attention.html) and [mindspore.ops.prompt_flash_attention](https://www.mindspore.cn/docs/en/r2.5.0/api_python/ops/mindspore.ops.prompt_flash_attention.html) are added. Currently, only Ascend backend is supported. -- [STABLE] [mindspore.runtime](https://www.mindspore.cn/docs/en/r2.5.0/api_python/mindspore.runtime.html) replaces the original mindspore.hal interfaces and provides interfaces related to runtime resources such as stream, memory, and event. -- [STABLE] [mindspore.device_context](https://www.mindspore.cn/docs/en/r2.5.0/api_python/mindspore.device_context.html) replaces some parameters of the original set_context interface and provides setting interfaces related to hardware platform. -- [DEMO] [mindspore.Tensor](https://www.mindspore.cn/docs/en/r2.5.0/api_python/mindspore/mindspore.Tensor.html#mindspore.Tensor) API provides more Tensor methods. Currently the Tensor interfaces are still the experimental interfaces, and do not support the graph sink mode and CPU, GPU back-end, will be gradually improved. In addition, a large number of existing Tensor methods, including operators like +=, -=, *= and /=, have been adapted with Aclnn kernels on the Ascend backend through overloading. Details can be found in [API list](https://www.mindspore.cn/docs/en/r2.5.0/api_python/mindspore/mindspore.Tensor.html#mindspore.Tensor) in official website. 
- -#### Backwards Incompatible Change - -- For API [mindspore.Tensor.new_ones](https://gitee.com/link?target=https://www.mindspore.cn/docs/en/r2.5.0/api_python/mindspore/Tensor/mindspore.Tensor.new_zeros.html), the input argument "size" no longer supports the data type of Tensor. -- mindspore.Profiler removes timeline_limit, rank_id, analyze_only, env_enable parameters. - -- Interface name: mindspore.Profiler - - Changes: profile_communication is deprecated and the communication matrix data is collected by setting profiler_level=ProfilerLevel.Level1 or profiler_level=ProfilerLevel.Level2. - - Clarification: The default value of profiler_level is ProfilerLevel.Level0. - - - - - - - - - -
original interface v2.5.0 interface
-  Profiler(profile_communication=True)
-  
-
-  Profiler(profiler_level=ProfilerLevel.Level1) or Profiler(profiler_level=ProfilerLevel.Level2)
-  
-
- -- Interface name: mindspore.Profiler - - Changes: op_time is deprecated; collect NPU-side operator performance data by setting activities=[mindspore.profiler.ProfilerActivity.NPU]. - - Clarification: The activities parameter is a list; as long as it contains mindspore.profiler.ProfilerActivity.NPU, collection of NPU-side operator performance data is enabled. Collection is enabled by default. - - - - - - - - - -
original interface v2.5.0 interface
-  Profiler(op_time=True)
-  
-
-  Profiler(activities=[mindspore.profiler.ProfilerActivity.NPU])
-  
-
- -- Interface name: mindspore.Profiler - - Changes: The type of aicore_metrics changed from int to a mindspore.profiler.AicoreMetrics enum value. - - Clarification: The default value of aicore_metrics is mindspore.profiler.AicoreMetrics.AiCoreNone. - - - - - - - - - -
original interface v2.5.0 interface
-  Profiler(aicore_metrics=0)
-  
-
-  Profiler(aicore_metrics=mindspore.profiler.AicoreMetrics.AiCoreNone)
-  
-
- -- Interface name: mindspore.Profiler - - Changes: profile_framework is deprecated; collect framework performance data by setting activities=[mindspore.profiler.ProfilerActivity.CPU]. - - Clarification: The activities parameter is a list; as long as it contains mindspore.profiler.ProfilerActivity.CPU, collection of framework performance data is enabled. Collection is enabled by default. - - - - - - - - - -
original interface v2.5.0 interface
-  Profiler(profile_framework="all")
-  
-
-  Profiler(activities=[mindspore.profiler.ProfilerActivity.CPU])
-  
-
+Contributions of any kind are welcome! -### Contributors +## MindSpore Lite 2.2.10 Release Notes -baishanyang ,bantao ,Bellatan ,biangelin ,BigSkySea ,caifubi ,candanzg ,candyhong ,Carey ,cccc1111 ,chaijinwei ,changzherui ,chengbin ,chengfeng27 ,chengxb7532 ,chujinjin ,coder2237 ,czrz ,dairenjie ,DavidFFFan ,DeshiChen ,dingjinshan ,ehaleva ,Erpim ,fary86 ,fengyixing ,ffmh ,fuchao ,fuhouyu ,gaoyong10 ,geyuhong ,guoyuzhe ,GuoZhibin ,guozhijian ,halo ,hangq ,haozhang ,hedongdong ,hehongzhe ,hhz886 ,HighCloud ,huangbingjian ,HuangLe02 ,huangziling ,huda ,Huilan Li ,hujiahui8 ,jiahaochen666 ,jiangchao_j ,jiangchenglin3 ,jiangshanfeng ,jiaorui ,jiaxueyu ,jizewei ,jjfeing ,JoeyLin ,jshawjc ,kakyo82 ,kingxian ,kisnwang ,leida ,liangchenghui ,lianghongrui ,LiangZhibo ,lichen ,limingqi107 ,LINH ,linux ,lionelchang ,lishanni ,liubuyu ,liujunzhu ,liuluobin ,liuxu ,liuyanwei ,liyan2022 ,LLLRT ,looop5 ,luochao60 ,luoxuewei ,luoyang ,lyk ,machenggui ,maoyuanpeng1 ,Margaret_wangrui ,master,mengxian ,MengXiangyu ,mengyuanli ,Mrtutu ,mylinchi ,NaCN ,Nikanuo ,niujunhao ,panzhihui ,pengqi ,PingqiLi ,pipecat ,qiuleilei ,qiuyufeng ,qiuzhongya ,r1chardf1d0 ,shaoshengqi ,shen_haochen ,shenhaojing ,shenwei41 ,shilishan ,shiro-zzz ,shuqian0 ,St.Universe ,stavewu ,superxf ,suteng ,TAJh ,tanghuikang ,tangmengcheng ,tan-wei-cheng ,tianxiaodong ,TuDouNi ,TYWZ22259 ,user_0145 ,VectorSL ,vincen45 ,wang_ziqi ,wangshaocong ,wangwensheng4 ,weiyang ,wtcheng ,wtobill ,wujiangming ,wujueying ,wuweikang ,wwwbby ,XianglongZeng ,xiaopeng ,xiaotianci ,xiaoyao ,xiedejin1 ,XinDu ,xuxinglei ,yang guodong ,yangben ,yanghaoran ,yanglong ,yanx ,Yanzhi_YI ,yao_yf ,yefeng ,Yi_zhang95 ,yide12 ,yihangchen ,YijieChen ,YingtongHu ,ylw ,yonibaehr ,yuanqi ,yuchaojie ,yuezenglin ,YuJianfeng ,yyuse ,Zhang QR ,zhangbuxue ,zhangdanyang ,zhanghaibo ,zhangminli ,zhangyinxia ,ZhangZGC ,zhangzhen ,zhengzuohe ,zhouyaqiang0 ,zhuguodong ,zichun_ye ,zlq2020 ,zong_shuai ,ZPaC ,zyli2020 ,陈一 ,程超 ,冯一航 ,胡彬 ,宦晓玲 ,黄勇 ,简云超 ,康伟 ,李栋 ,李良灿 
,李林杰 ,李寅杰,刘崇鸣 ,刘力力 ,刘思铭 ,刘涛Liu ,刘勇琪 ,刘子涵 ,吕浩宇 ,吕凯盟 ,梅飞要 ,倪轩 ,任新 ,十一雷 ,孙昊辰 ,王禹程 ,王振邦 ,熊攀 ,俞涵 ,虞良斌 ,张栩浩 ,赵文璇 ,周莉莉 ,周一航 ,邹文祥 +### Bug Fixes -## MindSpore 2.4.1 Release Notes +- [#I8K7CC] Optimize error message when non-string segments are passed to get_model_info. -### Major Features and Improvements +### Contributors -#### AutoParallel +Thanks goes to these wonderful people: -- [STABLE] Split/concat branch communication computation parallel is supported. Users split input data to form parallelizable branches. Automatic communication computing parallelism is performed between branches, reducing communication overhead. -- [STABLE] Sequence pipelines are supported. The LLama series models for the dev branch of MindFormers reduces the Bubble as well as the memory overhead of pipeline parallelism by introducing Sequence dimension splitting. +gengdongjie, zhangyanhui, xiaoxiongzhu, wangshaocong, jianghui58, moran, wangtongyu6, 徐安越, qinzheng, 徐永飞, youshu, XinDu, yeyunpeng2020, yefeng, wangpingan, zjun, 胡安东, 刘力力, 陈宇, chenjianping, kairui_kou, zhangdanyang, hangq, mengyuanli, 刘崇鸣 -#### PyNative +Contributions of any kind are welcome! -- [STABLE] In PyNative mode, communication operators are assigned streams by default based on the communication domain. They support concurrent execution of communication operators, optimize collaborative parallel strategies, provide fine-grained communication masking, and enhance model performance. +## MindSpore Lite 2.2.1 Release Notes ### Bug Fixes -- [IB0R4N](https://gitee.com/mindspore/mindspore/issues/IB0R4N): Fixed the problem of loading distributed weights with inaccurate accuracy under certain splitting strategies. +- [#I88055] Fixed a function issue caused by incorrect format setting of the gridsample operator in MindSpore Lite inference. +- [#I8D80Y] The MindSpore Lite inference single-operator invoking process resources are not released and exits abnormally. 
### Contributors -bantao;caifubi;candanzg;chaijinwei;changzherui;chengbin;chujinjin;DeshiChen;dingjinshan;fary86;fuhouyu;gaoyong10;GuoZhibin;halo;haozhang;hedongdong;huangbingjian;hujiahui8;huoxinyou;jiangshanfeng;jiaorui;jiaxueyu;jshawjc;kisnwang;lichen;limingqi107;liubuyu;looop5;luochao60;luoyang;machenggui;MengXiangyu;Mrtutu;NaCN;panzhihui;qiuzhongya;shenhaojing;shilishan;tanghuikang;TuDouNi;wang_ziqi;weiyang;wujueying;XianglongZeng;xuxinglei;yang guodong;yanghaoran;yao_yf;yide12;yihangchen;YijieChen;YingtongHu;yuchaojie;YuJianfeng;zhangdanyang;ZhangZGC;zhengzuohe;zong_shuai;ZPaC;冯一航;胡彬;宦晓玲;李林杰;刘崇鸣;刘勇琪;任新;王禹程;王振邦;熊攀;俞涵;张栩浩;周一航; - -## MindSpore 2.4.0 Release Notes - -### Major Features and Improvements - -#### Dataset - -- [STABLE] Modify the default value of the `max_rowsize` parameter in the interface [mindspore.dataset.GeneratorDataset](https://www.mindspore.cn/docs/en/r2.4.0/api_python/dataset/mindspore.dataset.GeneratorDataset.html), [mindspore.dataset.Dataset.map](https://www.mindspore.cn/docs/en/r2.4.0/api_python/dataset/dataset_method/operation/mindspore.dataset.Dataset.map.html), and [mindspore.dataset.Dataset.batch](https://www.mindspore.cn/docs/en/r2.4.0/api_python/dataset/dataset_method/batch/mindspore.dataset.Dataset.batch.html) to None to enable dynamic allocation of shared memory by default, in which case the shared memory will be requested in real time with the input data and accelerate the data processing, so the user does not need to adjust the size of this parameter in advance. -- [BETA] Data processing supports the independent process mode, which will reduce the GIL lock conflict between the training process and the data reading process to improve the performance in dynamic graph mode. This mode can be enabled or disabled via the environment variable `MS_INDEPENDENT_DATASET`. 
- -#### Ascend +Thanks goes to these wonderful people: -- [STABLE] Customized operators support the Ascend Dynamics graph scenario Pyboost execution mode, which reduces operator call overhead. -- [STABLE] The Ascend Print operator supports scenarios where the output is oversized tensor or print calls are intensive, and users can specify the slice size and timeout time to support different scenarios via the `MS_DUMP_SLICE_SIZE` and `MS_DUMP_WAIT_TIME` environment variables. -- [STABLE] Unified deterministic computation settings. Users can enable ascending deterministic computation by only setting `mindspore.set_context(deterministic="ON")`. -- [STABLE] Supports aggregate communication anomaly monitoring, quickly exits training after monitoring communication anomalies to avoid timeout waiting. -- [STABLE] Supports the [graceful exit function for sub-healthy devices](https://www.mindspore.cn/docs/en/r2.4.0/model_train/train_availability/graceful_exit.html). When the training framework detects the presence of sub-healthy device configuration information in the cluster, it saves the CKPT and uniformly ends the cluster training process. +zhanghaibo, wangsiyuan, wangshaocong, chenjianping -#### Runtime +Contributions of any kind are welcome! -- [STABLE] Backend compilation cache is supported in O0/O1 mode and is turned on by default when frontend compilation cache is turned on. -- [STABLE] The aclnnAllGatherMatmul, aclnnMatmulReduceScatter, and aclnnMatmulAllReduce algorithms are supported in O0/O1 modes to improve performance. -- [STABLE] O0/O1 modes support to disable cluster heartbeat configuration by export MS_DISABLE_HEARTBEAT=1 to reduce scheduler load. -- [STABLE] O0/O1 modes support communication arithmetic fusion. -- [STABLE] Virtual memory support in O2 mode, defragmentation support, which is enabled in Ascend backend by default. 
-- [STABLE] Dynamic request for device memory occupation, support single card multi-user use, which is enabled in Ascend backend by default. -- [STABLE] Optimize graph fusion compilation performance in O1 mode, enabled by default. -- [STABLE] Support kernel packet fusion optimization in O1 mode to improve the performance of dynamic shape network execution, enabled by default. -- [BETA] Epilogue fusion between the MatMul and Elementwise operator is supported in O1 mode. Enable via `mindspore.set_context(graph_kernel_flags="--enable_cluster_ops=MatMul")`. -- [BETA] O1 mode supports user-controlled graph fusion optimization scope, user can control to turn on or off the corresponding fusion operator via the enable_pass/disable_pass option of graph_kernel_flags. +## MindSpore Lite 2.2.0 Release Notes -#### PyNative +### Major Features and Improvements -- [STABLE] Parameter cell_id of Hook function corresponding to [mindspore.nn.Cell.register_backward_hook](https://www.mindspore.cn/docs/en/r2.4.0/api_python/nn/mindspore.nn.Cell.html#mindspore.nn.Cell.register_backward_hook) and [mindspore.nn.Cell.register_forward_hook](https://www.mindspore.cn/docs/en/r2.4.0/api_python/nn/mindspore.nn.Cell.html#mindspore.nn.Cell.register_forward_hook) is changed to cell's python object. -- [STABLE] Added [Cell.register_backward_pre_hook](https://www.mindspore.cn/docs/en/r2.4.0/api_python/nn/mindspore.nn.Cell.html#mindspore.nn.Cell.register_backward_pre_hook) interface, this API registers the backward propagation hook function on a Cell, which is called each time the gradient of that Cell is computed. -- [STABLE] Optimize the PyNative process AICPU class operator downstream cache to improve API execution performance. -- [STABLE] Added the function of converting the device memory occupied by a group of Tensor to a contiguous piece of memory under dynamic graph. 
+#### FlashAttention Operator Fusion -#### FrontEnd +- [STABLE] The OptiX OSN Ascend series supports the FlashAttention large operator fusion of the LLAMA and stable diffusion models. -- [STABLE] Weight de-redundancy saving and loading is supported in fault recovery scenarios. -- [STABLE] Mixed precision training with support for [auto mode](https://www.mindspore.cn/docs/en/r2.4.0/api_python/amp/mindspore.amp.auto_mixed_precision.html#mindspore.amp.auto_mixed_precision). -- [STABLE] Support saving and loading of safetensors format, as well as offline aggregation and distributed loading based on safetensors in parallel scenarios. -- [BETA] Added new cyclic arithmetic interface [mindspore.ops.WhileLoop](https://www.mindspore.cn/docs/en/r2.4.0/api_python/ops/mindspore.ops.WhileLoop.html), [mindspore.ops.ForiLoop](https://www.mindspore.cn/docs/en/r2.4.0/api_python/ops/mindspore.ops.ForiLoop.html), and [mindspore.ops.Scan](https://www.mindspore.cn/docs/en/r2.4.0/api_python/ops/mindspore.ops.Scan.html), optimizing loop compilation time. -- [BETA] The graph mode supports the operator passing keyword arguments. +## MindSpore Lite 2.1.1 Release Notes -#### Parallel +### Major Features and Improvements -- [STABLE] [mindspore.ops.TensorDump](https://www.mindspore.cn/docs/en/r2.4.0/api_python/ops/mindspore.ops.TensorDump.html) operator supports distributed parallel scenarios, and users can decide to print input/output slices by configuring the TensorDump operator's `input_output` attribute; add new interface [mindspore.ops.tensordump](https://www.mindspore.cn/docs/en/r2.4.0/api_python/ops/mindspore.ops.tensordump.html). -- [STABLE] msrun supports customizing the rank id based on the passing rank table file, and supports rearranging the rank id via the `--rank_table_file` passing json file. -- [STABLE] Supports LCCL, a high-performance communication library in Ascend stand-alone. 
Users can enable LCCL in Ascend back-end training scenarios via the `MS_ENABLE_LCCL` environment variable. -- [STABLE] The strategy propagation algorithm is adapted to LLaMA/Mixtral networks, which reduces the workload of users in configuring the sharding strategy for LLaMA/Mixtral networks. -- [STABLE] Support high dimensional tensor parallelism, user can configure [mindspore.ops.MatMul](https://www.mindspore.cn/docs/en/r2.4.0/api_python/ops/mindspore.ops.MatMul.html) and [mindspore.oops.BatchMatMul](https://www.mindspore.cn/docs/en/r2.4.0/api_python/ops/mindspore.ops.BatchMatMul.html) input_layout switching 1D/2D/3D tensor slice mode. -- [STABLE] Simulation compilation does not consume hardware resources when SIMULATION_LEVEL=0 and SIMULATION_LEVEL=1 runtime jit_level is O0/O1. -- [STABLE] Allreduce introduced in parallel by the BatchMatMul model is automatically converted to a ReduceScatter to reduce communication according to the matching rules if enable_allreduce_slice_to_reducescatter is turned on in parallel_speed_up_json when it follows the slice operation. -- [STABLE] [mindspore.nn.Cell.shard](https://www.mindspore.cn/docs/en/master/api_python/nn/mindspore.nn.Cell.html#mindspore.nn.Cell.shard) and [mindspore.shard](https://www.mindspore.cn/docs/en/r2.4.0/api_python/mindspore/mindspore.shard.html) support user-configurable policies of type mindspore.Layout and sharding strategy for each parameter parameter_plan. -- [BETA] SAPP supports fully automatic generation of residual arithmetic policies after manual preconfiguration of arithmetic parallel sharding strategy. The user activates the `.shard()` preconfigured parallel sharding strategy by turning on the `MS_INTERFERED_SAPP` environment variable. -- The [BETA] [mindspore.ops.Custom](https://www.mindspore.cn/docs/en/r2.4.0/api_python/ops/mindspore.ops.Custom.html) operator supports configuring the sharding strategy. 
+- [STABLE] MindSpore Lite Cloud Inference adds support for Python 3.8 and Python 3.9 -#### Inference +## MindSpore Lite 2.1.0 Release Notes -- [STABLE] New Qwen2 and LLaMA3.1 series of large models support training and inference architecture, realize the unification of script, distributed policy and runtime, reduce the inference delay by fusing large operators, and effectively improve the network throughput. -- [STABLE] Support parallel decoding service-oriented deployment to realize LookAhead speculative inference for large models of LLaMA series. -- [BETA] Support SLoRA service-oriented deployment, realizing multi-trimming weight scheduling inference for large models. +### Major Features and Improvements -#### Dump +#### MindSpore Lite Cloud Inference -- [STABLE] Optimize [Dump](https://www.mindspore.cn/docs/en/r2.4.0/model_train/debug/dump.html) for use by device type and optimization level. -- [STABLE] Asynchronous Dump support in Ascend O0/O1 mode, including asynchronous Tensor, overflow, and statistics (host and device modes). -- [STABLE] Overflow Dump supports configuring the maximum number of overflows. -- [STABLE] Ascend O2 mode supports set dump. -- [STABLE] Support qint4 x 2 quantization type Dump. +- [STABLE] Supports high-performance inference for single-device large model and single-node multi-device distributed large model at Ascend backend. +- [STABLE] Python API Ascend backend supports multiple models sharing workspace memory. +- [STABLE] [The weights can be shared by multiple models through ModelGroup](https://mindspore.cn/lite/docs/en/r2.1/use/cloud_infer/runtime_cpp.html#multiple-models-sharing-weights). For example, weights can be shared between full models and incremental models in the large model scenario. -### API Change +#### API -#### New API - -- [STABLE] [mindspore.mint](https://www.mindspore.cn/docs/en/r2.4.0/api_python/mindspore.mint.html) APIs add a large number of functional, nn interfaces. 
mint interfaces are currently experimental interfaces, performance is better than ops in graph compilation mode O0 and PyNative mode. Currently does not support graph sink mode and CPU, GPU backend. It will be gradually improved. - - | mindspore.mint | | | | - | :------------------------- | :------------------------------- | :--------------------------- | :------------------------- | - | mindspore.mint.full | mindspore.mint.repeat_interleave | mindspore.mint.linspace | mindspore.mint.scatter | - | mindspore.mint.tril | mindspore.mint.argmin | mindspore.mint.sign | mindspore.mint.remainder | - | mindspore.mint.flatten | mindspore.mint.asin | mindspore.mint.arcsin | mindspore.mint.sinh | - | mindspore.mint.arcsinh | mindspore.mint.atan | mindspore.mint.arctan | mindspore.mint.atanh | - | mindspore.mint.arctanh | mindspore.mint.acos | mindspore.mint.arccos | mindspore.mint.acosh | - | mindspore.mint.arccosh | mindspore.mint.erfc | mindspore.mint.expm1 | mindspore.mint.log1p | - | mindspore.mint.logical_xor | mindspore.mint.round | mindspore.mint.tan | mindspore.mint.trace | - | mindspore.mint.trunc | mindspore.mint.cross | mindspore.mint.masked_select | mindspore.mint.bitwise_and | - | mindspore.mint.bitwise_or | mindspore.mint.bitwise_xor | mindspore.mint.cosh | mindspore.mint.cummax | - | mindspore.mint.cummin | mindspore.mint.median | mindspore.mint.roll | mindspore.mint.sinc | - | mindspore.mint.sinh | mindspore.mint.xlogy | | | - - | mindspore.mint.nn | - | :---------------------------- | - | mindspore.mint.nn.ReLU | - | mindspore.mint.nn.Hardsigmoid | - | mindspore.mint.nn.AvgPool2d | - | mindspore.mint.nn.MSELoss | - | mindspore.mint.nn.LogSoftmax | - | mindspore.mint.nn.Mish | - | mindspore.mint.nn.PReLU | - | mindspore.mint.nn.SELU | - | mindspore.mint.nn.Softshrink | - | mindspore.mint.nn.Hardshrink | - | mindspore.mint.nn.Hardswish | - | mindspore.mint.nn.L1Loss | - - | mindspore.mint.nn.functional | - | :--------------------------------------- | - | 
mindspore.mint.nn.functional.hardsigmoid | - | mindspore.mint.nn.functional.log_softmax | - | mindspore.mint.nn.functional.mish | - | mindspore.mint.nn.functional.prelu | - | mindspore.mint.nn.functional.selu | - | mindspore.mint.nn.functional.softshrink | - | mindspore.mint.nn.functional.hardshrink | - | mindspore.mint.nn.functional.hardswish | - | mindspore.mint.nn.functional.l1_loss | - | | - -#### Interface Changes - -- Interface name: mindspore.dataset.GeneratorDataset - - Changed: The default value of parameter `max_rowsize` is changed from `6` to `None` to enable dynamic allocation of shared memory by default. - - - - - - - - - -
Original interface v2.4.0 interface
-  class GeneratorDataset(source,
-                         column_names=None,
-                         column_types=None,
-                         schema=None,
-                         num_samples=None,
-                         num_parallel_workers=1,
-                         shuffle=None,
-                         sampler=None,
-                         num_shards=None,
-                         shard_id=None,
-                         python_multiprocessing=True,
-                         max_rowsize=6)
-  
-
-  class GeneratorDataset(source,
-                         column_names=None,
-                         column_types=None,
-                         schema=None,
-                         num_samples=None,
-                         num_parallel_workers=1,
-                         shuffle=None,
-                         sampler=None,
-                         num_shards=None,
-                         shard_id=None,
-                         python_multiprocessing=True,
-                         max_rowsize=None)
-  
-
- -- Interface name: mindspore.dataset.Dataset.batch - - Changed: The default value of parameter `max_rowsize` is changed from `16` to `None` to enable dynamic allocation of shared memory by default. - - - - - - - - - -
Original interface v2.4.0 interface
-  def batch(input_dataset,
-            batch_size,
-            drop_remainder=False,
-            num_parallel_workers=None,
-            per_batch_map=None,
-            input_columns=None,
-            output_columns=None,
-            python_multiprocessing=False,
-            max_rowsize=16)
-  
-
-  def batch(input_dataset,
-            batch_size,
-            drop_remainder=False,
-            num_parallel_workers=None,
-            per_batch_map=None,
-            input_columns=None,
-            output_columns=None,
-            python_multiprocessing=False,
-            max_rowsize=None)
-  
-
- -- Interface name: mindspore.dataset.Dataset.map - - Changed: The default value of parameter `max_rowsize` is changed from `16` to `None` to enable dynamic allocation of shared memory by default. - - - - - - - - - -
Original interface v2.4.0 interface
-  def map(input_dataset,
-          operations=None,
-          input_columns=None,
-          output_columns=None,
-          num_parallel_workers=None,
-          python_multiprocessing=False,
-          cache=None,
-          callbacks=None,
-          max_rowsize=16, offload=None)
-  
-
-  def map(input_dataset,
-          operations=None,
-          input_columns=None,
-          output_columns=None,
-          num_parallel_workers=None,
-          python_multiprocessing=False,
-          cache=None,
-          callbacks=None,
-          max_rowsize=None, offload=None)
-  
-
- -- Interface name: mindspore.ops.TensorDump - - Changed: New parameter `input_output` to control printing behavior. - - - - - - - - - -
Original interface v2.4.0 interface
-  class TensorDump()
-  
-
-  class TensorDump(input_output='out')
-  
-
- -- Interface name: File formats saved by MindSpore Dump Tensor - - Changed: The npy file obtained by Dump adds the dtype information of the original Tensor to the filename. - - - - - - - - - -
Original interface v2.4.0 interface
-  {op_type}.{op_name}.{task_id}.{stream_id}.
-  {timestamp}.{input_output_index}.{slot}.
-  {format}.npy
-  
-
-  {op_type}.{op_name}.{task_id}.{stream_id}.
-  {timestamp}.{input_output_index}.{slot}.
-  {format}.{dtype}.npy
-  
-
- -#### Non-compatible Interface Changes - -- Interface name: mindspore.nn.Cell.register_backward_hook(hook_fn) - - Changed: The input parameter of hook_fn is changed from cell_id to cell object. - - Descriptions: For the original hook, you can get the original cell_id by id(cell) in hook_fn. - - - - - - - - - -
Original interface v2.4.0 interface
-  def register_backward_hook(hook_fn)
-  Parameter hook_fn(cell_id,
-             grad_input, grad_output)
-             -> New grad_output or None
-  
-
-  def register_backward_hook(hook_fn)
-  Parameter hook_fn(cell,
-             grad_input, grad_output)
-             -> New grad_input or None
-  
-
- -- Interface name: mindspore.nn.Cell.register_forward_hook(hook_fn) - - Changed: The input parameter of hook_fn is changed from cell_id to cell object. - - Descriptions: For the original hook, you can get the original cell_id by id(cell) in hook_fn. - - - - - - - - - -
Original interface v2.4.0 interface
-  def register_forward_hook(hook_fn)
-  Parameter hook_fn(cell_id, inputs, outputs)-> New outputs or None
-  
-
-  def register_forward_hook(hook_fn)
-  Parameter hook_fn(cell, inputs, outputs)-> New outputs or None
-  
-
- -- Interface name: mindspore.communication.comm_func.all_reduce - - Changed: all_reduce adds a new parameter async_op, and the return value is changed from Tensor to a tuple consisting of Tensor and CommHandle. - - Descriptions: async_op indicates whether all_reduce has multi-stream parallelism turned on, and the default value is False. - - - - - - - - - -
Original interface v2.4.0 interface
-  def all_reduce(tensor,
-                 op=ReduceOp.SUM,
-                 group=GlobalComm.WORLD_COMM_GROUP)->Tensor
-  
-
-  def all_reduce(tensor,
-                 op=ReduceOp.SUM,
-                 group=GlobalComm.WORLD_COMM_GROUP,
-                 async_op=False)
-                 ->tuple(Tensor, CommHandle)
-  
-
- -- Interface name: mindspore.communication.comm_func.all_gather_into_tensor - - Changed: all_reduce adds a new parameter async_op, and the return value is changed from Tensor to a tuple consisting of Tensor and CommHandle. - - Descriptions: async_op indicates whether all_gather_into_tensor has multi-stream parallelism turned on, and the default value is False. - - - - - - - - - -
Original interface v2.4.0 interface
-  def all_gather_into_tensor(tensor,
-                             group=GlobalComm.
-                             WORLD_COMM_GROUP)->Tensor
-  
-
-  def all_gather_into_tensor(tensor,
-                             group=GlobalComm.
-                             WORLD_COMM_GROUP,
-                             async_op=False)->
-                             tuple(Tensor, CommHandle)
-  
-
- -- Interface name: mindspore.communication.comm_func.reduce_scatter_tensor - - Changed: all_reduce adds a new parameter async_op, and the return value is changed from Tensor to a tuple consisting of Tensor and CommHandle. - - Descriptions: async_op indicates whether reduce_scatter_tensor has multi-stream parallelism turned on, and the default value is False. - - - - - - - - - -
Original interface v2.4.0 interface
-  def reduce_scatter_tensor(tensor,
-                            op=ReduceOp.SUM,
-                            group=GlobalComm.
-                            WORLD_COMM_GROUP)->Tensor
-  
-
-  def reduce_scatter_tensor(tensor,
-                            op=ReduceOp.SUM,
-                            group=GlobalComm.WORLD_COMM_GROUP,
-                            async_op=False)->
-                            tuple(Tensor, CommHandle)
-  
-
- -- Interface name: mindspore.communication.comm_func.isend - - Changed: The return value is changed from Tensor to Handle. - - Descriptions: isend enables multi-stream parallelism by default. - - - - - - - - - -
Original interface v2.4.0 interface
-  def isend(tensor,
-            dst=0,group=GlobalComm.
-            WORLD_COMM_GROUP, tag=0)->Tensor
-  
-
-  def isend(tensor,
-            dst=0,group=GlobalComm.
-            WORLD_COMM_GROUP, tag=0)->CommHandle
-  
-
- -- Interface name: mindspore.communication.comm_func.irecv - - Changed: The return value is changed from Tensor to Handle. - - Descriptions: irecv enables multi-stream parallelism by default. - - - - - - - - - -
Original interface v2.4.0 interface
-  def irecv(tensor,
-            src=0, group=GlobalComm.
-            WORLD_COMM_GROUP, tag=0)->Tensor
-  
-
-  def irecv(tensor,
-            src=0,
-            group=GlobalComm.
-            WORLD_COMM_GROUP, tag=0)->CommHandle
-  
-
- -- Interface name: mindspore.communication.comm_func.all_to_all_with_output_shape - - Changed: all_to_all_with_output_shape adds a new parameter async_op, and the return value is changed from Tensor to a tuple consisting of Tensor and CommHandle. - - Descriptions: async_op indicates whether all_to_all_with_output_shape enables multi-stream parallelism, the default value is False. - - - - - - - - - -
Original interface v2.4.0 interface
-  def all_to_all_with_output_shape(output_shape_list,
-                                   input_tensor_list,
-                                   group=None)->tuple(Tensor)
-  
-
-  def all_to_all_with_output_shape(output_shape_list,
-                                   input_tensor_list,
-                                   group=None,
-                                   async_op=False)->
-                                   tuple(tuple(Tensor),
-                                   CommHandle)
-  
-
- -- Interface name: mindspore.communication.comm_func.all_to_all_single_with_output_shape - - Changed: all_to_all_single_with_output_shape adds a new parameter async_op, and the return value is changed from Tensor to a tuple consisting of Tensor and CommHandle. - - Descriptions: async_op indicates whether all_to_all_single_with_output_shape enables multi-stream parallelism, the default value is False. - - - - - - - - - -
Original interface v2.4.0 interface
-  def all_to_all_single_with_output_shape(output_shape,
-                                          tensor,
-                                          output_split_sizes=None,
-                                          input_split_sizes=None,
-                                          group=None)->Tensor
-  
-
-  def all_to_all_single_with_output_shape(output_shape,
-                                          tensor,
-                                          output_split_sizes=None,
-                                          input_split_sizes=None,
-                                          group=None,
-                                          async_op=False)->
-                                          tuple(Tensor, CommHandle)
-  
-
+The [Python](https://www.mindspore.cn/lite/api/en/r2.1/mindspore_lite/mindspore_lite.ModelGroup.html) and [C++](https://mindspore.cn/lite/api/en/r2.1/generate/classmindspore_ModelGroup.html) ModelGroup interface is added. The interface definition is as follows: -### Contributors +```python +class ModelGroup + def __init__(self, flags=ModelGroupFlag.SHARE_WORKSPACE) + def add_model(self, models) + def cal_max_size_of_workspace(self, model_type, context) +``` -anyrenwei,bantao,baochong,Bellatan,BJ-WANG,caifubi,candanzg,candyhong,Carey,cccc1111,ccsszz,changzherui,chengbin,chengfeng27,chengxb7532,chenjianping,chenweifeng,chujinjin,dairenjie,DavidFFFan,DeshiChen,dingjinshan,emmmmtang,fanyi20,fary86,fengyixing,fix-dryrun,fuchao,fuhouyu,gaoyong10,gengdongjie,gent1e,GuoZhibin,guozhijian,halo,hangq,haozhang,hedongdong,Henry Shi,HighCloud,Hongxing,huandong1,huangbingjian,HuangLe02,huangziling,huda,huiliang166,hujiahui8,huoxinyou,jiangchenglin3,jianghui58,jiangshanfeng,jiaorui,jiaxueyu,jijiarong,jjfeing,JoeyLin,jshawjc,jxl,kairui_kou,kisnwang,kk,lanzhineng,LiangZhibo,lichen,limingqi107,lionelchang,liubuyu,liujunzhu,liuluobin,liyejun,LLLRT,looop5,luochao60,luoxuewei,luoyang,machenggui,maning202007,maoyuanpeng1,Margaret_wangrui,MengXiangyu,mengyuanli,moran,Mrtutu,mylinchi,NaCN,nomindcarry,panzhihui,paolopoggi,pengqi,pierreleca,qiuleilei,qiuyufeng,qiuzhongya,r1chardf1d0,shaoshengqi,shen_haochen,shenhaojing,shenwei41,shihlCST,shilishan,shiro-zzz,shiziyang,shop-pin,shunyuanhan,shuqian0,stavewu,superxf,suteng,tanghuikang,tangmengcheng,tan-wei-cheng,tan-wei-cheng-3260,tianxiaodong,TronZhang,TuDouNi,VectorSL,vincen45,wang_ziqi,wanghenchang,wangjie,wangshaocong,weiyang,wtobill,wudawei,wujueying,wwwbby,xfan233,XianglongZeng,xiaotianci,xiaoxin_zhang,xiaoxiongzhu,xiaoxuanKL,xiaoyao,XinDu,xuxinglei,xuzhubin,yanghaoran,yanglong,yangzhenzhang,yanx,Yanzhi_YI,yao_yf,yefeng,yide12,yihangchen,YijieChen,YingLai Lin,ylw,yuanpeng2024,yuanqi,yuchaojie,Yuheng 
Wang,YuJianfeng,YukioZzz,yyuse,zangqx,ZeyuHan,zhangbuxue,zhanghaibo,zhangminli,zhangqinghua,zhangyanhui,ZhangZGC,zhangzhen,zhanzhan,zhengzuohe,zhouyaqiang0,zhuguodong,zichun_ye,zjun,zong_shuai,ZPaC,zuochuanyong,zyli2020,程超,蛋蛋de忧桑,狄新凯,范吉斌,冯一航,付国华,胡彬,宦晓玲,黄勇,黄卓,康伟,李良灿,李林杰,李寅杰3,刘崇鸣,刘思铭,刘涛Liu,刘勇琪,刘子涵,吕浩宇,吕昱峰(Nate.River),钱丹,十一雷,孙昊辰,王禹程,王振邦,王梓润,吴大维,熊攀,徐安越,许子豪,俞涵,云骑士,张峻源,张王泽,张栩浩,赵文璇,周莉莉,朱家兴,邹文祥 +```C++ +// class ModelGroup +ModelGroup(ModelGroupFlag flags = ModelGroupFlag::kShareWorkspace); +Status AddModel(const std::vector &model_path_list); +Status AddModel(const std::vector> &model_buff_list); +Status AddModel(const std::vector &model_list); +Status AddModel(const std::vector &model_list); +``` -## MindSpore 2.3.1 Release Notes +## MindSpore Lite 2.0.0-rc1 Release Notes ### Major Features and Improvements -- [STABLE] Remove the restriction that the value of device_matrix must be 2 correspongding to interleaved_parallel when using [Layout](https://www.mindspore.cn/docs/en/r2.3.1/api_python/mindspore/mindspore.Layout.html) to construct the parallel strategy. -- [STABLE] Add user-defined control edges environment [MS_CUSTOM_DEPEND_CONFIG_PATH](https://www.mindspore.cn/docs/en/r2.3.1/note/env_var_list.html) support to achieve better overlapping of communication and computation. +#### MindSpore Lite Cloud Inference -### API Change +The original MindSpore Lite is mainly used for edge devices such as mobile phones and head units. Cloud inference is added to support scenarios with multiple backend hardware resources on the cloud, supports Ascend and NVIDIA GPU inference cards, and efficiently utilizes multi-core resources on the cloud. -#### New API +The original cloud inference integrated through MindSpore training can be changed to MindSpore Lite. For details, see [Quick Start to Cloud-side Inference](https://mindspore.cn/lite/docs/en/r2.0/quick_start/one_hour_introduction_cloud.html). 
To retain the original integration method, see [Inference](https://mindspore.cn/docs/en/r2.0/faq/inference.html). -- [STABLE] Add new API [mindspore.mint.repeat_interleave](https://www.mindspore.cn/docs/en/r2.3.1/api_python/mint/mindspore.mint.repeat_interleave.html). +- [STABLE] Support MindIR model files. +- [STABLE] Third-party ONNX, TensorFlow, and Caffe models can be converted to MindIR model files using the MindSpore Lite conversion tool. +- [STABLE] One release package supports multiple hardware backends: Ascend, NVIDIA GPU, CPU. +- [STABLE] Supports the `Model` interface and `ModelParallelRunner` concurrent inference interface. +- [STABLE] Supports C++, Python, and Java inference interfaces. -### Contributors +#### API -ccsszz;dairenjie;DeshiChen;fuhouyu;gaoshuanglong;gaoyong10;GuoZhibin;halo;huoxinyou;jiangchao_j;jiaorui;jiaxueyu;jijiarong;JuiceZ;lichen;liujunzhu;liuluobin;LLLRT;looop5;luoyang ;Margaret_wangrui;mengyuanli;panzhihui;pengqi;PingqiLi;Renyuan Zhang;tanghuikang;tianxiaodong;TuDouNi;wudawei;XianglongZeng;xiaosh;xiaoxin_zhang;XinDu;yanghaoran;yanglong;yangruoqi713;Yanzhi_YI;yao_yf;YijieChen;yuchaojie;YuJianfeng;zangqx;zhengzuohe;zhouyaqiang0;ZPaC;zyli2020;胡彬;宦晓玲;康伟;李林杰;刘崇鸣;王禹程;俞涵;周莉莉;邹文祥 +- The original Python API had many configuration parameters and was complex to use, so the usability of the Python APIs is improved in version 2.0. The optimizations include adjustments to class construction methods and class attributes. In addition, the Python APIs in version 2.0 and later will be integrated into the cloud-side inference scenario and are incompatible with the Python APIs of earlier versions. For details, see [Python API](https://www.mindspore.cn/lite/api/en/r2.0/mindspore_lite.html). -Contributions of any kind are welcome!
+## MindSpore Lite 1.10.0 Release Notes -## MindSpore Lite 2.3.1 Release Notes +### Bug fixes -### Major Features and Improvements +- Fixed a potential accuracy problem in arithmetic-type CPU kernels in dynamic-shape cases. +- Fixed the incorrect write address of the Deconv quantization operator. -When converting Ascend backend models, the [input_shape](https://www.mindspore.cn/lite/docs/en/r2.3.1/use/cloud_infer/converter_tool_ascend.html) parameter in the configuration file is supported to specify the input size. +## MindSpore Lite 1.8.0 Release Notes -### API Change +### Major Features and Improvements -- [ModelGroup](https://www.mindspore.cn/lite/docs/en/r2.3.1/use/cloud_infer/runtime_cpp.html) interface adds model weight sharing support to save video memory. +#### API -- [Model.get_model_info](https://www.mindspore.cn/lite/docs/en/r2.3.1/use/converter_tool.html?highlight=get_model_info) interface adds support for obtaining the input size of the model. +- [STABLE] Add C++ and Python APIs for model conversion. +- [STABLE] Add Python APIs for model inference. -### Contributors +#### Post-Training Quantization -熊攀;ZhangZGC;jxl;zhangyanhui;emmmmtang;huandong1;yefeng +- [STABLE] Support per-layer quantization, with built-in CLE to optimize per-layer quantization accuracy. -## MindSpore 2.3.0 Release Notes +## MindSpore Lite 1.7.0 Release Notes ### Major Features and Improvements -#### AutoParallel - -- [STABLE] Extend functional parallelism. [mindspore.shard](https://www.mindspore.cn/docs/en/r2.3.0/api_python/mindspore/mindspore.shard.html) supports now the Graph mode. In Graph mode, the parallel sharding strategy of input and weight can be set for nn.Cell/function. For other operators, the parallel strategy can be automatically configured through "sharding_propagation".
Add [mindspore.reshard](https://www.mindspore.cn/docs/en/r2.3.0/api_python/mindspore/mindspore.reshard.html) interface that supports manual rearranging and set up a precise sharding strategy ([mindspore.Layout](https://www.mindspore.cn/docs/en/r2.3.0/api_python/mindspore/mindspore.Layout.html)) for tensors. -- [STABLE] Added Callback interface [mindspore.train.FlopsUtilizationCollector](https://www.mindspore.cn/docs/en/r2.3.0/api_python/train/mindspore.train.FlopsUtilizationCollector.html) statistical model flops utilization information MFU and hardware flops utilization information HFU. -- [STABLE] Add functional communication API [mindspore.communication.comm_func](https://www.mindspore.cn/docs/en/r2.3.0/api_python/mindspore.communication.comm_func.html). -- [BETA] Optimize the memory usage of interleaved pipeline in O0 and O1 mode. -- [BETA] AutoParallel supports automatic pipeline strategy generation in multi-nodes scenarios (not supported in single-node scenario). Need to set `parallel_mode` to ``auto_parallel`` and `search_mode` to ``recursive_programming``. +#### Post quantization -#### PyNative +- [STABLE] Post quantization supports the dynamic quantization algorithm. +- [BETA] Post-quantized models can run on NVIDIA GPUs. -- [STABLE] Optimize the basic data structure of PyNative and improve operator API performance. -- [STABLE] Tensor supports [register_hook](https://www.mindspore.cn/docs/en/r2.3.0/api_python/mindspore/Tensor/mindspore.Tensor.register_hook.html) so that users can print or modify the gradient with respect to the tensor. -- [STABLE] The PyNative mode supports the recompute function. You can use the recompute interface to reduce the peak device memory of the network. +## MindSpore Lite 1.6.0 -#### FrontEnd +### Major Features and Improvements -- [STABLE] Optimize Checkpoint saving and loading basic processes to improve performance by 20%.
-- [STABLE] Support CRC verification of Checkpoint files during saving and loading processes to enhance security. +#### Converter and runtime -#### Dataset +- [STABLE] Add more fusion patterns in the converter tool to improve runtime performance. +- [STABLE] Support taking OpenGL textures as input and output of inference. +- [STABLE] Refactor the Java API. +- [BETA] Support inference on Ascend 310. -- [STABLE] Support Ascend processing backend for the following transforms: Equalize, Rotate, AutoContrast, Posterize, AdjustSharpness, Invert, Solarize, ConvertColor, Erase. -- [STABLE] Support video files reading and parsing function. For more detailed information, see APIs: [mindspore.dataset.vision.DecodeVideo](https://www.mindspore.cn/docs/en/r2.3.0/api_python/dataset_vision/mindspore.dataset.vision.DecodeVideo.html), [mindspore.dataset.vision.read_video](https://www.mindspore.cn/docs/en/r2.3.0/api_python/dataset_vision/mindspore.dataset.vision.read_video.html#mindspore.dataset.vision.read_video), and [mindspore.dataset.vision.read_video_timestamps](https://www.mindspore.cn/docs/en/r2.3.0/api_python/dataset_vision/mindspore.dataset.vision.read_video_timestamps.html#mindspore.dataset.vision.read_video_timestamps). -- [STABLE] Support specifying the `max_rowsize` parameter as -1 in `mindspore.dataset.GeneratorDataset`, `mindspore.dataset.Dataset.map` and `mindspore.dataset.Dataset.batch` interfaces. The size of shared memory used by the dataset multiprocessing will be dynamically allocated according to the size of the data. The `max_rowsize` parameter does not need to be adjusted manually. +#### x86 backend optimization -#### Inference +- [STABLE] Optimize kernels for x86 using Advanced Vector Extensions (AVX512). -- [STABLE] 14 large models such as LLaMa2, LLaMa3, and Qwen1.5 are added to support the integrated training and inference architecture to unify scripts, distributed strategies, and runtime.
The period from training to inference deployment of typical large models is reduced to days. Large operators are integrated to reduce the inference latency and effectively improve the network throughput. +#### ARM backend optimization -#### PIJIT +- [STABLE] Support heterogeneous parallel inference, including splitting operators, constructing heterogeneous subgraphs, and heterogeneous parallel scheduling between CPUs and GPUs. +- [STABLE] Add more FP16 operators. -- [BETA] Support bytecode parsing for Python 3.8 and Python 3.10 to expand the supporting version of Python. -- [BETA] Support dynamic shape and symbolic shape as input to enable the dynamic input scenarios. -- [BETA] Enable single-step composition capability to optimize compile time -- [BETA] Support bytecode capture with side effects (STORE_ATTR, STORE_GLOBAL, LIST_APPEND, dict.pop) by bytecode tuning, enabling auto-mixed precision, reduction of cleavage diagrams, and improved performance. +#### Post quantization -#### Profiler +- [STABLE] Post quantization supports debugging. +- [STABLE] Full quantization supports choosing non-quantized nodes. +- [STABLE] Mixed bit quantization supports auto-tune. -- [STABLE] Provides a hierarchical Profiler function, controls different levels of performance data collection through the profiler_level parameter. -- [STABLE] Profiler analyse adds a new mode parameter to configure asynchronous parsing mode to parallelize performance data parsing and training. -- [STABLE] The Profiler adds a new data_simplification parameter, which allows users to control whether to delete redundant data after parsing the performance data to save hard disk space. -- [STABLE] The Profiler enhances the memory analysis function. 
Users can collect the memory application and release information of the framework, CANN and hardware through the profile_memory parameter, and visualize and analyze the information through the [MindStudio tool](https://www.hiascend.com/developer/blog/details/0230130822583032044). -- [BETA] In Pynative mode, Timeline integrates host profiling information, including task time and user side stack information. +#### Training on Device -#### Dump +- [STABLE] Support user-defined algorithm models to access the federated learning framework. -- [STABLE] Enhanced synchronous & asynchronous dump functionality and adds L2Norm information to statistics dumps, and the statistic_category field to allow users to customize which statistics to save, improving dump usability. For details about the support for synchronous/asynchronous dump, see [Dump Introduction](https://www.mindspore.cn/tutorials/experts/en/r2.3.0/debug/dump.html#dump-introduction). -- [STABLE] Improved synchronous dump functionality: Enables overflow and exception dumps through the op_debug_mode field. -- [STABLE] Enhanced synchronous dump functionality: The stat_calc_mode field enables device-side computation of statistics (default is host-side), and the sample_mode field is configured to perform sample-based dumps, improving dump performance. -- [STABLE] Enhanced asynchronous dump functionality: Now supports saving in complex64 and complex128 formats. +### Contributors -#### Runtime +Thanks goes to these wonderful people: -- [Stable] Supports multi-level compilation of the staic graph by setting [mindspore.set_context(jit_config={"jit_level": "O0/O1/O2"})](https://www.mindspore.cn/docs/en/r2.3.0/api_python/mindspore/mindspore.set_context.html). The default value is empty, the framework automatically selects the optimization level according to the product category, O2 for Atlas training products and O0 for the rest of the products. 
-- [Stable] Staic graph supports multi-stream concurrent execution of communication calculations in O0/O1. -- [STABLE] Add memory management API [mindspore.hal.memory](https://www.mindspore.cn/docs/en/r2.3.0/api_python/mindspore.hal.html#memory). -- [Beta] The memory pool supports virtual memory defragmentation, and virtual memory is enabled by default under graph O0/O1. +AGroupofProbiotocs, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, dong-li001, fary86, fuzhiye, Gaoxiong, GAO_HYP_XYJ, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, heleiwang, hesham, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Jiabin Liu, jianghui58, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, liuyongqi, laiyongqiang, leonwanghui, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, Lin Xh, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, Lixia, lixian, liyanliu, liyong, lizhenyu, luopengting, lvchangquan, lvliang, lz, maning202007, Margaret_wangrui, mengyuanli, Ming_blue, ms_yan, ougongchang, panfengfeng, panyifeng, Payne, Peilin, peixu_ren, Pengyongrong, qianlong, qianjiahong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, , Wan, wandongdong, wangdongxu, wangmin, [wangnan39@huawei.com](mailto:wangnan39@huawei.com), wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wudenggang, wukesong, wuweikang, wuxuejian, Xiao Tianci, Xiaoda, xiefangqi, xinyunfan, xuanyue, xuyongfei, yanghaitao, yanghaitao1, 
yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, yuximiao, zengzitao, Zhang, [zhanghaibo5@huawei.com](mailto:zhanghaibo5@huawei.com), zhanghuiyao, zhanghui_china, zhangxinfeng3, zhangyihui, zhangz0911gm, zhanke, zhanyuan, zhaodezan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, zhiqwang, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, Zichun, Ziyan, zjun, ZPaC, wangfengwfwf, zymaa, gerayking. -#### Ascend +Contributions of any kind are welcome! -- [STABLE] Provide an operator memory out of bounds access detection switch on the Ascend platform, where users can detect internal memory out of bounds issues of operators on the Ascend platform by setting `mindspore.set_context (Ascend_configuration={"op_debug_option": "oom"})`. -- [BETA] The environment variable [MS_SIMULATION_LEVEL](https://www.mindspore.cn/docs/en/r2.3.0/note/env_var_list.html) supports graph compilation O0 execution mode on the Ascend platform, which can support compilation performance and runtime memory analysis -- [BETA] Ascend platform supports [AscendC custom operators](https://www.mindspore.cn/tutorials/experts/en/r2.3.0/operation/op_custom_ascendc.html) through AOT. +## MindSpore Lite 1.5.0 -### API Change +### Major Features and Improvements -#### New APIs - -- [STABLE] Adds [mindspore.mint](https://www.mindspore.cn/docs/en/r2.3.0/api_python/mindspore.mint.html) API, provides a lot of functional, nn, optimizer interfaces. The API usage and functions are consistent with the mainstream usage in the industry, which is convenient for users to refer to and use. The mint interface is currently an experimental interface and performs better than ops in `jit_level="O0"` and pynative mode. Currently, the graph sinking mode and CPU/GPU backend are not supported, and it will be gradually improved in the future. 
- - | mindspore.mint | | | | - |:----|:----|:----|:----| - | mindspore.mint.eye |mindspore.mint.rand_like|mindspore.mint.isfinite|mindspore.mint.any| - | mindspore.mint.ones |mindspore.mint.rand|mindspore.mint.log|mindspore.mint.greater_equal| - | mindspore.mint.ones_like |mindspore.mint.gather|mindspore.mint.logical_and|mindspore.mint.all| - | mindspore.mint.zeros |mindspore.mint.permute|mindspore.mint.logical_not|mindspore.mint.mean| - | mindspore.mint.zeros_like |mindspore.mint.repeat_interleave|mindspore.mint.logical_or|mindspore.mint.prod| - | mindspore.mint.arange |mindspore.mint.abs|mindspore.mint.mul|mindspore.mint.sum| - | mindspore.mint.broadcast_to |mindspore.mint.add|mindspore.mint.neg|mindspore.mint.eq| - | mindspore.mint.cat |mindspore.mint.clamp|mindspore.mint.negative|mindspore.mint.ne| - | mindspore.mint.index_select |mindspore.mint.cumsum|mindspore.mint.pow|mindspore.mint.greater| - | mindspore.mint.max |mindspore.mint.atan2|mindspore.mint.reciprocal|mindspore.mint.gt| - | mindspore.mint.min |mindspore.mint.arctan2|mindspore.mint.rsqrt|mindspore.mint.isclose| - | mindspore.mint.scatter_add |mindspore.mint.ceil|mindspore.mint.sigmoid|mindspore.mint.le| - | mindspore.mint.narrow |mindspore.mint.unique|mindspore.mint.sin|mindspore.mint.less_equal| - | mindspore.mint.nonzero |mindspore.mint.div|mindspore.mint.sqrt|mindspore.mint.lt| - | mindspore.mint.normal |mindspore.mint.divide|mindspore.mint.square|mindspore.mint.maximum| - | mindspore.mint.tile |mindspore.mint.erf|mindspore.mint.sub|mindspore.mint.minimum| - | mindspore.mint.topk |mindspore.mint.erfinv|mindspore.mint.tanh|mindspore.mint.inverse| - | mindspore.mint.sort |mindspore.mint.exp|mindspore.mint.bmm|mindspore.mint.searchsorted| - | mindspore.mint.stack |mindspore.mint.floor|mindspore.mint.matmul|mindspore.mint.argmax| - | mindspore.mint.where |mindspore.mint.flip|mindspore.mint.split|mindspore.mint.cos| - | mindspore.mint.less ||| - - | mindspore.mint.nn| - |:----| - | 
mindspore.mint.nn.Dropout | - | mindspore.mint.nn.Unfold | - | mindspore.mint.nn.Fold | - | mindspore.mint.nn.Linear| - | mindspore.mint.nn.BCEWithLogitsLoss | - - | mindspore.mint.nn.functional|| - |:----|:----| - |mindspore.mint.nn.functional.batch_norm |mindspore.mint.nn.functional.group_norm| - |mindspore.mint.nn.functional.fold |mindspore.mint.nn.functional.layer_norm| - |mindspore.mint.nn.functional.max_pool2d |mindspore.mint.nn.functional.linear| - |mindspore.mint.nn.functional.binary_cross_entropy |mindspore.mint.nn.functional.unfold| - |mindspore.mint.nn.functional.sigmoid |mindspore.mint.nn.functional.one_hot| - |mindspore.mint.nn.functional.tanh |mindspore.mint.nn.functional.elu| - |mindspore.mint.nn.functional.binary_cross_entropy_with_logits |mindspore.mint.nn.functional.gelu| - |mindspore.mint.nn.functional.dropout|mindspore.mint.nn.functional.leaky_relu| - |mindspore.mint.nn.functional.embedding |mindspore.mint.nn.functional.silu| - |mindspore.mint.nn.functional.grid_sample|mindspore.mint.nn.functional.softplus| - |mindspore.mint.nn.functional.relu|mindspore.mint.nn.functional.softmax| - |mindspore.mint.nn.functional.pad|| - - | mindspore.mint.optim | - |:----| - | mindspore.mint.optim.AdamW | - - | mindspore.mint.linalg | - |:----| - | mindspore.mint.linalg.inv | - -### Non-compatible Interface Changes - -- Interface name: `Profiler` - - Changes: The performance data file generated by parsing is streamlined to save space. Delete the FRAMEWORK directory data and other redundant data after exporting the performance data. Retain only the deliverables of the profiler and the original performance data in the PROF_XXX directory to save space. Data simplification mode can be turned off by configuring the `data_simplification` parameter to `False`, which will be consistent with the performance data files generated by the historical version. - -- Interface name: The `saved_data` field in the configuration file of the dump function is `"tensor"`. 
- - Changes: The name of the file to be dumped to disks is changed. `"/"` is replaced with `"_"`, and the operator name is changed to the global name of the operator. - - - - - - - - - -
-| | Original interface | v2.1 interface |
-|:----|:----|:----|
-| File name format | `{op_type}.{op_name}.{task_id}.{stream_id}.{timestamp}.{input_output_index}.{slot}.{format}.npy` | `{op_type}.{op_name}.{task_id}.{stream_id}.{timestamp}.{input_output_index}.{slot}.{format}.npy` |
-| Example | `Conv2D.Conv2D-op12.0.0.1623124369613540.output.0.DefaultFormat.npy` | `Conv2D.Default_network-WithLossCell__backbone-AlexNet_conv3-Conv2d_Conv2D-op12.0.0.1623124369613540.output.0.DefaultFormat.npy` |
- -- Interface name: The `saved_data` field in the Dump function configuration file is `"statistic"`. - - Changes: By default, `'max'`, `'min'`, `'avg'`, `'count'`, `'negative zero count'`, `'positive zero count'`, `'nan count'`, `'negative inf count'` ,`'positive inf count'`,`'zero count'` and `'md5'`. In the 2.3 version, the `'max'`, `'min'`, and `'l2norm'` statistical items are saved by default. You can customize statistical items by configuring `'statistic_category'`. +#### Converter and runtime -### Contributors +1. Optimize TDNN-like streaming models by reusing the result of the last inference. +2. Support dynamic filter Convolution. +3. Support serializing float32 weights as float16 weights to reduce the size of the model file. +4. Provide a unified runtime API so that developers can reuse their code between the cloud side and the end side. +5. Developers can now configure built-in passes as custom passes. +6. Users can now specify the format and shape of model inputs when converting a model. +7. Support multi-device inference, including CPU, NPU, and GPU. Users can set devices in mindspore::Context. +8. Support mixed-precision inference. Users can set the inference precision through the LoadConfig API. +9. Support custom operator registration and enable inference on third-party hardware.
-caifubi;candanzg;ccsszz;chaiyouheng;changzherui;chenfei_mindspore;chengbin;chengfeng27;Chong;dairenjie;DavidFFFan;DeshiChen;dingjinshan;douzhixing;emmmmtang;Erpim;fary86;fengyixing;fuhouyu;gaoyong10;GuoZhibin;guozhijian;halo;haozhang;hejianheng;Henry Shi;horcham;huandong1;huangbingjian;Jackson_Wong;jiangchenglin3;jiangshanfeng;jiangzhenguang;jiaorui;bantao;jiaxueyu;jijiarong;JuiceZ;jxl;kairui_kou;lanzhineng;LiangZhibo;lichen;limingqi107;linqingke;liubuyu;liujunzhu;liuluobin;liyan2022;liyejun;LLLRT;looop5;lujiale;luochao60;luoyang;lvxudong;machenggui;maning202007;Margaret_wangrui;master_2;mengyuanli;moran;Mrtutu;NaCN;nomindcarry;panzhihui;pengqi;qiuyufeng;qiuzhongya;Renyuan Zhang;shaoshengqi;Shawny;shen_haochen;shenhaojing;shenwei41;shij1anhan;shilishan;shiziyang;shunyuanhan;shuqian0;TAJh;tanghuikang;tan-wei-cheng;Thibaut;tianxiaodong;TronZhang;TuDouNi;VectorSL;wang_ziqi;wanghenchang;wangjie;weiyang;wudawei;wujiangming;wujueying;XianglongZeng;xiaotianci;xiaoxin_zhang;xiaoxiongzhu;xiaoyao;XinDu;xuxinglei;yangchen;yanghaoran;yanglong;yangruoqi713;yangzhenzhang;yangzishuo;Yanzhi_YI;yao_yf;yefeng;yide12;YijieChen;YingLai Lin;yuchaojie;YuJianfeng;zangqx;zhaiyukun;zhangminli;zhangqinghua;ZhangZGC;zhengxinQian;zhengzuohe;zhouyaqiang0;zhuguodong;zhupuxu;zichun_ye;zjun;zlq2020;ZPaC;zuochuanyong;zyli2020;阿琛;狄新凯;范吉斌;冯一航;胡彬;宦晓玲;黄勇;康伟;雷仪婧;李良灿;李林杰;刘崇鸣;刘力力;刘勇琪;刘子涵;吕浩宇;王禹程;熊攀;徐安越;徐永飞;俞涵;张王泽;张栩浩;郑裔;周莉莉;周先琪;朱家兴;邹文祥 +#### ARM backend optimization -Contributions of any kind are welcome! +1. Support the nchw data format of some Operators, such as Conv, InstanceNorm, etc. The performance of some models convertered from onnx and caffe is greatly improved. +2. Fix bugs of memory leak on NPU. -## MindSpore 2.3.0-rc2 Release Notes +#### Post quantization -### Major Features and Improvements +1. Weight quantization supports mixed bit quantization. +2. Full quantization supports data pre-processing. +3. Adjust the quantization parameters from the command line to the configuration file. 
-#### AutoParallel +#### Training on Device -- [STABLE] Transpose/Sub/Add/Mul/Div/ReLU/Softmax/Sigmoid supports layout configuration. -- [STABLE] The collective communication precision will affect network convergence. The configuration item [force_fp32_communication](https://www.mindspore.cn/docs/en/r2.3.0rc2/api_python/mindspore/mindspore.set_auto_parallel_context.html) is provided in the interface mindspore.set_auto_parallel_context. When set to True, the communication type of the reduce communication operator can be forced to be converted to float32. -- [BETA] Pipeline parallel support Interleave. Optimize the performance when micro batch is limited. -- [BETA] Optimize checkpoint transformation speed when using pipeline parallel, support single stage transform. +1. Unify the Lite external API with MindSpore. +2. Implement a static memory allocator and a common workspace for TOD, saving 10-20% of memory. +3. Provide getgradients and setgradients interfaces, plus interfaces to get and set optimizer params, to support MOE Model. +4. Support user-specified output nodes when exporting an IOD Model. +5. Support more text networks (tinybert, albert) and operators. -#### PyNative +#### Codegen -- [BETA] Support [recompute](https://www.mindspore.cn/docs/en/r2.3.0rc2/api_python/mindspore/mindspore.recompute.html) on PyNative mode. -- [STABLE] Support [register_hook](https://www.mindspore.cn/docs/en/r2.3.0rc2/api_python/mindspore/Tensor/mindspore.Tensor.register_hook.html#mindspore.Tensor.register_hook) on PyNative mode. +1. Support kernel registration for custom ops. Third-party hardware such as NNIE can be accessed through it. ### API Change -Add timeout environment variables in [dynamic networking](https://www.mindspore.cn/tutorials/experts/en/r2.3.0rc2/parallel/dynamic_cluster.html) scenarios: - -- `MS_TOPO_TIMEOUT`: Cluster networking phase timeout time in seconds. -- `MS_NODE_TIMEOUT`: Node heartbeat timeout in seconds. -- `MS_RECEIVE_MSG_TIMEOUT`: Node timeout for receiving messages in seconds.
- -Added new environment variable `MS_ENABLE_LCCL` to support the use of LCCL communication library. - -### Bug Fixes +#### API Incompatible Change -- [#I9CR96](https://gitee.com/mindspore/mindspore/issues/I9CR96) Fix the issue of insufficient timeout time causing failure for dynamic networking startup in large-scale clusters. -- [#I94AQQ](https://gitee.com/mindspore/mindspore/issues/I94AQQ) Fixed the problem of incorrect output shape of ops.Addcdiv operator in graph mode. +##### C++ API ### Contributors Thanks goes to these wonderful people: -bantao,caifubi,changzherui,chenfei_mindspore,chenweifeng,dairenjie,dingjinshan,fangzehua,fanyi20,fary86,GuoZhibin,hanhuifeng,haozhang,hedongdong,Henry Shi,huandong1,huangbingjian,huoxinyou,jiangchenglin3,jiangshanfeng,jiaorui,jiaxueyu,jxl,kairui_kou,lichen,limingqi107,liuluobin,LLLRT,looop5,luochao60,luojianing,maning202007,NaCN,niyuxin94520,nomindcarry,shiziyang,tanghuikang,TronZhang,TuDouNi,VectorSL,wang_ziqi,wanghenchang,wudawei,XianglongZeng,xiaoxiongzhu,xiaoyao,yanghaoran,Yanzhi_YI,yao_yf,yide12,YijieChen,YingLai Lin,yuchaojie,YuJianfeng,zangqx,zhanghanLeo,ZhangZGC,zhengzuohe,zhouyaqiang0,zichun_ye,zjun,ZPaC,zyli2020,冯一航,李林杰,刘力力,王禹程,俞涵,张栩浩,朱家兴,邹文祥 +Adel, AGroupofProbiotocs, anthonyaje, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, dong-li001, eric, Eric, fary86, fuzhiye, Gaoxiong, GAO_HYP_XYJ, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, heleiwang, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Islam Amin, Jesse, , Jiabin Liu, jianghui58, jiangzhiwen, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, Jonathan, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, 
kswang, laiyongqiang, leonwanghui, Li, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, Lin Xh, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, Lixia, lixian, liyanliu, liyong, lizhenyu, luopengting, luoyang, lvchangquan, lvliang, lz, mahdi, Mahdi, maning202007, Margaret_wangrui, mayang, mengyuanli, Ming_blue, nhussain, ougongchang, panfengfeng, panyifeng, Payne, Peilin, peixu_ren, Pengyongrong, qianlong, qianjiahong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, , Wan, wandongdong, wangdongxu, wangmin, wangnan39@huawei.com, wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wudenggang, wukesong, wuweikang, wuxuejian, Xiao Tianci, Xiaoda, xiefangqi, xinyunfan, xuanyue, xulei2020, Xun, xuyongfei, yanghaitao, yanghaitao1, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, yuximiao, zengzitao, Zhang, zhanghaibo5@huawei.com, zhanghuiyao, zhanghui_china, zhangxinfeng3, zhangyihui, zhangz0911gm, zhanke, zhanyuan, zhaodezan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, Zhenglong Li, zhiqwang, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, Zichun, Zirui, Ziyan, zjun, ZPaC, wangfengwfwf, zymaa, gerayking. Contributions of any kind are welcome! -## MindSpore Lite 2.3.0-rc2 Release Notes +## MindSpore Lite 1.3.0 ### Major Features and Improvements -- [STABLE] Support the configuration of FlashAttention related properties in the configuration file used by the cloud-side conversion tool. -- [STABLE] Support multi-devices memory sharing. +#### Converter and runtime -### Contributors +1. Support Caffe model running on Hi3516D. +2. Support delegate mechanism to run your models(part or whole) on user specified executor. +3. Support control flow models. +4. 
Support cross-compiling for iOS, so that models can run inference on iOS devices. -Thanks goes to these wonderful people: +#### x86 backend optimization -emmmmtang,熊攀 +1. Optimize kernels for x86 using Advanced Vector Extensions (AVX). -Contributions of any kind are welcome! +#### ARM backend optimization -## MindSpore 2.3.0-rc1 Release Notes +1. Optimize fp16 kernels. +2. Support arm32 fp16 instruction acceleration on ARMv8.2. -### Major Features and Improvements +#### Cuda backend optimization -#### DataSet +1. Support the NV GPU backend based on the delegate mechanism (using TensorRT as the delegate). -- [STABLE] Support integrity check, encryption and decryption check for MindRecord to protect the integrity and security of user data. -- [STABLE] MindRecord api changes: FileWriter.open_and_set_header is deprecated since it has been integrated into FilterWriter, if the old version code reports an error, delete this call; Add type checking for data in FileWriter to ensure that the data type defined by the Schema matches the real data type; The return value of all methods under Mindrecord are removed, replaced by an exception when processing error is occurred. -- [STABLE] Support Ascend processing backend for the following transforms: ResizedCrop, HorizontalFlip, VerticalFlip, Perspective, Crop, Pad, GaussianBlur, Affine. -- [STABLE] Optimized the content of data processing part in model migration guide, providing more examples to compare with third-party frameworks. -- [STABLE] Optimized the parsing efficiency of TFRecordDataset in multiple data columns scenario, improving the parsing performance by 20%. +#### OpenCL backend -#### PIJIT +1. Optimize the strategy of workgroup and blocksize to improve performance. +2. Support OpenCL dynamic infershape. +3. Support INT32 type ops. -- [BETA]PIJit analyzes and adjusts the Python bytecode and performs graph capture and graph optimization on the execution flow.
Supported Python codes are executed in static graph mode, and unsupported ones are divided into subgraphs and executed in dynamic graph mode, automatically achieving dynamic and static unification. Users can enable the PIJit function by decorating the function with @jit(mode="PIJit", jit_config={options:value}). +#### Post quantization -#### Inference +1. Support fp32 training model converts to quantization training model. -- [DEMO] The integrated architecture of large model inference, upgrade, training, and promotion unifies scripts, distributed policies, and runtime. The period from training to inference deployment of typical large models is reduced to days. Large operators are integrated to reduce the inference latency and effectively improve the network throughput. +#### Training on Device -#### AutoParallel +1. Support fp32 training model export to quantization model after training process end. +2. Unify APIs and output package name of training and inference. +3. Simplify implementation of Train Session. +4. Optimize train and infer compile, reduce libmindspore-lite-train.so memory. +5. Training memory optimization: memory reduce 10-50% compare with r1.2. +6. Training performance optimization: for 1*1 special input shape Cov2DGradInput and SparseSoftmaxCrossEntropyWithLogits operator optimization, improved 10%-20%. +7. Support more networks(transformer, albert). -- [STABLE] Add msrun startup method to launch distributed job with single instruction. -- [STABLE] Add to be deprecated hint for RankTable startup method. -- [STABLE] Eliminate redundant constants in graph mode to improve compilation performance and memory overhead. -- [STABLE] The subgraph scenario optimizer parallelizes the first subgraph inline, allowing some computation and communication masking under pipeline parallelism to be performed. 
-- [STABLE] Communication information export: export model communication information (communication domain, communication volume) during compilation, and input it to the cluster as the basis for communication scheduling. -- [STABLE] Pipeline parallel inference is optimized, eliminates shared weights forwarding between stages, improving execution performance. Supports automatic broadcast of pipeline inference results, improving the usability of autoregressive inference. -- [STABLE] Operator-level parallel sharding supports the configuration of the mapping between the device layout and tensor layout during MatMul/Add/LayerNorm/GeLU/BiasAdd operator sharding. -- [STABLE] Supports gradient communication and backward calculation overlapping in the data parallel dimension. -- [STABLE] Single device simulation compilation, used to simulate the compilation process of a certain device in multi device distributed training, assisting in analyzing the compilation processes and memory usage on the front and back ends. -- [STABLE] Implement ops.Tril sharding to reduce the memory and performance requirements on a single device. -- [BETA] Supports the fusion between communication operators and computing operators, in order to overlap communication overheads with computation and improve network performance. -- [BETA] Load checkpoints and compile graphs in parallel to accelerate fault recovery. +#### Codegen -#### Runtime +1. Support deployment on HarmonyOS for device. -- [BETA] Support O0/O1/O2 multi-level compilation to improve static graph debugging and tuning capabilities. +### API Change -#### FrontEnd +#### API Incompatible Change -- [STABLE] The framework supports the bfloat16 data type. dtype=mindspore.bfloat16 can be specified when a tensor is created. -- [STABLE] The syntax support capability of the rewrite component is optimized, syntaxs such as class variables, functions, and control flows can be parsed. -- [STABLE] New context setting: debug_level. 
User can use mindspore.set_context(debug_level=mindspore.DEBUG) to get more debug information. +##### C++ API -#### Profiler +###### Unify LiteSession and TrainSession, Merge LiteSession And TrainSession.([!17356](https://gitee.com/mindspore/mindspore/pulls/17356)) -- [BETA] Dynamically start and stop profiling. Users can collect profiling data in real time according to the training situation, reducing the amount of data collected. -- [BETA] Profiling the communication operator time-consuming matrix. Users can find cluster communication performance bottlenecks by analyzing the communication operator time-consuming matrix. -- [BETA] Improve the performance of Ascend environment in parsing profiling data. -- [BETA] Supports offline analysis of data generated by Profiling. Users can collect data first and then parse the data as needed. -- [BETA] Supports collecting performance data of On-Chip Memory, PCIe, and l2_cache to enrich performance analysis indicators. +Previously, Training on Device used TrainSession while Inference on Device used LiteSession. To simplify the implementation, we moved the TrainSession functions into LiteSession as virtual functions and moved the APIs previously defined in train_session.h to lite_session.h. -#### Dump +```cpp +class MS_API LiteSession { +... 
+static LiteSession *CreateTrainSession(const std::string &filename, const lite::Context *context, + bool train_mode = false, const lite::TrainCfg *cfg = nullptr); + static LiteSession *CreateTransferSession(const std::string &filename_backbone, const std::string &filename_head, + const lite::Context *context, bool train_mode = false, + const lite::TrainCfg *cfg = nullptr); +virtual int Train() { return mindspore::lite::RET_ERROR; } +virtual int Eval() { return mindspore::lite::RET_OK; } +virtual int SetupVirtualBatch(int virtual_batch_multiplier, float lr = -1.0f, float momentum = -1.0f) { + return mindspore::lite::RET_ERROR; + } +virtual std::vector<tensor::MSTensor *> GetPredictions() const { + std::vector<tensor::MSTensor *> outputs; + return outputs; + } +... +``` -- [BETA] The statistical information saved by Dump records MD5 values, and users can determine small differences in tensor values through MD5 values. -- [BETA] Dump supports the float16 data type and supports users to locate float16 type operator accuracy issues. +###### Add Export API for Training on device, obsolete SaveToFile API.([!17356](https://gitee.com/mindspore/mindspore/pulls/17356)) -#### PyNative +Previously, Training on Device used the SaveToFile API to save the training model to a file. The Export API was added in this release to support more formats and more model types (the train or inference part of the model), and to save the weight-quantized model after training. -- [STABLE] Reconstruct the single operator calling process for dynamic graphs to improve the performance of dynamic graphs.
+```cpp +virtual int Export(const std::string &file_name, lite::ModelType model_type = lite::MT_TRAIN, + lite::QuantizationType quant_type = lite::QT_DEFAULT, lite::FormatType = lite::FT_FLATBUFFERS) { + return mindspore::lite::RET_ERROR; + } +``` -#### Ascend +###### Add GetFeatureMaps and UpdateFeatureMaps interface for Training on device.([!18344](https://gitee.com/mindspore/mindspore/pulls/18344)) -- [BETA] Support set configuration options of CANN, which are divided into two categories: global and session. Users can configure them through mindspore.set_context(Ascend_configuration={"ge_options": {"global": {"global_option": "option_value"}, "session": {"session option": "option_value"}}). +When training on the device, we may need to get and update the model feature maps, particularly in the MindSpore Federated scenario. -#### API Change +```cpp +virtual std::vector<tensor::MSTensor *> GetFeatureMaps() const { + std::vector<tensor::MSTensor *> features; + return features; + } + virtual int UpdateFeatureMaps(const std::vector<tensor::MSTensor *> &features) { return mindspore::lite::RET_ERROR; } +``` -- Add mindspore.hal API to support stream, event, and device management capabilities. -- Add mindspore.multiprocessing API to provide the capability of creating multiple processes. +#### New features -#### Operators +##### Java API -- [BETA] mindspore.ops.TopK now supports the second input k as an int32 type tensor. +###### new static method for creating LiteSession by MSConfig in LiteSession.class -### Bug Fixes +Previously, to create a LiteSession object, we needed to call two APIs: -- [#I92H93] Fixed the issue of 'Launch kernel failed' when using the Print operator to print string objects on the Ascend platform. -- [#I8S6LY] Fixed RuntimeError: Attribute dyn_input_sizes of Default/AddN-op1 is [const vector]{}, of which size is less than 0 error of variable-length input operator, such as AddN or Concat, for dynamic shape process in graph mode on the Ascend platform.
-- [#I9ADZS] Fixed the data timeout issue in network training due to inefficient dataset recovery in the fault recovery scenario. +```js +MSConfig config; +// config options ... +LiteSession liteSession = new LiteSession(); +boolean ret = liteSession.init(config); +if (!ret) { + // handle init LiteSession failed ... +} +``` -### Contributors +now we can create a LiteSession object with new API just like: -Thanks goes to these wonderful people: +```js +MSConfig config; +// config options ... +LiteSession liteSession = createSession(config); +if (liteSession == null) { + // handle create LiteSession failed ... +} +``` -AlanCheng511,AlanCheng712,bantao,Bingliang,BJ-WANG,Bokai Li,Brian-K,caifubi,cao1zhg,CaoWenbin,ccsszz,chaiyouheng,changzherui,chenfei_mindspore,chengbin,chengfeng27,chengxb7532,chenjianping,chenkang,chenweifeng,Chong,chuht,chujinjin,Cynthia叶,dairenjie,DavidFFFan,DeshiChen,douzhixing,emmmmtang,Erpim,fangzhou0329,fary86,fengxun,fengyixing,fuhouyu,gaoshuanglong,gaoyong10,GaoZhenlong,gengdongjie,gent1e,Greatpan,GTT,guoqi,guoxiaokang1,GuoZhibin,guozhijian,hangq,hanhuifeng,haozhang,hedongdong,hejianheng,Henry Shi,heyingjiao,HighCloud,Hongxing,huandong1,huangbingjian,HuangLe02,huangxinjing,huangziling,hujiahui8,huoxinyou,jiangchenglin3,jianghui58,jiangshanfeng,jiaorui,jiaxueyu,JichenZhao,jijiarong,jjfeing,JoeyLin,JuiceZ,jxl,kairui_kou,kate,KevinYi,kisnwang,lanzhineng,liangchenghui,LiangZhibo,lianliguang,lichen,ligan,lihao,limingqi107,ling,linqingke,liruyu,liubuyu,liuchao,liuchengji,liujunzhu,liuluobin,liutongtong9,liuzhuoran2333,liyan2022,liyejun,LLLRT,looop5,luochao60,luojianing,luoyang,LV,machenggui,maning202007,Margaret_wangrui,MaZhiming,mengyuanli,MooYeh,moran,Mrtutu,NaCN,nomindcarry,panshaowu,panzhihui,PingqiLi,qinzheng,qiuzhongya,Rice,shaojunsong,Shawny,shenwei41,shenyaxin,shunyuanhan,silver,Songyuanwei,tangdezhi_123,tanghuikang,tan-wei-cheng,TingWang,TronZhang,TuDouNi,VectorSL,WANG 
Cong,wang_ziqi,wanghenchang,wangpingan,wangshaocong,wangtongyu6,weiyang,WinXPQAQ,wtcheng,wudawei,wujiangming,wujueying,wuweikang,wwwbby,XianglongZeng,xiaosh,xiaotianci,xiaoxin_zhang,xiaoxiongzhu,xiaoyao,XinDu,xingzhongfan,yanghaoran,yangluhang,yangruoqi713,yangzhenzhang,yangzishuo,yanjiaming,Yanzhi_YI,yao_yf,yefeng,yeyunpeng2020,yide12,YijieChen,YingLai Lin,YingtongHu,youshu,yuchaojie,YuJianfeng,zangqx,zby,zhaiyukun,zhangdanyang,zhanghaibo,zhanghanLeo,zhangminli,zhangqinghua,zhangyanhui,zhangyifan,zhangyinxia,zhangyongxian,ZhangZGC,zhanzhan,zhaoting,zhengyafei,zhengzuohe,ZhihaoLi,zhouyaqiang0,zhuguodong,zhumingming,zhupuxu,zichun_ye,zjun,zlq2020,ZPaC,zuochuanyong,zyli2020,陈宇,代宇鑫,狄新凯,范吉斌,冯一航,胡彬,宦晓玲,黄勇,康伟,李良灿,李林杰,刘崇鸣,刘力力,刘勇琪,吕浩宇,没有窗户的小巷,王禹程,吴蕴溥,熊攀,徐安越,徐永飞,许哲纶,俞涵,张峻源,张树仁,张王泽,张栩浩,郑裔,周莉莉,周先琪,朱家兴,邹文祥 +###### new static method for creating LiteSession by ModelBuffer and MSConfig in LiteSession.class -Contributions of any kind are welcome! +Previously, to run inference on a model, we needed to call APIs like: -## MindSpore 2.2.13 Release Notes +```js +MSConfig config; +// config options ... +LiteSession liteSession = new LiteSession(); +boolean initSessionRet = liteSession.init(config); +if (!initSessionRet) { + // handle init LiteSession failed and return ... +} +Model model = new Model(); +boolean loadModelRet = model.loadModel(modelMappedByteBuffer); +if (!loadModelRet) { + // handle load model failed and return ... +} +boolean compileModelRet = liteSession.compileGraph(model); +if (!compileModelRet) { + // handle compile model failed and return ... +} +model.free(); +// liteSession is ready to inference model, call runGraph in LiteSession.class ... +``` -### API Change +Now we can use the new API like this: -Add timeout environment variables in dynamic networking scenarios: +```js +MSConfig config; +// config options ... +LiteSession liteSession = createSession(modelMappedByteBuffer, config); +if (liteSession == null) { + // handle init LiteSession failed and return ... 
+} +// liteSession is ready to inference model, call runGraph in LiteSession.class ... +``` -- `MS_TOPO_TIMEOUT`: Cluster networking phase timeout time in seconds. -- `MS_CLUSTER_RETRY_NUM`: Number of node's retrying registration during cluster networking phase. -- `MS_NODE_TIMEOUT`: Node heartbeat timeout in seconds. -- `MS_RECEIVE_MSG_TIMEOUT`: Node timeout for receiving messages in seconds. +The new createSession method integrates four old APIs: LiteSession.init, Model.loadModel, LiteSession.compileGraph and model.free. It is simpler and more efficient, as it saves one modelBuffer copy operation. -### Bug Fixes +###### new methods getFeaturesMap and updateFeatures in LiteSession.class -- [#I9CR96] Fix the issue of insufficient timeout time causing failure for dynamic networking startup in large-scale clusters. +We recently added a new C++ API to the LiteSession class and, correspondingly, a new Java API to LiteSession.java. -### Contributors +```java +public List<MSTensor> getFeaturesMap() { + List<Long> ret = this.getFeaturesMap(this.sessionPtr); + ArrayList<MSTensor> tensors = new ArrayList<MSTensor>(); + for (Long msTensorAddr : ret) { + MSTensor msTensor = new MSTensor(msTensorAddr); + tensors.add(msTensor); + } + return tensors; + } + public boolean updateFeatures(List<MSTensor> features) { + long[] inputsArray = new long[features.size()]; + for (int i = 0; i < features.size(); i++) { + inputsArray[i] = features.get(i).getMSTensorPtr(); + } + return this.updateFeatures(this.sessionPtr, inputsArray); + } +``` -Thanks goes to these wonderful people: +###### new method export to replace the saveToFile API in LiteSession.class -ZPaC, limingqi107, lizhenyu, jiangshanfeng +We recently added a new C++ API to the LiteSession class and, correspondingly, a new Java API to LiteSession.java. -Contributions of any kind are welcome! 
+```java +public boolean export(String modelFileName, int modelType, int quantizationType) { + return this.export(this.sessionPtr, modelFileName, modelType, quantizationType); + } +``` -## MindSpore 2.2.12 Release Notes +###### new train related API moved to LiteSession.class from TrainSession.class -### Major Features and Improvements +To align with the updated C++ API in the LiteSession class, we add the corresponding new Java APIs to LiteSession.java. + +```java +public class LiteSession { +... +public static LiteSession createTrainSession(String modelName, final MSConfig config, boolean trainMode){...} +public boolean train() {...} +public boolean eval() {...} +... +``` -- [STABLE] Optimize scnarios where network parameters are initialized by fp32, and optimizer parallel mode is on, reducing the amount of Cast operator. -- [STABLE] Add detection and processing capabilities to silent fault detection. Silent faults may lead to error during training procedures, this helps users to prevent or lower the cost of fault location, which caused by silent faults. +### Bug fixes -### Bug Fixes +1. Fix the bug where the train session does not release memory because of a refcount error. -- [#I97D1L] Fix ReduceLROnPlateau, LRScheduler, CosineAnnealingWarmRestarts dynamic learning rate related interface sample error. -- [#I970HV] Fix the problem where order of AllGather/ReduceScatter between two cards is not preserved. -- [#I99JPI] Fix load checkpoint for bfloat16 parameter during vague load mode. 
+#### Deprecations ### Contributors Thanks goes to these wonderful people: -yao_yf, YijieChen, 冯一航, yuchaojie, 李良灿, YuJianfeng, huangxinjing, GuoZhibin, looop5 +Adel, AGroupofProbiotocs, anthonyaje, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, dong-li001, eric, Eric, fary86, fuzhiye, Gaoxiong, GAO_HYP_XYJ, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, heleiwang, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Islam Amin, Jesse, , Jiabin Liu, jianghui58, jiangzhiwen, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, Jonathan, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, laiyongqiang, leonwanghui, Li, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, Lin Xh, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, Lixia, lixian, liyanliu, liyong, lizhenyu, luopengting, luoyang, lvchangquan, lvliang, lz, mahdi, Mahdi, maning202007, Margaret_wangrui, mayang, mengyuanli, Ming_blue, nhussain, ougongchang, panfengfeng, panyifeng, Payne, Peilin, peixu_ren, Pengyongrong, qianlong, qianjiahong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, , Wan, wandongdong, wangdongxu, wangmin, wangnan39@huawei.com, wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wudenggang, wukesong, wuweikang, wuxuejian, Xiao Tianci, Xiaoda, xiefangqi, xinyunfan, xuanyue, xulei2020, Xun, xuyongfei, yanghaitao, yanghaitao1, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, yuximiao, 
zengzitao, Zhang, zhanghaibo5@huawei.com, zhanghuiyao, zhanghui_china, zhangxinfeng3, zhangyihui, zhangz0911gm, zhanke, zhanyuan, zhaodezan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, Zhenglong Li, zhiqwang, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, Zichun, Zirui, Ziyan, zjun, ZPaC, wangfengwfwf, zymaa, gerayking. Contributions of any kind are welcome! -## MindSpore 2.2.11 Release Notes +# MindSpore 1.2.1 + +## MindSpore 1.2.1 Release Notes ### Major Features and Improvements -#### scipy +#### FrontEnd -- [STABLE] Add new API mindspore.scipy.optimize.linear_sum_assignment in scipy module to solve the linear sum assignment problem. It can find the least-cost assignment based on a given cost matrix. +- [STABLE] Add MaskedSelect aicpu operation.(Ascend) -### Bug Fixes +#### Auto Parallel -- [#I8JVRU] Fixed the problem where the results of the bernoulli random operator running twice on the GPU are probabilistically consistent. -- [#I8OC32] Fixed the segmentation fault error because the MatrixSetDiagV3 operator does not verify abnormal input. +- [STABLE] Support distributed checkpoint loading.(Ascend/GPU) -### Contributors +## MindSpore Lite 1.2.0 -Thanks goes to these wonderful people: +### Major Features and Improvements -fary86, wanghenchang, haozhang, mengyuanli, emmmmtang, luoyang, zhupuxu, zhangyongxian, liuluobin, LLLRT, TuDouNi, hujiahui8, wangtongyu6, ligan, zhuguodong, yanghaoran, YingtongHu, liyejun, zjun, 徐永飞, chuht, 张树仁, 徐安越, DeshiChen, shenyaxin, liujunzhu, shunyuanhan, yuchaojie, yao_yf, 没有窗户的小巷, yeyunpeng2020, weiyang, KevinYi, hedongdong, zhouyaqiang0, Margaret_wangrui, zhanghaibo, moran, huangziling, 朱家兴, GuoZhibin, 李良灿, jiaxueyu, gaoyong10, Greatpan, 宦晓玲, melody, 俞涵, jiangshanfeng, XinDu, ling, caifubi, zhangyinxia, gengdongjie, Erpim, XianglongZeng, zhangminli, fengyixing, 冯一航, 黄勇, panzhihui, 胡彬, linqingke, wangshaocong +#### Converter and runtime -Contributions of any kind are welcome! +1. 
Support TensorFlow models in Converter, except aware-training models. +2. Add fusion pattern for same horizontal operators in Converter. +3. Support Jar on x86_64 systems for convenient integration into servers with a Java backend. +4. Provide a unified runtime API so developers can reuse their code between cloud side and end side.[BETA] +5. Improve control-flow capabilities continually: Support GRU fusion in Converter; Support weight-quant for control-flow model; Support control-flow model inference with half precision; Support nested control-flow model.[BETA] -## MindSpore Lite 2.2.11 Release Notes +#### ARM backend optimization -### Bug Fixes +1. Add NLP dependent float16 operators (like lstm) to enhance inference performance. +2. Optimize operators: lstm, gru, depthwise. +3. Add 6 NPU operators (like FullConnection), and fix some buildIR failures. -- [#I8TPLY] Fixed SSD MobileNetV2 FPN network inference error on Atlas inference series products. +#### OpenCL backend -### Contributors +1. Add 10+ new ops, for a total of 72 ops. +2. Performance optimization: memory layout optimization and block tiling improve performance by 30% compared to version 1.1 on Adreno GPU. +3. Initialization time optimization: storing the kernel cache as binary improves initialization time by 100% compared to MindSpore Lite version 1.1. +4. Support Java call on Mali or Adreno GPU. -Thanks goes to these wonderful people: +#### Post quantization -wangtongyu6, zhuguodong, 徐永飞, 徐安越, yeyunpeng2020, moran, XinDu, gengdongjie. +1. Support quantization of gather and lstm ops. +2. Support quantizing TF Lite models with sub-graph node. +3. Add a quantization strategy that decides whether to quantize each op, giving less accuracy loss and a higher compression rate. -Contributions of any kind are welcome! +#### Training on Device -## MindSpore 2.2.10 Release Notes +1. Virtual batching: use mini-batches to mimic a large batch in theory, with low RAM consumption. +2. 
Unified converter: the tod and iod converters no longer need to be compiled separately. +3. 
Performance optimization to BWD ops. +4. TrainLoop with Off-The-Shelf Functionality blocks, like LR scheduler, Loss Monitor, Ckpt Saver, Accuracy Monitor. +5. Integration of code with Minddata lite. +6. Support more networks (googlenet, densenet, shufflenetv2, nin, vgg) and operators. -### Major Features and Improvements +#### Codegen -#### Operators +1. Support 79 ops for the ARM platform and all CMSIS ops for Arm Cortex-M Series. +2. Multiplatform support, including Android and IoT devices. +3. Support offline model weight preprocessing while compiling. +4. Support offline memory reuse computing for minimum runtime buffer size. +5. Support kernel register for custom op. Third-party hardware like NNIE can be accessed through it. -- [STABLE] FastGelu, BatchMatMul, AllReduce, AllGather, Broadcast, ReduceScatter support bfloat16 data type -- [STABLE] AllGather support uint8 data type +### API Change -### Bug Fixes +#### API Incompatible Change -- [#I8ALW3] Fixed networks including Faster R-CNN, DeepText, MaskRCNN-ResNet50, which had errors while training RandomChoiceWithMask operator in Ascend 8P scenario. -- [#I8LKG7] Fixed graph compilation error of UNet-2D in Ascend 1P/8P scenario. -- [#I8KU3X] Fixed CRNN-ResNet34 network, which stuck in training phase in Ascend 1P/8P PyNative mode. -- [#I8KTHH] Fixed BERT network error when training without allreduce grouped fusion with enable_parallel_optimizer=True, in Ascend 8P scenario. +##### C++ API -### Contributors +###### Add header file named lite_types.h for some common data structs. ([!12262](https://gitee.com/mindspore/mindspore/pulls/12262)) -Thanks goes to these wonderful people: +Previously, some common data structs such as `CpuBindMode` and `DeviceType` were defined in context.h, which could cause cross-dependencies between headers. We therefore created a new header, lite_types.h, for common data structs and moved `CpuBindMode` and `DeviceType` from context.h into it. 
-李林杰, TuDouNi, chengxb7532, Henry Shi, rms-infer-type, 朱家兴, zhouyaqiang0, tanghuikang, gaoyong10, gengdongjie, yao_yf, hujiahui8, hanhuifeng, shenyaxin, KevinYi, 冯一航, chengfeng27, JuiceZ, zhangyanhui, jijiarong, xiaoxiongzhu, 没有窗户的小巷, ling, liyan2022, haozhang, zangqx, xiaoyao, liujunzhu, 胡彬, panzhihui, wangshaocong, linqingke, jianghui58, qiuzhongya, yangruoqi713, zhangminli, moran, 王禹程, shaojunsong, wangtongyu6, zhupuxu, luoyang, 徐安越, qinzheng, caifubi, 徐永飞, chenkang, youshu, XinDu, liubuyu, jxl, yeyunpeng2020, huoxinyou, yefeng, jiaorui, wangpingan, cao1zhg, zjun, zyli2020, yanjiaming, Cynthia叶, 胡安东, 李良灿, liruyu, liuluobin, lihao, huangbingjian, YijieChen, jjfeing, looop5, 刘力力, xiaoxin_zhang, yangluhang, chenweifeng, jiangshanfeng, zichun_ye, 陈宇, NaCN, ligan, YingLai Lin, huangziling, chenjianping, DeshiChen, chengbin, kairui_kou, ccsszz, yanghaoran, zhangdanyang, Yanzhi_YI, zhengzuohe, hangq, TronZhang, wanghenchang, HighCloud, 吕浩宇, VectorSL, ZPaC, mengyuanli, maning202007, 刘勇琪, r1chardf1d0, fary86, 刘崇鸣, yuchaojie, douzhixing, fengyixing + + + + + + + +
lite_types.h
-Contributions of any kind are welcome! +```cpp +namespace mindspore::lite { +/// \brief CpuBindMode defined for holding bind cpu strategy argument. +typedef enum { + NO_BIND, /**< no bind */ + HIGHER_CPU, /**< bind higher cpu first */ + MID_CPU /**< bind middle cpu first */ +} CpuBindMode; -## MindSpore Lite 2.2.10 Release Notes +/// \brief DeviceType defined for holding user's preferred backend. +typedef enum { + DT_CPU, /**< CPU device type */ + DT_GPU, /**< GPU device type */ + DT_NPU /**< NPU device type */ +} DeviceType; +} // namespace mindspore::lite +``` -### Bug Fixes +
-- [#I8K7CC] Optimize error message when non-string segments are passed to get_model_info. +###### Add some new interfaces in ms_tensor.h for unified runtime API.([!13515](https://gitee.com/mindspore/mindspore/pulls/13515)) -### Contributors +Previously, users could not create or modify an `MSTensor`; all `MSTensor` objects were created and managed by the framework. However, users sometimes need to create or modify an `MSTensor`, for example when pre-processing input data. So we provide two new interfaces in ms_tensor.h: `CreateTensor` for creating an `MSTensor` and `set_shape` for modifying the shape of an `MSTensor`. -Thanks goes to these wonderful people: + + + + + + + +
CreateTensor
-gengdongjie, zhangyanhui, xiaoxiongzhu, wangshaocong, jianghui58, moran, wangtongyu6, 徐安越, qinzheng, 徐永飞, youshu, XinDu, yeyunpeng2020, yefeng, wangpingan, zjun, 胡安东, 刘力力, 陈宇, chenjianping, kairui_kou, zhangdanyang, hangq, mengyuanli, 刘崇鸣 +```cpp +/// \brief Create an MSTensor. +/// +/// \return Pointer to an instance of MindSpore Lite MSTensor. +static MSTensor *CreateTensor(const std::string &name, TypeId type, const std::vector<int> &shape, const void *data, + size_t data_len); +``` -Contributions of any kind are welcome! - -## MindSpore 2.2.1 Release Notes - -### Bug Fixes - -- [#I7R3R5] Fixed the problem that the network precision of the ResNet-50 on the Ascend platform deteriorates. -- [#I8A9RH] Fixed an issue where the DBNet(ResNet-50) network precision on the Ascend platform deteriorates. -- [#I8B8IW] Fixed the segment error caused by out-of-bounds multi-dimensional tensor assignment. -- [#I8J0F4] Fixed an issue where the multidimensional Tensor extension dimension fails to be executed in the dynamic graph. -- [#I87P3P] Fixed an issue where the compilation cache fails to be loaded during secondary training on the Ascend platform. -- [#I86GP9] Fixed an issue where the UNet3D network inference precision deteriorates on the Ascend platform. -- [#I89B4K] Fixed an issue where the dynamic rank execution of dynamic graphs on the Windows platform is suspended. -- [#I8CX0C] Fixed an issue where dynamic images occasionally fail in mixed precision mode on the Ascend platform. -- [#I8BGCF] Fixed an issue where a segment error occurs when the command is executed in dynamic diagram mode of the AirNet network on the Ascend platform. -- [#I8L5DS] Fixed an issue where the ResNet-50 image segmentation network dynamic image is executed slowly on the Ascend platform. 
- -### Contributors - -Thanks goes to these wonderful people: - -yufan, dingcheng, lvzhangcheng, zhunaipan, fangwenyi, weiyang, changzherui, chujinjin, zangqingxiang, yuchaojie, wuweikang, tanghuikang, xiaoyao, huangbinjian, zhoupeichen, chenfei_mindspore, hedongdong, wangnan, zhengzuohe, yanghaoran, zouliqin, luoyang, liuchongmin, lujiale, machenggui, wangcong, lixiangyi, wangting, huangyong - -Contributions of any kind are welcome! - -## MindSpore Lite 2.2.1 Release Notes - -### Bug Fixes - -- [#I88055] Fixed a function issue caused by incorrect format setting of the gridsample operator in MindSpore Lite inference. -- [#I8D80Y] The MindSpore Lite inference single-operator invoking process resources are not released and exits abnormally. - -### Contributors - -Thanks goes to these wonderful people: - -zhanghaibo, wangsiyuan, wangshaocong, chenjianping - -Contributions of any kind are welcome! - -## MindSpore 2.2.0 Release Notes - -### Major Features and Improvements - -#### DataSet - -- [STABLE] The `row_size` parameter of data operation map/batch is extended to support passing list, which stands for [Input Shared Memory, Output Shared Memory], so as to flexibly control the size of shared memory in multi-process mode. -- [STABLE] Provide 100% mindspore.dataset and mindspore.dataset.transforms samples for reference. -- [STABLE] ConcatDataset supports global sampling. After combining data from multiple sources using concat operation, data can be globally sampled randomly to enhance data diversity. -- [STABLE] When the model.train API is used for training, TimeMonitor(.., data_time=True) can be used to monitor data processing performance in real time. -- [STABLE] Introduced the jemalloc library to solve the problem of slow memory rise due to untimely memory debris recovery in extreme scenarios. 
- -#### FrontEnd - -- [STABLE] Support adding decorator @lazy_inline to make a graph generated from cell being inlined lazily, which can improve the compilation performance effectively. -- [STABLE] Optimize the function of mixed precision training, support automatic rewriting of Python scripts through rewrite to achieve mixed precision strategies, and support automatic parsing of functions, branch statements, and other syntax. -- [STABLE] Mixed precision function optimization, ReWrite supports syntax parsing of class functions and branch statements, and extends O1 functionality. -- [STABLE] Optimize the dynamic learning rate function and add APIs such as MultiStepLR; function get_lr and global_step decoupling, extending optimizer module functionality. -- [STABLE] Optimize API code samples, API difference tables, and tutorials for using higher-order functions. - -#### Operator - -- [STABLE] Add new operator primitive `mindspore.ops.Dense`. -- [STABLE] Add the random number operator state management feature, which allows the random number operator to save the state of the random number, and can be stably reproduced in scenarios such as model parallelism and recalculation. Currently, it only supports CPU/GPU platforms, and the involved random number operators include: `mindspore.ops.Multinomial`, `mindspore.ops.MultinomialWithReplacement`, `mindspore.ops.ParameterizedTruncatedNormal`, `mindspore.ops.StandardLaplace`, `mindspore.ops.StandardLaplace`, `mindspore.ops.Uniform`, `mindspore.ops.UniformInt`, `mindspore.ops.UniformReal`, `mindspore.ops.UniformInt`, `mindspore.ops.Dropout`, `mindspore.ops.RandomChoiceWithMask`, `mindspore.ops.RandomCategorical`, `mindspore.ops.RandomShuffle`, `mindspore.ops.RandamGamma`, `mindspore.ops.RandomPoisson` and `mindspore.ops.TruncatedNormal`. 
-- [STABLE] When a GPU operator encounters an illegal input scenario, it supports asynchronously printing error logs in the CUDA kernel of the operator to the Host side and interrupting the execution of the current CUDA Stream, improving the efficiency of user operator problem positioning. - -#### PyNative - -- [STABLE] Support viewing mechanism in PyNative mode. -- [STABLE] Function enhancement in PyNative mode: sens supports dict input type. - -#### Ascend - -- [STABLE] Supports user configurable operator high-precision/high-performance mode, users can use `context.set_context(ascend_config={"op_precision_mode": "/path/to/op_precision_config_file"})` to configure high-precision/high-performance modes for some TBE operators. -- [BETA] Supports user configurable operators for fp16-in and fp32-out, users can use `context.set_context(ascend_config={"precision_mode": "force_fp32"})` to configure fp16-in and fp32-out for the TBE Cube operators. -- [BETA] Remove the strong binding between `jit_level="O3"` and GE processes, so users no longer need to set `jit_level="O3"` when executing GE processes. - -#### Parallel - -- [STABLE] Support the gradient accumulation feature in non-pipeline parallel scenarios in semi-automatic/fully automatic mode. Users can enable gradient accumulation by writing `net = GradAccumulationCell(net, micro_size)`. The gradient accumulation feature is compatible with the lazy_inline feature. - -#### Inference - -Since version 2.2, the MindSpore main release package does not provide the inference interface enabling for the Ascend 310. If you need to use the inference interface, install the MindSpore Lite release package or download the MindSpore version earlier than 2.0. For details about how to install and use MindSpore Lite, see . HUAWEI Ascend 310 (Ascend) is an energy-efficient and highly integrated AI processor for edge scenarios. It supports inference on MindIR models. 
In the earlier version, MindSpore provides two methods for enabling inference on the Ascend 310 hardware: - -1. The MindSpore main release package provides the matching Ascend 310 version that supports C++ inference interfaces. -2. The MindSpore Lite release package provides the matching Ascend version and supports C++ and Java inference. - -The C++ APIs provided by the two solutions are basically the same. In the future, MindSpore Lite is used instead of building and maintaining two sets of interfaces. The original 310 inference service built based on the MindSpore main release package can be switched to MindSpore Lite with a few modifications. For details, see . - -### Bug fixes - -- [I7SDA0] Fixed an issue where the accuracy of the CRNN network deteriorates on the NES platform. -- [I7T4QK] Fixed an issue where the inference precision of the WGAN network deteriorates on the OptiX OSN 8800 platform. -- [I7TJ8Z] Fixed an issue where the inference precision of the LGTM network deteriorates on the OptiX OSN 8800 platform. -- [I7M58O] Fixed ASR-dynamic network training core dump issue on Ascend platform. -- [I7L6B6] Fixed an issue where child processes do not exit in some scenarios when dataset is in multi-process mode. -- [I7L7AE] Fixed an issue where dataset pipeline contains repeat operations and dynamic batchinfo.get_epoch_num() is incorrectly used in dataset.batch. -- [I7UY7G] Rectify the file permission modification error in OBSMindDataset. 
- -### Contributors - -Thanks goes to these wonderful people: -bantao, Bingliang, BJ-WANG, Brian-K, caifubi, ccsszz, changzherui, chenfei_mindspore, chengfeng27, chenhaozhe, chenjianping, chenkang, chenweifeng, chuht, chujinjin, CShu0507, Cynthia叶, DeshiChen, douzhixing, Erpim, Etienne, fary86, fengxun, fengyixing, gaoshuanglong, Gaoxiong, gaoyong10, GaoZhenlong, Greatpan, GuoZhibin, guozhijian, hangq, hanhuifeng, haozhang, hedongdong, Henry Shi, HighCloud, Hongxing, huangbingjian, huanghui, huangxinjing, huangziling, hujiahui8, huoxinyou, HWalkingMan, jianghui58, jiangshanfeng, jiaorui, jijiarong, jjfeing, JuiceZ, jxl, KevinYi, kisnwang, KXiong, lanzhineng, Li Qingguo, LiangZhibo, lianliguang, ligan, lihao, Lihoon, limingqi107, ling, linqingke, liruyu, liubuyu, liuchao, liujunzhu, liuluobin, liupeng303, liutongtong9, liyan2022, liyejun, looop5, luochao60, luojianing, luoyang, machenggui, maning202007, Margaret_wangrui, MaZhiming, mengyuanli, moran, NaCN, nomindcarry, panshaowu, panzhihui, qinzheng, qiuzhongya, r1chardf1d0, shaojunsong, shenwei41, shenyaxin, shenzhangyi, Shira Zaloshinski, shunyuanhan, tangdezhi_123, tanghuikang, tan-wei-cheng, tan-wei-cheng-3260, TronZhang, TuDouNi, VectorSL, wang_ziqi, wanghenchang, wangpingan, wangshaocong, wangtongyu6, wtcheng, wujueying, XianglongZeng, xiaotianci, xiaoxin_zhang, xiaoxiongzhu, xiaoyao, xiaoyuanyuan, XinDu, xujinliang, xupan, yanghaoran, yangluhang, yangruoqi713, yangsijia, yangzhenzhang, yangzishuo, yanjiaming, Yanzhi_YI, yao_yf, yefeng, yeyunpeng2020, yide12, YijieChen, YingLai Lin, YingtongHu, yonibaehr, youshu, yuchaojie, YuJianfeng, zangqx, zhaizhiqiang, zhangbuxue, zhangchunlei, zhangdanyang, zhangdong, zhanghaibo, zhangminli, zhangqi, zhangqinghua, zhangyanhui, zhangyifan, zhangyongxian, zhangzhen, zhangzheng, zhanzhan, zhengzuohe, ZhihaoLi, zhoufeng, zhouyaqiang0, zhuguodong, zhupuxu, zichun_ye, zjun, ZPaC, zuochuanyong, zyli2020, 陈宇, 程超, 范吉斌, 冯浩, 冯一航, 胡彬, 宦晓玲, 黄勇, 雷元哲, 黎冠新, 李良灿, 李林杰, 刘崇鸣, 刘力力, 刘思铭, 刘勇琪, 
吕浩宇, 没有窗户的小巷, 沈竞兴, 王禹程, 王振邦, 徐安越, 徐永飞, 俞涵, 张澍坤, 周超, 朱家兴 - -Contributions of any kind are welcome! - -## MindSpore Lite 2.2.0 Release Notes - -### Major Features and Improvements - -#### FlashAttention Operator Fusion - -- [STABLE] The OptiX OSN Ascend series supports the FlashAttention large operator fusion of the LLAMA and stable diffusion models. - -## MindSpore 2.1.1 Release Notes - -### Bug fixes - -- [I7Q9RX] The Ascend platform supports adaptive identification of different hardware types. -- [I7SDA0] Fixed an issue where the accuracy of the CRNN network deteriorates on the NES platform. -- [I7T4QK] Fixed an issue where the inference precision of the WGAN network deteriorates on the OptiX OSN 8800 platform. -- [I7TJ8Z] Fixed an issue where the inference precision of the LGTM network deteriorates on the OptiX OSN 8800 platform. - -### Contributors - -Thanks goes to these wonderful people: - -changzherui, chenfei_mindspore, chenjianping, chenkang, chenweifeng, chujinjin, fangwenyi, GuoZhibin, guozhijian, hangq, hanhuifeng, haozhang, hedongdong, You Shu, Zhou Feng, Dai Yuxin - -Contributions of any kind are welcome! - -## MindSpore Lite 2.1.1 Release Notes - -### Major Features and Improvements - -- [STABLE] MindSpore Lite Cloud Inference adds support for Python 3.8 and Python 3.9 - -## MindSpore 2.1.0 Release Notes - -### Major Features and Improvements - -#### FrontEnd - -- [BETA] JIT Fallback supports variable scenarios. In static graph mode, JIT Fallback supports return of Dict type and Scalar type, supports property setting of non-Parameter type objects, supports partial in-place modification operations of List, and supports third-party libraries such as NumPy. Moreover, it supports related operations of user-defined classes and supports Python basic operators and built-in functions to use more data types. It is compatible with features like control flow, side effects, automatic differentiation. 
For more details, please refer to [Static Graph Syntax Support](https://www.mindspore.cn/docs/en/r2.1/note/static_graph_syntax_support.html). - -- [BETA] In static graph mode, the error message of using undefined variables in the control flow scene is optimized. When using variables defined in if, while, and for control flow branches, the variables need to be initialized and defined before the control flow. - -- [STABLE] Add module ReWrite, support the ability to modify multiple network in batches based on customized rules. - -- [BETA] Add optim_ex module for optimizers, extend the current functionality, support parameter grouping for every parameter in the optimizer, and support parameter modification by assignment while training. - -- [STABLE] Optimize PyTorch and MindSpore API Mapping Table, specify the differences between APIs among functionality, parameter, input, output and specialized cases. - -#### PyNative - -- Optimize the performance of dynamic shape scenes in PyNative mode. - -#### DataSet - -- [STABLE] Optimize the memory structure of MindRecord data files. Memory consumption can be reduced 60% when loading 100TB+ data for training. -- [STABLE] Support single-thread execution of data processing pipeline, and users can add code in the data pipeline for debugging. -- [STABLE] Optimize the performance of TFRecordDataset to improve the performance of dataset loading by 60%+. Optimize the performance of batch to improve the performance by 30% for the scenarios with large number of batch. -- [STABLE] Optimize API documentation of [mindspore.dataset](https://www.mindspore.cn/docs/en/r2.1/api_python/mindspore.dataset.html) and [mindspore.dataset.transforms](https://www.mindspore.cn/docs/en/r2.1/api_python/mindspore.dataset.transforms.html). 
Four new sample libraries have been added to show the effect of data enhancement, namely: [Load & Process Datasets Using Data Pipeline](https://www.mindspore.cn/docs/en/r2.1/api_python/mindspore.dataset.html#quick-start-of-dataset-pipeline), [Visual Transformation Sample Library](https://www.mindspore.cn/docs/en/r2.1/api_python/mindspore.dataset.transforms.html#module-mindspore.dataset.vision), [Text Transform Sample Library](https://www.mindspore.cn/docs/en/r2.1/api_python/mindspore.dataset.transforms.html#module-mindspore.dataset.text), [Audio Transform Sample Library](https://www.mindspore.cn/docs/en/r2.1/api_python/mindspore.dataset.transforms.html#module-mindspore.dataset.audio) - -#### AutoParallel - -- [STABLE] Support offload parameters or intermediate activations to the CPU or NVMe storage during training process. Users can enable this offload feature by configuring context to scale up the trainable model size. - -- [STABLE] Enhanced automatic parallel capability including: - - 1. Performance of automatic strategy for typical networks is no less than 90% of default configuration. - - 2. Support 3D hybrid parallel training: automatic operator-level strategy generation combined with manual configured pipeline partition. - -#### Runtime - -- [STABLE] Upgrade OpenMPI version to 4.1.4. -- [STABLE] Upgrade NCCL version to 2.16.5. -- [STABLE] Assign rank id continuously in same node when using dynamic cluster to launch distributed jobs. -- [STABLE] No adaptation code is required for Scheduler node. The script of Scheduler could be identical to that of Worker. - -#### Ascend - -- [STABLE] Support dump assisted debug information for operator AIC Error scenario. The information includes the operator task name, stream ID, input/output/workspace address and so on. -- [STABLE] Provide default processing mechanism, which skips its execution, for CANN operators for empty Tensor output scenarios. 
-- [STABLE] Supplement debug information when network model fails to execute in graph mode. The debug information will saved in a CSV file in rank_${id}/exec_order/, recording the task ID and stream ID of each task. - -#### Profiler - -- [STABLE] The Profiler supports the collection of time-consuming data from all phases on the Host side. -- [BETA] The Profiler supports the collection of memory data from all phases on the Host side. -- [BETA] The Profiler supports the collection of data processing operator time consumption. - -### API Change - -- `mindspore.dataset.GraphData`, `mindspore.dataset.Graph`, `mindspore.dataset.InMemoryGraphDataset`, `mindspore.dataset. ArgoverseDataset` are no longer evolved and are deprecated. Use [MindSpore Graph Learning](https://gitee.com/mindspore/graphlearning) for related functional replacements. When replacing networks in Model repositories that use this API, please refer to [GCN](https://gitee.com/mindspore/graphlearning/tree/master/model_zoo/gcn) for GCN and [GAT](https://gitee.com/mindspore/graphlearning/tree/master/model_zoo/gat). -- `mindspore.set_context` adds `jit_syntax_level` option, which is used to set JIT syntax support level. For more details, please refer to [set_context](https://www.mindspore.cn/docs/en/r2.1/api_python/mindspore/mindspore.set_context.html). -- Change the `model.infer_predict_layout` interface, which has a new parameter skip_backend_compile with a default value of False. Set to True when the user wants to skip the backend compilation process to get the parameter slicing strategy. - -#### Operators - -- Add operator primitive for `mindspore.ops.ApplyAdamWithAmsgradV2`. It is recommended to call this operator through API `mindspore.nn.Adam`. -- Add operator primitive for `mindspore.ops.UpsampleTrilinear3D`. It is recommended to call this operator through API `mindspore.ops.interpolate`. -- Add operator primitive for `mindspore.ops.UpsampleNearest3D`. 
It is recommended to call this operator through API `mindspore.ops.interpolate`. - -#### API Deprecation - -- Deprecate operator primitive `mindspore.ops.ScatterNonAliasingAdd`. It is recommended to use operator primitive `mindspore.ops.TensorScatterAdd` as a replacement. - -#### Backwards Incompatible Change - -- Interface name: `mindspore.nn.Dense`, `mindspore.nn.Conv1d`, `mindspore.nn.Conv1dTranspose`, `mindspore.nn.Conv2d`, `mindspore.nn.Conv2dTranspose`, `mindspore.nn.Conv3d`, `mindspore.nn.Conv3dTranspose` - - Changes: Change initialization parameter strategy. The default value of weight_init is changed from "normal" to None, and the default value of bias_init is changed from "zeros" to None. - - Description: The default initialization method for weights has been changed from "normal" to internal HeUniform initialization. The default initialization method of bias is changed from "zeros" to internal Uniform initialization. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Original interface v2.1 interface
-  mindspore.nn.Dense(in_channels,
-                     out_channels,
-                     weight_init='normal',
-                     bias_init='zeros',
-                     has_bias=True,
-                     activation=None)
-  
-
-  mindspore.nn.Dense(in_channels,
-                     out_channels,
-                     weight_init=None,
-                     bias_init=None,
-                     has_bias=True,
-                     activation=None)
-  
-
-  mindspore.nn.Conv1d(in_channels,
-                      out_channels,
-                      kernel_size,
-                      stride=1,
-                      pad_mode='same',
-                      padding=0,
-                      dilation=1,
-                      group=1,
-                      has_bias=False,
-                      weight_init='normal',
-                      bias_init='zeros')
-  
-
-  mindspore.nn.Conv1d(in_channels,
-                      out_channels,
-                      kernel_size,
-                      stride=1,
-                      pad_mode='same',
-                      padding=0,
-                      dilation=1,
-                      group=1,
-                      has_bias=False,
-                      weight_init=None,
-                      bias_init=None)
-  
-
-  mindspore.nn.Conv1dTranspose(in_channels,
-                               out_channels,
-                               kernel_size,
-                               stride=1,
-                               pad_mode='same',
-                               padding=0,
-                               dilation=1,
-                               group=1,
-                               has_bias=False,
-                               weight_init='normal',
-                               bias_init='zeros')
-  
-
-  mindspore.nn.Conv1dTranspose(in_channels,
-                               out_channels,
-                               kernel_size,
-                               stride=1,
-                               pad_mode='same',
-                               padding=0,
-                               dilation=1,
-                               group=1,
-                               has_bias=False,
-                               weight_init=None,
-                               bias_init=None)
-  
-
-  mindspore.nn.Conv2d(in_channels,
-                      out_channels,
-                      kernel_size,
-                      stride=1,
-                      pad_mode='same',
-                      padding=0,
-                      dilation=1,
-                      group=1,
-                      has_bias=False,
-                      weight_init='normal',
-                      bias_init='zeros',
-                      data_format='NCHW')
-  
-
-  mindspore.nn.Conv2d(in_channels,
-                      out_channels,
-                      kernel_size,
-                      stride=1,
-                      pad_mode='same',
-                      padding=0,
-                      dilation=1,
-                      group=1,
-                      has_bias=False,
-                      weight_init=None,
-                      bias_init=None,
-                      data_format='NCHW')
-  
-
-  mindspore.nn.Conv2dTranspose(in_channels,
-                               out_channels,
-                               kernel_size,
-                               stride=1,
-                               pad_mode='same',
-                               padding=0,
-                               output_padding=0,
-                               dilation=1,
-                               group=1,
-                               has_bias=False,
-                               weight_init='normal',
-                               bias_init='zeros')
-  
-
-  mindspore.nn.Conv2dTranspose(in_channels,
-                               out_channels,
-                               kernel_size,
-                               stride=1,
-                               pad_mode='same',
-                               padding=0,
-                               output_padding=0,
-                               dilation=1,
-                               group=1,
-                               has_bias=False,
-                               weight_init=None,
-                               bias_init=None)
-  
-
-  mindspore.nn.Conv3d(in_channels,
-                      out_channels,
-                      kernel_size,
-                      stride=1,
-                      pad_mode='same',
-                      padding=0,
-                      dilation=1,
-                      group=1,
-                      has_bias=False,
-                      weight_init='normal',
-                      bias_init='zeros',
-                      data_format='NCDHW')
-  
-
-  mindspore.nn.Conv3d(in_channels,
-                      out_channels,
-                      kernel_size,
-                      stride=1,
-                      pad_mode='same',
-                      padding=0,
-                      dilation=1,
-                      group=1,
-                      has_bias=False,
-                      weight_init=None,
-                      bias_init=None,
-                      data_format='NCDHW')
-  
-
-  mindspore.nn.Conv3dTranspose(in_channels,
-                               out_channels,
-                               kernel_size,
-                               stride=1,
-                               pad_mode='same',
-                               padding=0,
-                               dilation=1,
-                               group=1,
-                               output_padding=0,
-                               has_bias=False,
-                               weight_init='normal',
-                               bias_init='zeros',
-                               data_format='NCDHW')
-  
-
-  mindspore.nn.Conv3dTranspose(in_channels,
-                               out_channels,
-                               kernel_size,
-                               stride=1,
-                               pad_mode='same',
-                               padding=0,
-                               dilation=1,
-                               group=1,
-                               output_padding=0,
-                               has_bias=False,
-                               weight_init=None,
-                               bias_init=None,
-                               data_format='NCDHW')
-  
-
-
-### Bug Fixes
-
-- [I6TKLW] Fix the issue of MobileNetV2 network performance degradation on the Ascend platform.
-- [I7CP5H] Fix the issue where ASR network training failed on the Ascend platform.
-- [I7I3EZ] Fix the `run_check()` failure caused by changes to the enumeration interface in Pillow 10.0.0. If the issue occurs in an earlier version of MindSpore, install a Pillow version below 10.0.0 to avoid it.
-- [I7IZ8K] Fix accuracy issues with the assignsub interface in PyNative mode.
-- [I7HGY0] Fix the issue that the loss of functional programming does not converge in PyNative data_sink mode.
-- [I7J4N3] Fix the issue that the generation of Step Trace failed in Profiler dynamic shape mode.
-- [I7J4N3] Fix the issue that no data is displayed in the MindInsight parallel strategy view.
-- [I79YY4] Fix the SiLU operator error in high-order differentiation in PyNative mode.
-- [I6NQJQ] Fix the issue of probabilistic failure in dynamic shape scenarios of the ScatterUpdate operator in PyNative mode.
-- [I6Y4G5] Fix the issue of failure in dynamic shape scenarios of the Conv3D operator in Graph mode.
- -### Contributors - -Thanks goes to these wonderful people: - -alashkari,anzhengqi,archer2049,B.L.LAN,baihuawei,bichaoyang,BJ-WANG,Bokai Li,Brian-K,caifubi,caiyimeng,cathwong,changzherui,ChenDonYY,chenfei_mindspore,chengang,chengbin,chenhaozhe,chenjianping,chenkang,chenweifeng,chuht,chujinjin,davidanugraha,DavidFFFan,DeshiChen,douzhixing,emmmmtang,Erpim,Ethan,fangwenyi,fangzehua,fangzhou0329,fary86,fengyixing,gaoshuanglong,Gaoxiong,gaoyong10,gengdongjie,gongdaguo1,Greatpan,GuoZhibin,guozhijian,hangq,hanhuifeng,haozhang,hedongdong,Henry Shi,heterogeneous_to_backoff_2_0,huangbingjian,huanghui,huangxinjing,hujiahui8,hujingsong,huoxinyou,jachua,jiahongQian,jianghui58,jiangzhenguang,jiaorui,jiaoy1224,jijiarong,jjfeing,JoeyLin,json,JuiceZ,jxl,kairui_kou,KevinYi,kisnwang,KXiong,laiyongqiang,lanzhineng,liangchenghui,liangzelang,LiangZhibo,lianliguang,lichen,ligan,lijunbin,limingqi107,ling,linqingke,liubuyu,liuchao,liuchuting,liujunzhu,liuluobin,liutongtong9,liuyang811,lixiao,liyan2022,liyejun,liyuxia,looop5,luochao60,luojianing,luoyang,luoyuan,lyqlola,maning202007,maoyaomin,Margaret_wangrui,mayadong,MaZhiming,melody,mengyuanli,michaelzhu_70ab,Mohammad Motallebi,moran,NaCN,nomindcarry,OwenSec,panfengfeng,panshaowu,panzhihui,pkuliuliu,qinzheng,qiuzhongya,qujianwei,r1chardf1d0,Renyuan Zhang,RobinGrosman,shaojunsong,shenwei41,Soaringfish,tangdezhi_123,tanghuikang,tan-wei-cheng,TinaMengtingZhang,TronZhang,TuDouNi,VectorSL,wang_ziqi,wanghenchang,wangnan39,wangpingan,wangshaocong,wangshengnan123,wangtongyu6,weichaoran,wind-zyx,wqx,wtcheng,wujueying,wYann,XianglongZeng,xiaohanzhang,xiaotianci,xiaoyao,XinDu,xulei,xumengjuan1,xupan,xwkgch,yanghaoran,yangluhang,yangruoqi713,yangshuo,yangsijia,yangzhenzhang,yanzhenxiang2020,Yanzhi_YI,yao_yf,yefeng,yeyunpeng2020,Yi_zhang95,yide12,YijieChen,YingLai 
Lin,YingtongHu,youshu,yuchaojie,yuedongli,YuJianfeng,zangqx,ZengZitao,zhangbuxue,zhangdanyang,zhangdong,zhangfanghe,zhangqi,zhangqinghua,zhangyanhui,zhangyinxia,zhangyongxian,zhangzhaoju,zhanzhan,zhengzuohe,ZhidanLiu,zhixinaa,zhoufeng,zhouyaqiang0,zhuguodong,zhupuxu,zhuyuxiao,zichun_ye,zjun,zlq2020,zong_shuai,ZPaC,zuochuanyong,zyli2020,陈宇,范吉斌,冯一航,胡彬,宦晓玲,黄勇,雷元哲,李良灿,李林杰,刘崇鸣,刘力力,刘勇琪,吕浩宇,吕昱峰(Nate.River),没有窗户的小巷,沈竞兴,十六夜,王程浩,王禹程,王振邦,徐安越,徐永飞,杨旭华,于振华,俞涵,张清华,张澍坤,张栩浩,张学同,赵英灼,周超,周洪叶,朱家兴 - -Contributions of any kind are welcome! - -## MindSpore Lite 2.1.0 Release Notes - -### Major Features and Improvements - -#### MindSpore Lite Cloud Inference - -- [STABLE] Supports high-performance inference for single-device large model and single-node multi-device distributed large model at Ascend backend. -- [STABLE] Python API Ascend backend supports multiple models sharing workspace memory. -- [STABLE] [The weights can be shared by multiple models through ModelGroup](https://mindspore.cn/lite/docs/en/r2.1/use/cloud_infer/runtime_cpp.html#multiple-models-sharing-weights). For example, weights can be shared between full models and incremental models in the large model scenario. - -#### API - -The [Python](https://www.mindspore.cn/lite/api/en/r2.1/mindspore_lite/mindspore_lite.ModelGroup.html) and [C++](https://mindspore.cn/lite/api/en/r2.1/generate/classmindspore_ModelGroup.html) ModelGroup interface is added. 
The interface definition is as follows:
-
-```python
-class ModelGroup
-    def __init__(self, flags=ModelGroupFlag.SHARE_WORKSPACE)
-    def add_model(self, models)
-    def cal_max_size_of_workspace(self, model_type, context)
-```
-
-```C++
-// class ModelGroup
-ModelGroup(ModelGroupFlag flags = ModelGroupFlag::kShareWorkspace);
-Status AddModel(const std::vector &model_path_list);
-Status AddModel(const std::vector> &model_buff_list);
-Status AddModel(const std::vector &model_list);
-Status AddModel(const std::vector &model_list);
-```
-
-## MindSpore 2.0.0 Release Notes
-
-### Major Features and Improvements
-
-#### PyNative
-
-- [STABLE] Dynamic shape is fully supported in the framework. For detailed operator support, refer to [Dynamic Shape Support Status of nn Interface](https://www.mindspore.cn/docs/en/r2.0/note/dynamic_shape_nn.html), [Dynamic Shape Support Status of ops Interface](https://www.mindspore.cn/docs/en/r2.0/note/dynamic_shape_func.html), and [Dynamic Shape Support Status of primitive Interface](https://www.mindspore.cn/docs/en/r2.0/note/dynamic_shape_primitive.html).
-
-#### AutoParallel
-
-- [STABLE] Build the new independent MindFormers repository, which provides a distributed parallel suite and replaces the `mindspore.nn.transformer` module.
-- [DEMO] Distributed parallel operator Gather supports the BatchDim attribute.
-- [DEMO] Pipeline parallel supports specifying any dimension of the input data as the Batch dimension.
-
-### API Change
-
-#### operator
-
-- Add operator primitive for `mindspore.ops.AdaptiveAvgPool2D`.
-- Add operator primitive for `mindspore.ops.BatchToSpaceNDV2`.
-- Add operator primitive for `mindspore.ops.CeLU`.
-- Add operator primitive for `mindspore.ops.ExtractVolumePatches`.
-- Add operator primitive for `mindspore.ops.FFTWithSize`.
-- Add operator primitive for `mindspore.ops.FillDiagonal`.
-- Add operator primitive for `mindspore.ops.FractionalMaxPool3DWithFixedKsize`.
-- Add operator primitive for `mindspore.ops.Im2Col`.
-- Add operator primitive for `mindspore.ops.MaskedScatter`.
-- Add operator primitive for `mindspore.ops.MatrixBandPart`.
-- Add operator primitive for `mindspore.ops.MatrixInverse`.
-- Add operator primitive for `mindspore.ops.MaxPoolWithArgmaxV2`.
-- Add operator primitive for `mindspore.ops.Ormqr`.
-- Add operator primitive for `mindspore.ops.RandpermV2`.
-- Add operator primitive for `mindspore.ops.ResizeBicubic`.
-- Add operator primitive for `mindspore.ops.Triu`.
-- Add operator primitive for `mindspore.ops.Zeta`.
-
-#### Backwards Incompatible Change
-
-- Interface: mindspore.ops.MultitypeFuncGraph
-
-  Change: The interface parameter doc_url was a test feature in MindSpore 2.0.0.rc1. After optimization in MindSpore 2.0.0, users no longer need to configure this parameter, so it is deleted in MindSpore 2.0.0.
-
Original Interface Interface v2.0.0
-  mindspore.ops.MultitypeFuncGraph(name, read_value=False, doc_url="")
-  
-
-  mindspore.ops.MultitypeFuncGraph(name, read_value=False)
-  
-
-
-- Interface: mindspore.set_context(auto_tune_mode="GA,RL")
-
-  Change: The AutoTune tool has been deprecated, so the auto_tune_mode option is deleted. New tuning tools will be planned in the future.
-
-- Interface: mindspore.set_context(mode=PYNATIVE_MODE)
-
-  Change: The default value is changed from GRAPH_MODE to PYNATIVE_MODE.
-
-  Description: If the running mode was not set explicitly and graph mode is needed, set it as follows:
-  mindspore.set_context(mode=GRAPH_MODE).
-
Original Interface Interface v2.0.0-rc1
-  mindspore.set_context(mode=GRAPH_MODE)
-  
-
-  mindspore.set_context(mode=PYNATIVE_MODE)
-  
-
-
-- Interface: mindspore.train.Model.train
-
-  Change: The default value of dataset_sink_mode is changed from True to False.
-
-  Description: If dataset_sink_mode was not set explicitly and data sinking mode is needed, set it as follows:
-  Model.train(dataset_sink_mode=True).
-
Original Interface Interface v2.0.0-rc1
-  Model.train(dataset_sink_mode=True)
-  
-
-  Model.train(dataset_sink_mode=False)
-  
-
-
-- Interface: mindspore.export
-
-  Change: The default value of the file_format parameter is changed from AIR to no default value.
-
-  Description: If file_format was not set previously, it must now be set explicitly. To keep the earlier behavior, use the following method:
-  mindspore.export(net, *inputs, file_name, file_format="AIR", **kwargs).
-
Original Interface Interface v2.0.0-rc1
-  mindspore.export(net, *inputs, file_name,
-                   file_format="AIR", **kwargs)
-  
-
-  mindspore.export(net, *inputs, file_name,
-                   file_format, **kwargs)
-  
-
- -- Interface: mindspore.ops.norm - - Change: The ord parameter function is extended to support multiple forms. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.norm(input_x, axis, p=2, keep_dims=False, epsilon=1e-12)
-  >>> # Example:
-  >>> input = Tensor(np.array([[[1.0, 2.0], [3.0, 4.0]],
-  ...                          [[5.0, 6.0], [7.0, 8.0]]]).astype(np.float32))
-  >>> output = ops.norm(input, [0, 1], p=2)
-  
-  ops.norm(A, ord=None, dim=None, keepdim=False, *, dtype=None)
-  >>> # Example:
-  >>> input = Tensor(np.array([[[1.0, 2.0], [3.0, 4.0]],
-  ...                          [[5.0, 6.0], [7.0, 8.0]]]).astype(np.float32))
-  >>> output = ops.norm(input, ord=2, dim=(0, 1))
-  
-
- -- Interface: mindspore.Tensor.norm - - Change: The ord parameter function is extended to support multiple forms. - - Description: For details, see the example of ops.norm. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  Tensor.norm(axis, p=2, keep_dims=False, epsilon=1e-12)
-  
-
-  Tensor.norm(ord=None, dim=None, keepdim=False, *, dtype=None)
-  
-
- -- Interface: mindspore.ops.dropout - - Change: The seed0 and seed1 parameters are deleted and seed=None parameter is added. Instead of returning Tensors and masks, only Tensors are returned. The input parameter training=True is added. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.dropout(x, p=0.5, seed0=0, seed1=0)
-  >>> # Example:
-  >>> input = Tensor(((20, 16), (50, 50)),
-  ...                mindspore.float32)
-  >>> output, mask = ops.dropout(input, p=0.5)
-  
-
-  ops.dropout(input, p=0.5, training=True, seed=None)
-  >>> # Example:
-  >>> input = Tensor(((20, 16), (50, 50)),
-  ...                mindspore.float32)
-  >>> output = ops.dropout(input, p=0.5, training=True)
-  
-
- -- Interface: mindspore.ops.dropout2d - - Change: Return value is changed from Tensor and mask to Tensor only. The input parameter training=True is added. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.dropout2d(x, p=0.5)
-  >>> # Example:
-  >>> input = Tensor(np.ones([2, 1, 2, 3]),
-  ...                mindspore.float32)
-  >>> output, mask = ops.dropout2d(input, 0.5)
-  
-
-  ops.dropout2d(input, p=0.5, training=True)
-  >>> # Example:
-  >>> input = Tensor(np.ones([2, 1, 2, 3]),
-  ...                mindspore.float32)
-  >>> output = ops.dropout2d(input, 0.5, training=True)
-  
-
- -- Interface: mindspore.ops.dropout3d - - Change: Return value is changed from Tensor and mask to Tensor only. The input parameter training=True is added. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.dropout3d(x, p=0.5)
-  >>> # Example:
-  >>> input = Tensor(np.ones([2, 1, 2, 3]),
-  ...                mindspore.float32)
-  >>> output, mask = ops.dropout3d(input, 0.5)
-  
-
-  ops.dropout3d(input, p=0.5, training=True)
-  >>> # Example:
-  >>> input = Tensor(np.ones([2, 1, 2, 3]),
-  ...                mindspore.float32)
-  >>> output = ops.dropout3d(input, 0.5, training=True)
-  
-
- -- Interface: mindspore.ops.std - - Change: The interface is reconstructed, and the interface usage mode is more consistent with user habits. - - Description: If parameter `unbiased` has been set, use the following alternative: `unbiased=False` -> `ddof=0`, `unbiased=True` -> `ddof=1`. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.std(input_x, axis=(), unbiased=True, keep_dims=False)
-  
-
-  ops.std(input, axis=None, ddof=0, keepdims=False)
-  
-
- -- Interface: mindspore.load_param_into_net - - Change: Parameters that are not loaded in the ckpt are added as return values. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  net_param = load_param_into_net()
-  
-
-  net_param, ckpt_param = load_param_into_net()
-  
-
- -- Interface: mindspore.nn.BCELoss - - Change: The default value of `reduction` is changed from 'none' to 'mean'. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  BCELoss(weight=None, reduction='none')
-  >>> # Example:
-  >>> weight = Tensor(np.array([[1.0, 2.0, 3.0],
-  ...                           [4.0, 3.3, 2.2]]),
-  ...                 mindspore.float32)
-  >>> loss = nn.BCELoss(weight=weight, reduction='mean')
-  >>> logits = Tensor(np.array([[0.1, 0.2, 0.3],
-  ...                           [0.5, 0.7, 0.9]]),
-  ...                 mindspore.float32)
-  >>> labels = Tensor(np.array([[0, 1, 0], [0, 0, 1]]),
-  ...                 mindspore.float32)
-  >>> output = loss(logits, labels)
-  >>> print(output)
-  1.8952923
-  
-
-  BCELoss(weight=None, reduction='mean')
-  >>> # Example:
-  >>> weight = Tensor(np.array([[1.0, 2.0, 3.0],
-  ...                           [4.0, 3.3, 2.2]]),
-  ...                 mindspore.float32)
-  >>> loss = nn.BCELoss(weight=weight)
-  >>> logits = Tensor(np.array([[0.1, 0.2, 0.3],
-  ...                           [0.5, 0.7, 0.9]]),
-  ...                 mindspore.float32)
-  >>> labels = Tensor(np.array([[0, 1, 0], [0, 0, 1]]),
-  ...                 mindspore.float32)
-  >>> output = loss(logits, labels)
-  >>> print(output)
-  1.8952923
-  
-
-
-- Interface: mindspore.ops.split
-
-  Change: The interface is reworked to better match user habits. The order of the second and third parameters is adjusted, and the functionality of split_size_or_sections is modified and extended.
-
Original Interface Interface v2.0.0-rc1
-  ops.split(input_x, axis=0, output_num=1)
-  >>> # Example:
-  >>> input = Tensor(np.array([[1, 1, 1, 1], [2, 2, 2, 2]]),
-  ...                mindspore.int32)
-  >>> output = ops.split(input, axis=1, output_num=4)
-  
-
-  ops.split(tensor, split_size_or_sections, axis=0)
-  >>> # Example:
-  >>> input = Tensor(np.array([[1, 1, 1, 1], [2, 2, 2, 2]]),
-  ...                mindspore.int32)
-  >>> output = ops.split(input, split_size_or_sections=1, axis=1)
-  
-
-
-- Interface: mindspore.Tensor.split
-
-  Change: The interface is reworked to better match user habits. The positions of the two parameters are adjusted, and the functionality of split_size_or_sections is modified and extended.
-
-  Description: For details, see the example of ops.split.
-
Original Interface Interface v2.0.0-rc1
-  Tensor.split(axis=0, output_num=1)
-  
-
-  Tensor.split(split_size_or_sections, axis=0)
-  
-
- -- Interface: mindspore.ops.pad - - Change: Modify the parameter name paddings to padding, and the mode and value functions are added. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.pad(input_x, paddings)
-  >>> # Example:
-  >>> input_x = Tensor(np.array([[-0.1, 0.3, 3.6],
-  ...                            [0.4, 0.5, -3.2]]),
-  ...                  mindspore.float32)
-  >>> paddings = ((1, 2), (2, 1))
-  >>> output = ops.pad(input_x, paddings)
-  
-
-  ops.pad(input_x, padding, mode='constant', value=None)
-  >>> # Example:
-  >>> input_x = Tensor(np.array([[-0.1, 0.3, 3.6],
-  ...                            [0.4, 0.5, -3.2]]),
-  ...                  mindspore.float32)
-  >>> paddings = (2, 1, 1, 2)
-  >>> output = ops.pad(input_x, paddings)
-  
-
- -- Interface: mindspore.ops.meshgrid - - Change: The input parameter is changed from `inputs` to `*input`. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.meshgrid(inputs, indexing='xy')
-  >>> # Example:
-  >>> x = Tensor(np.array([1, 2, 3, 4]).astype(np.int32))
-  >>> y = Tensor(np.array([5, 6, 7]).astype(np.int32))
-  >>> z = Tensor(np.array([8, 9, 0, 1, 2]).astype(np.int32))
-  >>> output = ops.meshgrid((x, y, z), indexing='xy')
-  
-
-  ops.meshgrid(*inputs, indexing='xy')
-  >>> # Example:
-  >>> x = Tensor(np.array([1, 2, 3, 4]).astype(np.int32))
-  >>> y = Tensor(np.array([5, 6, 7]).astype(np.int32))
-  >>> z = Tensor(np.array([8, 9, 0, 1, 2]).astype(np.int32))
-  >>> output = ops.meshgrid(x, y, z, indexing='xy')
-  
-
-
-- Interface: mindspore.ops.max
-
-  Change: The order of the return values is swapped, from "index, value" to "value, index".
-
Original Interface Interface v2.0.0-rc1
-  ops.max(x, axis=0, keep_dims=False)
-  >>> # Example:
-  >>> input = Tensor(np.array([0.0, 0.4, 0.6, 0.7, 0.1]),
-  ...                mindspore.float32)
-  >>> index, output = ops.max(input)
-  >>> print(index, output)
-  3 0.7
-  
-
-  ops.max(input, axis=None, keepdims=False, *, initial=None, where=True, return_indices=False)
-  >>> # Example:
-  >>> input = Tensor(np.array([0.0, 0.4, 0.6, 0.7, 0.1]),
-  ...                mindspore.float32)
-  >>> output, index = ops.max(input, axis=0)
-  >>> print(output, index)
-  
-
-
-- Interface: mindspore.ops.min
-
-  Change: The order of the return values is swapped, from "index, value" to "value, index".
-
Original Interface Interface v2.0.0-rc1
-  ops.min(x, axis=0, keep_dims=False)
-  >>> # Example:
-  >>> input = Tensor(np.array([0.0, 0.4, 0.6, 0.7, 0.1]),
-  ...                mindspore.float32)
-  >>> index, output = ops.min(input)
-  >>> print(index, output)
-  0 0.0
-  
-
-  ops.min(input, axis=None, keepdims=False, *, initial=None, where=True, return_indices=False)
-  >>> # Example:
-  >>> input = Tensor(np.array([0.0, 0.4, 0.6, 0.7, 0.1]),
-  ...                mindspore.float32)
-  >>> output, index = ops.min(input, keepdims=True)
-  >>> print(output, index)
-  0.0 0
-  
-
- -- Interface: mindspore.ops.random_gamma - - Change: The seed2 parameter is deleted and seed=0 is changed to None. The framework behavior is unified and complies with the actual application scenarios and habits of users. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.random_gamma(shape, alpha, seed=0, seed2=0)
-  
-
-  ops.random_gamma(shape, alpha, seed=None)
-  
-
- -- Interface: mindspore.ops.standard_laplace - - Change: The seed2 parameter is deleted and seed=0 is changed to None. The framework behavior is unified and complies with the actual application scenarios and habits of users. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.standard_laplace(shape, seed=0, seed2=0)
-  
-
-  ops.standard_laplace(shape, seed=None)
-  
-
- -- Interface: mindspore.ops.standard_normal - - Change: The seed2 parameter is deleted and seed=0 is changed to None. The framework behavior is unified and complies with the actual application scenarios and habits of users. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.standard_normal(shape, seed=0, seed2=0)
-  
-
-  ops.standard_normal(shape, seed=None)
-  
-
-
-- Interface: mindspore.ops.bernoulli
-
-  Change: The default value of seed is changed from -1 to None, which better matches actual application scenarios.
-
Original Interface Interface v2.0.0-rc1
-  ops.bernoulli(x, p=0.5, seed=-1)
-  
-
-  ops.bernoulli(input, p=0.5, seed=None)
-  
-
-
-- Interface: mindspore.data_sink
-
-  Change: The steps parameter is deleted, the jit parameter is renamed to jit_config, and a new input_signature parameter is added. The usability is improved to meet the requirements of actual application scenarios.
-
Original Interface Interface v2.0.0-rc1
-  mindspore.data_sink(fn, dataset, steps,
-                      sink_size=1, jit=False)
-  
-
-  mindspore.data_sink(fn, dataset, sink_size=1,
-                      jit_config=None, input_signature=None)
-  
-
-
-- Interface: mindspore.ops.conv2d
-
-  Change: The interface functionality is extended. The bias parameter is added, and the parameter names and parameter order are modified.
-
Original Interface Interface v2.0.0-rc1
-  conv2d(inputs, weight, pad_mode="valid",
-         padding=0, stride=1, dilation=1, group=1)
-  
-
-  conv2d(input, weight, bias=None, stride=1,
-         pad_mode="valid", padding=0, dilation=1, groups=1)
-  
-
-
-- Interface: mindspore.dataset.vision.Pad
-
-  Change: Adjust the input parameter padding of Pad, RandomCrop, and RandomCropWithBbox. When the input padding has length 2, the behavior changes from using the first value to fill the left/upper boundary and the second value to fill the right/lower boundary, to using the first value to fill the left/right boundary and the second value to fill the upper/lower boundary.
-
-  Description: A padding of length 2 no longer produces the same effect as in earlier versions. To keep the earlier effect, the padding must be specified explicitly with four values (left, right, top, and bottom).
-
Original Interface Interface v2.0.0-rc1
-  mindspore.dataset.vision.Pad(padding=(1,2))
-  Indicates that the left/upper part of the image is filled with 1 pixel,
-  and the right/down part is filled with 2 pixels.
-  
-
-  mindspore.dataset.vision.Pad(padding=(1,2,1,2))
-  Indicates that the left/upper part of the image is filled with 1 pixel,
-  and the right/down part is filled with 2 pixels.
-  
-
-
-- Interface: mindspore.dataset.Dataset.map
-
-  Change: Delete the column_order parameter. In most cases, output_columns and column_order have the same value, so column_order does not need to be passed. To adjust the order of data columns, use mindspore.dataset.Dataset.project.
-
-  Description:
-
-  1. If the column order does not need to be changed, delete the column_order parameter.
-  2. If you need to specify the data column order, delete the column_order parameter and chain a project call after map to reorder the columns (as in the following example).
-
Original Interface Interface v2.0.0-rc1
-  >>> dataset = dataset.map(operations=[transforms],
-  ...                       input_columns=["column_a"],
-  ...                       output_columns=["column_b", "column_c"],
-  ...                       column_order=["column_c", "column_b"])
-  
-
-  >>> dataset = dataset.map(operations=[transforms],
-  ...                       input_columns=["column_a"],
-  ...                       output_columns=["column_b", "column_c"])
-  >>> dataset = dataset.project(["column_c", "column_b"])
-  
-
-
-- Interface: mindspore.dataset.Dataset.batch
-
-  Change: Delete the column_order parameter. In most cases, output_columns and column_order have the same value, so column_order does not need to be passed. To adjust the order of data columns, use mindspore.dataset.Dataset.project.
-
-  Description:
-
-  1. If the column order does not need to be changed, delete the column_order parameter.
-  2. If you need to specify the data column order, delete the column_order parameter and chain a project call after batch to reorder the columns (as in the following example).
-
Original Interface Interface v2.0.0-rc1
-  >>> dataset = dataset.batch(batch_size=4,
-  ...                         input_columns=["column_a"],
-  ...                         output_columns=["column_b", "column_c"],
-  ...                         column_order=["column_c", "column_b"])
-  
-
-  >>> dataset = dataset.batch(batch_size=4, input_columns=["column_a"],
-  ...                         output_columns=["column_b", "column_c"])
-  >>> dataset = dataset.project(["column_c", "column_b"])
-  
-
- -- Interface: mindspore.dataset.Dataset.batch - - Change: Split the batch method into two methods: batch and padded_batch. The pad_info parameter is moved from the batch method to the padded_batch method. - - Description: To use the pad_info parameter, use the padded_batch method instead. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  >>> dataset = dataset.batch(batch_size=4,
-  ...                         drop_remainder=True, pad_info=...)
-  
-
-  >>> dataset = dataset.padded_batch(batch_size=4,
-  ...                                drop_remainder=True, pad_info=...)
-  
-
-
-### Bug fixes
-
-- [I62I3J] Fix the inference failure of the BGCF network on Ascend 310.
-- [I7C2W3] Fix the null pointer error when enabling multiple losses in pipeline parallel scenarios.
-
-### Contributors
-
-Thanks goes to these wonderful people:
-
-alashkari,anzhengqi,archer2049,B.L.LAN,baihuawei,bichaoyang,BJ-WANG,Bokai Li,Brian-K,caifubi,caiyimeng,cathwong,changzherui,ChenDonYY,chenfei_mindspore,chengang,chengbin,chenhaozhe,chenjianping,chenkang,chenweifeng,chuht,chujinjin,davidanugraha,DavidFFFan,DeshiChen,douzhixing,emmmmtang,Erpim,Ethan,fangwenyi,fangzehua,fangzhou0329,fary86,fengyixing,gaoshuanglong,Gaoxiong,gaoyong10,gengdongjie,gongdaguo1,Greatpan,GuoZhibin,guozhijian,hangq,hanhuifeng,haozhang,hedongdong,Henry Shi,heterogeneous_to_backoff_2_0,huangbingjian,huanghui,huangxinjing,hujiahui8,hujingsong,huoxinyou,jachua,jiahongQian,jianghui58,jiangzhenguang,jiaorui,jiaoy1224,jijiarong,jjfeing,JoeyLin,json,JuiceZ,jxl,kairui_kou,KevinYi,kisnwang,KXiong,laiyongqiang,lanzhineng,liangchenghui,liangzelang,LiangZhibo,lianliguang,lichen,ligan,lijunbin,limingqi107,ling,linqingke,liubuyu,liuchao,liuchuting,liujunzhu,liuluobin,liutongtong9,liuyang811,lixiao,liyan2022,liyejun,liyuxia,looop5,luochao60,luojianing,luoyang,luoyuan,lyqlola,maning202007,maoyaomin,Margaret_wangrui,mayadong,MaZhiming,melody,mengyuanli,michaelzhu_70ab,Mohammad Motallebi,moran,NaCN,nomindcarry,OwenSec,panfengfeng,panshaowu,panzhihui,pkuliuliu,qinzheng,qiuzhongya,qujianwei,r1chardf1d0,Renyuan Zhang,RobinGrosman,shaojunsong,shenwei41,Soaringfish,tangdezhi_123,tanghuikang,tan-wei-cheng,TinaMengtingZhang,TronZhang,TuDouNi,VectorSL,wang_ziqi,wanghenchang,wangnan39,wangpingan,wangshaocong,wangshengnan123,wangtongyu6,weichaoran,wind-zyx,wqx,wtcheng,wujueying,wYann,XianglongZeng,xiaohanzhang,xiaotianci,xiaoyao,XinDu,xulei,xumengjuan1,xupan,xwkgch,yanghaoran,yangluhang,yangruoqi713,yangshuo,yangsijia,yangzhenzhang,yanzhenxiang2020,Yanzhi_YI,yao_yf,yefeng,yeyunpeng2020,Yi_zhang95,yide12,YijieChen,YingLai Lin,YingtongHu,youshu,yuchaojie,yuedongli,YuJianfeng,zangqx,ZengZitao,zhangbuxue,zhangdanyang,zhangdong,zhangfanghe,zhangqi,zhangqinghua,zhangyanhui,zhangyinxia,zhangyongxian,zhangzhaoju,zhanzhan,zhengzuohe,ZhidanLiu,zhixinaa,zhoufeng,zhouyaqiang0,zhuguodong,zhupuxu,zhuyuxiao,zichun_ye,zjun,zlq2020,zong_shuai,ZPaC,zuochuanyong,zyli2020,陈宇,范吉斌,冯一航,胡彬,宦晓玲,黄勇,雷元哲,李良灿,李林杰,刘崇鸣,刘力力,刘勇琪,吕浩宇,吕昱峰(Nate.River),没有窗户的小巷,沈竞兴,十六夜,王程浩,王禹程,王振邦,徐安越,徐永飞,杨旭华,于振华,俞涵,张清华,张澍坤,张栩浩,张学同,赵英灼,周超,周洪叶,朱家兴
-
-Contributions of any kind are welcome!
-
-## MindSpore 2.0.0-rc1 Release Notes
-
-### Major Features and Improvements
-
-#### FrontEnd
-
-- [BETA] Functions with a "return" statement, a "return None" statement, or no return statement are supported in `GRAPH_MODE`.
-- [BETA] Objects of `list` type are supported in `GRAPH_MODE`.
-- [BETA] The "raise" statement is supported under variable conditions in `GRAPH_MODE`.
-- [STABLE] Functional call supports data sinking mode.
-- [BETA] The Transformer layer in nn module is added to provide easy-to-use Transformer APIs. Batch_size does not need to be defined. Dynamic seq_length is supported.
-
-#### DataSet
-
-- [STABLE] In the Ascend environment, the default timeout in data sink mode is adjusted to 1900s. This solves the problem that the GetNext operator may time out due to environment resource competition and large computing workload in data sinking mode.
-- [STABLE] MindRecord supports querying the schemas and the number of samples. MindRecord provides multi-process writing mode, allowing users to generate MindRecord data files in parallel.
-- [STABLE] The Dataset pipeline can process any Python object. For details, see [Supporting Python Objects in Dataset Pipeline](https://www.mindspore.cn/tutorials/en/r2.0/advanced/dataset/python_objects.html).
-
-#### AutoParallel
-
-- [STABLE] The strategies of all parameters can be saved when saving the strategy.
-- [STABLE] The Conv3D/MaxPool3D/AvgPool3D distributed operators are supported.
-- [STABLE] Support operator-level parallelism and optimizer-level parallelism in PyNative mode with shard: parallel training and the Model API are decoupled to provide basic parallel expression capabilities.
-- [STABLE] Support operator-level parallelism and optimizer-level parallelism in Graph mode: parallel training and the Model API are decoupled to provide basic parallel expression capabilities.
-- [BETA] Support customized distributed graph segmentation, improving the flexibility of distributed training.
-
-#### Runtime
-
-- [STABLE] Control flow supports subgraph sink.
-- [STABLE] Support CUDA 11.6.
-- [STABLE] Support operator selection and execution of List/Tuple/Scalar type kernels to match native Python expressions.
-- [STABLE] A kernel that is not supported by the hardware can automatically fall back to the CPU kernel.
-- [STABLE] Support heterogeneous execution within a subgraph.
-
-#### Ascend
-
-- [STABLE] Support overflow detection scheme and HCCL runtime overflow check.
-- [STABLE] Support dump of communication operators.
-
-#### Profiler
-
-- [STABLE] Richer Profiler collection item configuration lets users collect performance data in more detail.
-
-#### Dump
-
-- [BETA] Single card in PyNative mode supports operator overflow detection.
-- [BETA] Graph mode supports HCCL operator dump.
-
-### API Change
-
-- [STABLE] Add computing APIs, such as MaxUnpool, ReplicationPad, and GaussianNLLLoss.
-  For details, visit .
-- [STABLE] Extend existing API functions, such as AvgPool, pad, norm, and interpolate.
-
-#### operator
-
-- [BETA] Add operator primitive for `mindspore.ops.AdaptiveAvgPool3D`.
-- [BETA] Add operator primitive for `mindspore.ops.AffineGrid`.
-- [BETA] Add operator primitive for `mindspore.ops.Angle`.
-- [BETA] Add operator primitive for `mindspore.ops.BartlettWindow`.
-- [BETA] Add operator primitive for `mindspore.ops.Bernoulli`.
-- [BETA] Add operator primitive for `mindspore.ops.BesselI0`.
-- [BETA] Add operator primitive for `mindspore.ops.BesselI1`. -- [BETA] Add operator primitive for `mindspore.ops.BesselJ0`. -- [BETA] Add operator primitive for `mindspore.ops.BesselJ1`. -- [BETA] Add operator primitive for `mindspore.ops.BesselK0`. -- [BETA] Add operator primitive for `mindspore.ops.BesselK0e`. -- [BETA] Add operator primitive for `mindspore.ops.BesselK1`. -- [BETA] Add operator primitive for `mindspore.ops.BesselK1e`. -- [BETA] Add operator primitive for `mindspore.ops.BesselY0`. -- [BETA] Add operator primitive for `mindspore.ops.BesselY1`. -- [BETA] Add operator primitive for `mindspore.ops.Bincount`. -- [BETA] Add operator primitive for `mindspore.ops.BlackmanWindow`. -- [BETA] Add operator primitive for `mindspore.ops.ChannelShuffle`. -- [BETA] Add operator primitive for `mindspore.ops.Cholesky`. -- [BETA] Add operator primitive for `mindspore.ops.Col2Im`. -- [BETA] Add operator primitive for `mindspore.ops.Complex`. -- [BETA] Add operator primitive for `mindspore.ops.ComplexAbs`. -- [BETA] Add operator primitive for `mindspore.ops.Cross`. -- [BETA] Add operator primitive for `mindspore.ops.CTCLossV2`. -- [BETA] Add operator primitive for `mindspore.ops.Cummin`. -- [BETA] Add operator primitive for `mindspore.ops.Diag`. -- [BETA] Add operator primitive for `mindspore.ops.Digamma`. -- [BETA] Add operator primitive for `mindspore.ops.Expand`. -- [BETA] Add operator primitive for `mindspore.ops.Fmax`. -- [BETA] Add operator primitive for `mindspore.ops.Gcd`. -- [BETA] Add operator primitive for `mindspore.ops.Geqrf`. -- [BETA] Add operator primitive for `mindspore.ops.GLU`. -- [BETA] Add operator primitive for `mindspore.ops.GridSampler2D`. -- [BETA] Add operator primitive for `mindspore.ops.GridSampler3D`. -- [BETA] Add operator primitive for `mindspore.ops.HammingWindow`. -- [BETA] Add operator primitive for `mindspore.ops.Heaviside`. -- [BETA] Add operator primitive for `mindspore.ops.Hypot`. 
-- [BETA] Add operator primitive for `mindspore.ops.Igamma`. -- [BETA] Add operator primitive for `mindspore.ops.IndexFill`. -- [BETA] Add operator primitive for `mindspore.ops.InplaceIndexAdd`. -- [BETA] Add operator primitive for `mindspore.ops.InplaceUpdateV2`. -- [BETA] Add operator primitive for `mindspore.ops.Lcm`. -- [BETA] Add operator primitive for `mindspore.ops.LeftShift`. -- [BETA] Add operator primitive for `mindspore.ops.LogicalXor`. -- [BETA] Add operator primitive for `mindspore.ops.Logit`. -- [BETA] Add operator primitive for `mindspore.ops.LogSpace`. -- [BETA] Add operator primitive for `mindspore.ops.LuUnpack`. -- [BETA] Add operator primitive for `mindspore.ops.MatrixDiagPartV3`. -- [BETA] Add operator primitive for `mindspore.ops.MatrixDiagV3`. -- [BETA] Add operator primitive for `mindspore.ops.MatrixSetDiagV3`. -- [BETA] Add operator primitive for `mindspore.ops.MaxPool3DWithArgmax`. -- [BETA] Add operator primitive for `mindspore.ops.MaxUnpool2D`. -- [BETA] Add operator primitive for `mindspore.ops.MaxUnpool3D`. -- [BETA] Add operator primitive for `mindspore.ops.MultiMarginLoss`. -- [BETA] Add operator primitive for `mindspore.ops.MultinomialWithReplacement`. -- [BETA] Add operator primitive for `mindspore.ops.Mvlgamma`. -- [BETA] Add operator primitive for `mindspore.ops.NanToNum`. -- [BETA] Add operator primitive for `mindspore.ops.NextAfter`. -- [BETA] Add operator primitive for `mindspore.ops.Orgqr`. -- [BETA] Add operator primitive for `mindspore.ops.Polygamma`. -- [BETA] Add operator primitive for `mindspore.ops.ResizeBilinearV2`. -- [BETA] Add operator primitive for `mindspore.ops.RightShift`. -- [BETA] Add operator primitive for `mindspore.ops.ScatterNdDiv`. -- [BETA] Add operator primitive for `mindspore.ops.ScatterNdMul`. -- [BETA] Add operator primitive for `mindspore.ops.SearchSorted`. -- [BETA] Add operator primitive for `mindspore.ops.Sinc`. -- [BETA] Add operator primitive for `mindspore.ops.Trace`. 
-- [BETA] Add operator primitive for `mindspore.ops.Tril`. -- [BETA] Add operator primitive for `mindspore.ops.TrilIndices`. -- [BETA] Add operator primitive for `mindspore.ops.TriuIndices`. -- [BETA] Add operator primitive for `mindspore.ops.UniqueConsecutive`. -- [STABLE] Add operator primitive for `mindspore.ops.Cummax`. -- [STABLE] Add operator primitive for `mindspore.ops.FillV2`. -- [STABLE] Add operator primitive for `mindspore.ops.IsClose`. -- [STABLE] Add operator primitive for `mindspore.ops.MatrixSolve`. -- [STABLE] Add operator primitive for `mindspore.ops.Median`. -- [STABLE] Add operator primitive for `mindspore.ops.MultilabelMarginLoss`. -- [STABLE] Add operator primitive for `mindspore.ops.NonZero`. -- [STABLE] Add operator primitive for `mindspore.ops.Pdist`. -- [STABLE] Add operator primitive for `mindspore.ops.Polar`. -- [STABLE] Add operator primitive for `mindspore.ops.RandomGamma`. -- [STABLE] Add operator primitive for `mindspore.ops.RandomPoisson`. -- [STABLE] Add operator primitive for `mindspore.ops.RandomShuffle`. -- [STABLE] Add operator primitive for `mindspore.ops.Renorm`. -- [STABLE] Add operator primitive for `mindspore.ops.ScatterNdMax`. -- [STABLE] Add operator primitive for `mindspore.ops.ScatterNdMin`. -- [STABLE] Add operator primitive for `mindspore.ops.Svd`. -- [STABLE] Add operator primitive for `mindspore.ops.TripletMarginLoss`. - -#### Deleted APIs - -- The `mindspore.compression` feature was deprecated at MindSpore 1.8 and is removed in this version. - The following `mindspore.nn.quant` interfaces are also removed simultaneously: `mindspore.nn.FakeQuantWithMinMaxObserver`, `mindspore.nn.Conv2dBnFoldQuantOneConv`, `mindspore.nn.Conv2dBnFoldQuant`, `mindspore.nn.Conv2dBnWithoutFoldQuant`, `mindspore.nn.Conv2dQuant`, `mindspore.nn.DenseQuant`, `mindspore.nn.ActQuant`, `mindspore.nn.TensorAddQuant`, `mindspore.nn.ActQuant`, `mindspore.nn.MulQuant`. 
Please use [MindSpore Golden Stick](https://gitee.com/mindspore/golden-stick) instead to implement QuantAwareTraining in MindSpore.
-- The `mindspore.dataset.close_pool`, `mindspore.dataset.to_device`, and `mindspore.dataset.set_dynamic_columns` interfaces were deprecated in an earlier version and are removed in this version.
-
-#### Backwards Incompatible Change
-
-- Interface: mindspore.set_context(mode=PYNATIVE_MODE)
-
-  Change: The default value is changed from GRAPH_MODE to PYNATIVE_MODE.
-
-  Description: If the running mode is not set and graph mode is required, set it explicitly:
-  mindspore.set_context(mode=GRAPH_MODE).
-
- - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  mindspore.set_context(mode=GRAPH_MODE)
-  
-
-  mindspore.set_context(mode=PYNATIVE_MODE)
-  
-
- -- Interface: mindspore.train.Model.train - - Change: The default value of dataset_sink_mode is changed from True to False. - - Description: If dataset_sink_mode is not set and the data sinking mode needs to be set, use the following method: - Model.train(dataset_sink_mode=True). - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  Model.train(dataset_sink_mode=True)
-  
-
-  Model.train(dataset_sink_mode=False)
-  
-
- -- Interface: mindspore.export - - Change: The file_format parameter is changed from AIR to no default value. - - Description: If file_format is not set in the original mode, you need to set file_format additionally. In this case, use the following method: - mindspore.export(net, *inputs, file_name, file_format="AIR", **kwargs). - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  mindspore.export(net, *inputs, file_name,
-                   file_format="AIR", **kwargs)
-  
-
-  mindspore.export(net, *inputs, file_name,
-                   file_format, **kwargs)
-  
-
- -- Interface: mindspore.ops.norm - - Change: The ord parameter function is extended to support multiple forms. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.norm(input_x, axis, p=2, keep_dims=False, epsilon=1e-12)
-  >>> # Example:
-  >>> input = Tensor(np.array([[[1.0, 2.0], [3.0, 4.0]],
-  ...                          [[5.0, 6.0], [7.0, 8.0]]]).astype(np.float32))
-  >>> output = ops.norm(input, [0, 1], p=2)
-  
-  ops.norm(A, ord=None, dim=None, keepdim=False, *, dtype=None)
-  >>> # Example:
-  >>> input = Tensor(np.array([[[1.0, 2.0], [3.0, 4.0]],
-  ...                          [[5.0, 6.0], [7.0, 8.0]]]).astype(np.float32))
-  >>> output = ops.norm(input, ord=2, dim=(0, 1))
-  
-
- -- Interface: mindspore.Tensor.norm - - Change: The ord parameter function is extended to support multiple forms. - - Description: For details, see the example of ops.norm. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  Tensor.norm(axis, p=2, keep_dims=False, epsilon=1e-12)
-  
-
-  Tensor.norm(ord=None, dim=None, keepdim=False, *, dtype=None)
-  
-
- -- Interface: mindspore.ops.dropout - - Change: The seed0 and seed1 parameters are deleted and seed=None parameter is added. Instead of returning Tensors and masks, only Tensors are returned. The input parameter training=True is added. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.dropout(x, p=0.5, seed0=0, seed1=0)
-  >>> # Example:
-  >>> input = Tensor(((20, 16), (50, 50)),
-  ...                mindspore.float32)
-  >>> output, mask = ops.dropout(input, p=0.5)
-  
-
-  ops.dropout(input, p=0.5, training=True, seed=None)
-  >>> # Example:
-  >>> input = Tensor(((20, 16), (50, 50)),
-  ...                mindspore.float32)
-  >>> output = ops.dropout(input, p=0.5, training=True)
-  
-
- -- Interface: mindspore.ops.dropout2d - - Change: Return value is changed from Tensor and mask to Tensor only. The input parameter training=True is added. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.dropout2d(x, p=0.5)
-  >>> # Example:
-  >>> input = Tensor(np.ones([2, 1, 2, 3]),
-  ...                mindspore.float32)
-  >>> output, mask = ops.dropout2d(input, 0.5)
-  
-
-  ops.dropout2d(input, p=0.5, training=True)
-  >>> # Example:
-  >>> input = Tensor(np.ones([2, 1, 2, 3]),
-  ...                mindspore.float32)
-  >>> output = ops.dropout2d(input, 0.5, training=True)
-  
-
- -- Interface: mindspore.ops.dropout3d - - Change: Return value is changed from Tensor and mask to Tensor only. The input parameter training=True is added. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.dropout3d(x, p=0.5)
-  >>> # Example:
-  >>> input = Tensor(np.ones([2, 1, 2, 3]),
-  ...                mindspore.float32)
-  >>> output, mask = ops.dropout3d(input, 0.5)
-  
-
-  ops.dropout3d(input, p=0.5, training=True)
-  >>> # Example:
-  >>> input = Tensor(np.ones([2, 1, 2, 3]),
-  ...                mindspore.float32)
-  >>> output = ops.dropout3d(input, 0.5, training=True)
-  
-
- -- Interface: mindspore.ops.std - - Change: The interface is reconstructed, and the interface usage mode is more consistent with user habits. - - Description: If parameter `unbiased` has been set, use the following alternative: `unbiased=False` -> `ddof=0`, `unbiased=True` -> `ddof=1`. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.std(input_x, axis=(), unbiased=True, keep_dims=False)
-  
-
-  ops.std(input, axis=None, ddof=0, keepdims=False)
-  
-
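The `unbiased` to `ddof` mapping described for `ops.std` can be illustrated without MindSpore. The following is a minimal pure-Python sketch; the helper names `std` and `ddof_from_unbiased` are hypothetical and only model the divisor used by the standard deviation, not the MindSpore implementation.

```python
import math


def ddof_from_unbiased(unbiased):
    """Map the old boolean flag to the new delta-degrees-of-freedom value."""
    return 1 if unbiased else 0


def std(values, ddof=0):
    """Standard deviation with divisor (n - ddof); illustrative helper only."""
    n = len(values)
    if n - ddof <= 0:
        raise ValueError("need more than ddof values")
    mean = sum(values) / n
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_deviations / (n - ddof))


data = [1.0, 2.0, 3.0, 4.0]
population_std = std(data, ddof=ddof_from_unbiased(False))  # divisor n
sample_std = std(data, ddof=ddof_from_unbiased(True))       # divisor n - 1
```

Judging by the two signatures in the table, the default also moves from `unbiased=True` to `ddof=0`, so code that relied on the old default would likely need to pass `ddof=1` explicitly to keep its results.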
- -- Interface: mindspore.load_param_into_net - - Change: Parameters that are not loaded in the ckpt are added as return values. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  net_param = load_param_into_net()
-  
-
-  net_param, ckpt_param = load_param_into_net()
-  
-
- -- Interface: mindspore.nn.BCELoss - - Change: The default value of `reduction` is changed from 'none' to 'mean'. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  BCELoss(weight=None, reduction='none')
-  >>> # Example:
-  >>> weight = Tensor(np.array([[1.0, 2.0, 3.0],
-  ...                           [4.0, 3.3, 2.2]]),
-  ...                 mindspore.float32)
-  >>> loss = nn.BCELoss(weight=weight, reduction='mean')
-  >>> logits = Tensor(np.array([[0.1, 0.2, 0.3],
-  ...                           [0.5, 0.7, 0.9]]),
-  ...                 mindspore.float32)
-  >>> labels = Tensor(np.array([[0, 1, 0], [0, 0, 1]]),
-  ...                 mindspore.float32)
-  >>> output = loss(logits, labels)
-  >>> print(output)
-  1.8952923
-  
-
-  BCELoss(weight=None, reduction='mean')
-  >>> # Example:
-  >>> weight = Tensor(np.array([[1.0, 2.0, 3.0],
-  ...                           [4.0, 3.3, 2.2]]),
-  ...                 mindspore.float32)
-  >>> loss = nn.BCELoss(weight=weight)
-  >>> logits = Tensor(np.array([[0.1, 0.2, 0.3],
-  ...                           [0.5, 0.7, 0.9]]),
-  ...                 mindspore.float32)
-  >>> labels = Tensor(np.array([[0, 1, 0], [0, 0, 1]]),
-  ...                 mindspore.float32)
-  >>> output = loss(logits, labels)
-  >>> print(output)
-  1.8952923
-  
-
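To see what the `reduction` default change means numerically, here is a small pure-Python sketch of weighted binary cross-entropy. The function `bce_loss` is a hypothetical stand-in written for illustration, not the `nn.BCELoss` implementation; it uses the flattened values from the example above.

```python
import math


def bce_loss(probabilities, labels, weights, reduction="mean"):
    """Weighted binary cross-entropy over flat lists (illustrative only)."""
    losses = [
        -w * (y * math.log(p) + (1 - y) * math.log(1 - p))
        for p, y, w in zip(probabilities, labels, weights)
    ]
    if reduction == "none":
        return losses
    if reduction == "sum":
        return sum(losses)
    if reduction == "mean":
        return sum(losses) / len(losses)
    raise ValueError(f"unsupported reduction: {reduction!r}")


weights = [1.0, 2.0, 3.0, 4.0, 3.3, 2.2]
logits = [0.1, 0.2, 0.3, 0.5, 0.7, 0.9]
labels = [0, 1, 0, 0, 0, 1]

per_element = bce_loss(logits, labels, weights, reduction="none")  # old default
mean_loss = bce_loss(logits, labels, weights)                      # new default
```

With `reduction="none"` the result keeps one loss per element, while the new default collapses it to a single scalar, so code that indexes into the old result would need to request `reduction='none'` explicitly.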
- -- Interface: mindspore.ops.split - - Change: The interface is reconstructed. The interface usage mode is more suitable for users. The sequence of the second and third parameters is adjusted, and the split_size_or_sections function is modified and extended. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.split(input_x, axis=0, output_num=1)
-  >>> # Example:
-  >>> input = Tensor(np.array([[1, 1, 1, 1], [2, 2, 2, 2]]),
-  ...                mindspore.int32)
-  >>> output = ops.split(input, axis=1, output_num=4)
-  
-
-  ops.split(tensor, split_size_or_sections, axis=0)
-  >>> # Example:
-  >>> input = Tensor(np.array([[1, 1, 1, 1], [2, 2, 2, 2]]),
-  ...                mindspore.int32)
-  >>> output = ops.split(input, split_size_or_sections=1, axis=1)
-  
-
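The difference between the old `output_num` argument (number of pieces) and the new `split_size_or_sections` argument (size of each piece, or a list of section sizes) can be sketched on plain Python lists. The helpers below are hypothetical and only model the splitting rule along one axis; the chunk-size reading of an integer `split_size_or_sections` is an assumption based on the example above.

```python
def split_by_count(items, output_num):
    """Old-style rule: split into `output_num` equally sized pieces."""
    if not items:
        return []
    if len(items) % output_num != 0:
        raise ValueError("length must be divisible by output_num")
    size = len(items) // output_num
    return [items[i:i + size] for i in range(0, len(items), size)]


def split_by_size(items, split_size_or_sections):
    """New-style rule: an int is the chunk size, a list gives section sizes."""
    if isinstance(split_size_or_sections, int):
        size = split_size_or_sections
        return [items[i:i + size] for i in range(0, len(items), size)]
    if sum(split_size_or_sections) != len(items):
        raise ValueError("section sizes must sum to the length")
    pieces, start = [], 0
    for size in split_size_or_sections:
        pieces.append(items[start:start + size])
        start += size
    return pieces


row = [1, 1, 1, 1]
old_pieces = split_by_count(row, output_num=4)
# Equivalent new-style call: chunk size = axis length // output_num.
new_pieces = split_by_size(row, len(row) // 4)
```

When migrating, `output_num=k` on an axis of length `n` corresponds to a chunk size of `n // k`, which is why the example above replaces `output_num=4` with `split_size_or_sections=1`.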
- -- Interface: mindspore.Tensor.split - - Change: The interface is reconstructed. The interface usage mode is more suitable for users. The positions of the two parameters is adjusted, and the split_size_or_sections function is modified and extended. - - Description: For details, see the example of ops.split. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  Tensor.split(axis=0, output_num=1)
-  
-
-  Tensor.split(split_size_or_sections, axis=0)
-  
-
- -- Interface: mindspore.ops.pad - - Change: Modify the parameter name paddings to padding, and the mode and value functions are added. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.pad(input_x, paddings)
-  >>> # Example:
-  >>> input_x = Tensor(np.array([[-0.1, 0.3, 3.6],
-  ...                            [0.4, 0.5, -3.2]]),
-  ...                  mindspore.float32)
-  >>> paddings = ((1, 2), (2, 1))
-  >>> output = ops.pad(input_x, paddings)
-  
-
-  ops.pad(input_x, padding, mode='constant', value=None)
-  >>> # Example:
-  >>> input_x = Tensor(np.array([[-0.1, 0.3, 3.6],
-  ...                            [0.4, 0.5, -3.2]]),
-  ...                  mindspore.float32)
-  >>> paddings = (2, 1, 1, 2)
-  >>> output = ops.pad(input_x, paddings)
-  
-
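The two `ops.pad` examples above also differ in how the padding tuple is laid out: nested per-dimension pairs in the old form, and a flat tuple in the new form. Assuming the flat form lists pairs starting from the last dimension (which is what the example values suggest), a conversion helper could look like this. `to_flat_padding` is a hypothetical name, not a MindSpore function.

```python
def to_flat_padding(paddings):
    """Convert ((before, after) per dimension, first dimension first)
    into a flat tuple that starts with the last dimension."""
    flat = []
    for before, after in reversed(paddings):
        flat.extend((before, after))
    return tuple(flat)


old_paddings = ((1, 2), (2, 1))
new_padding = to_flat_padding(old_paddings)
```

For the values in the table this yields `(2, 1, 1, 2)`, matching the new-interface example.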
-
-- Interface: mindspore.ops.meshgrid
-
-  Change: The input parameter is changed from `inputs` to `*inputs`.
-
- - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.meshgrid(inputs, indexing='xy')
-  >>> # Example:
-  >>> x = Tensor(np.array([1, 2, 3, 4]).astype(np.int32))
-  >>> y = Tensor(np.array([5, 6, 7]).astype(np.int32))
-  >>> z = Tensor(np.array([8, 9, 0, 1, 2]).astype(np.int32))
-  >>> output = ops.meshgrid((x, y, z), indexing='xy')
-  
-
-  ops.meshgrid(*inputs, indexing='xy')
-  >>> # Example:
-  >>> x = Tensor(np.array([1, 2, 3, 4]).astype(np.int32))
-  >>> y = Tensor(np.array([5, 6, 7]).astype(np.int32))
-  >>> z = Tensor(np.array([8, 9, 0, 1, 2]).astype(np.int32))
-  >>> output = ops.meshgrid(x, y, z, indexing='xy')
-  
-
-
-- Interface: mindspore.ops.max
-
-  Change: The order of the return values is swapped, from "index, value" to "value, index".
-
- - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.max(x, axis=0, keep_dims=False)
-  >>> # Example:
-  >>> input = Tensor(np.array([0.0, 0.4, 0.6, 0.7, 0.1]),
-  ...                mindspore.float32)
-  >>> index, output = ops.max(input)
-  >>> print(index, output)
-  3 0.7
-  
-
-  ops.max(input, axis=None, keepdims=False, *, initial=None, where=True, return_indices=False)
-  >>> # Example:
-  >>> input = Tensor(np.array([0.0, 0.4, 0.6, 0.7, 0.1]),
-  ...                mindspore.float32)
-  >>> output, index = ops.max(input, axis=0)
-  >>> print(output, index)
-  
-
-
-- Interface: mindspore.ops.min
-
-  Change: The order of the return values is swapped, from "index, value" to "value, index".
-
- - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.min(x, axis=0, keep_dims=False)
-  >>> # Example:
-  >>> input = Tensor(np.array([0.0, 0.4, 0.6, 0.7, 0.1]),
-  ...                mindspore.float32)
-  >>> index, output = ops.min(input)
-  >>> print(index, output)
-  0 0.0
-  
-
-  ops.min(input, axis=None, keepdims=False, *, initial=None, where=True, return_indices=False)
-  >>> # Example:
-  >>> input = Tensor(np.array([0.0, 0.4, 0.6, 0.7, 0.1]),
-  ...                mindspore.float32)
-  >>> output, index = ops.min(input, keepdims=True)
-  >>> print(output, index)
-  0.0 0
-  
-
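The return-order swap for `ops.max` and `ops.min` mostly affects tuple unpacking at call sites. The pure-Python sketch below models both orders on a list; the function names are hypothetical and exist only to show how an unpacking statement has to change.

```python
def max_index_value(values):
    """Legacy order: (index, value)."""
    index = max(range(len(values)), key=values.__getitem__)
    return index, values[index]


def max_value_index(values):
    """New order: (value, index)."""
    index, value = max_index_value(values)
    return value, index


data = [0.0, 0.4, 0.6, 0.7, 0.1]
index_old, value_old = max_index_value(data)  # old call sites unpack index first
value_new, index_new = max_value_index(data)  # new call sites unpack value first
```

Because both orders return a two-element tuple, an un-migrated call site will not raise an error; it will silently bind the value to the index name, so each unpacking statement is worth checking by hand.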
- -- Interface: mindspore.ops.random_gamma - - Change: The seed2 parameter is deleted and seed=0 is changed to None. The framework behavior is unified and complies with the actual application scenarios and habits of users. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.random_gamma(shape, alpha, seed=0, seed2=0)
-  
-
-  ops.random_gamma(shape, alpha, seed=None)
-  
-
- -- Interface: mindspore.ops.standard_laplace - - Change: The seed2 parameter is deleted and seed=0 is changed to None. The framework behavior is unified and complies with the actual application scenarios and habits of users. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.standard_laplace(shape, seed=0, seed2=0)
-  
-
-  ops.standard_laplace(shape, seed=None)
-  
-
- -- Interface: mindspore.ops.standard_normal - - Change: The seed2 parameter is deleted and seed=0 is changed to None. The framework behavior is unified and complies with the actual application scenarios and habits of users. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.standard_normal(shape, seed=0, seed2=0)
-  
-
-  ops.standard_normal(shape, seed=None)
-  
-
- -- Interface: mindspore.ops.bernoulli - - Change: The default value of seed is changed from -1 to None. Meets the actual application scenario. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  ops.bernoulli(x, p=0.5, seed=-1)
-  
-
-  ops.bernoulli(input, p=0.5, seed=None)
-  
-
- -- Interface: mindspore.data_sink - - Change: Deleted the steps parameter. Parameter name jit is changed to jit_config, and new input_signature parameter is added. The usability is improved to meet the requirements of actual application scenarios. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  mindspore.data_sink(fn, dataset, steps,
-                      sink_size=1, jit=False)
-  
-
-  mindspore.data_sink(fn, dataset, sink_size=1,
-                      jit_config=None, input_signature=None)
-  
-
- -- Interface: mindspore.ops.conv2d - - Change: Extend Interface Function. Add the bias parameter and modify the parameter name and parameter sequence. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  conv2d(inputs, weight, pad_mode="valid",
-         padding=0, stride=1, dilation=1, group=1)
-  
-
-  conv2d(input, weight, bias=None, stride=1,
-         pad_mode="valid", padding=0, dilation=1, groups=1)
-  
-
-
-- Interface: mindspore.dataset.vision.Pad
-
-  Change: Adjust the input parameter padding of Pad, RandomCrop, and RandomCropWithBbox. When padding has length 2, the behavior changes: previously the first value filled the left/upper boundary and the second value filled the right/lower boundary; now the first value fills the left/right boundary and the second value fills the upper/lower boundary.
-
-  Description: A padding parameter of size 2 is not compatible with the behavior of earlier versions. The padding needs to be specified explicitly (left, right, top, and bottom).
-
- - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  mindspore.dataset.vision.Pad(padding=(1,2))
-  Indicates that the left/upper part of the image is filled with 1 pixel,
-  and the right/down part is filled with 2 pixels.
-  
-
-  mindspore.dataset.vision.Pad(padding=(1,2,1,2))
-  Indicates that the left/upper part of the image is filled with 1 pixel,
-  and the right/down part is filled with 2 pixels.
-  
-
-
-- Interface: mindspore.dataset.Dataset.map
-
-  Change: Delete the column_order parameter. In most cases, output_columns and column_order have the same value, so column_order does not need to be passed. To adjust the order of data columns, use mindspore.dataset.Dataset.project.
-
-  Description:
-
-  1. If the column order does not need to be changed, delete the column_order parameter.
-  2. If you need to specify the column order, delete the column_order parameter and append a project call after the map call to reorder the columns (as in the following example).
-
- - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  >>> dataset = dataset.map(operations=[transforms],
-  ...                       input_columns=["column_a"],
-  ...                       output_columns=["column_b", "column_c"],
-  ...                       column_order=["column_c", "column_b"])
-  
-
-  >>> dataset = dataset.map(operations=[transforms],
-  ...                       input_columns=["column_a"],
-  ...                       output_columns=["column_b", "column_c"])
-  >>> dataset = dataset.project(["column_c", "column_b"])
-  
-
-
-- Interface: mindspore.dataset.Dataset.batch
-
-  Change: Delete the column_order parameter. In most cases, output_columns and column_order have the same value, so column_order does not need to be passed. To adjust the order of data columns, use mindspore.dataset.Dataset.project.
-
-  Description:
-
-  1. If the column order does not need to be changed, delete the column_order parameter.
-  2. If you need to specify the column order, delete the column_order parameter and append a project call after the batch call to reorder the columns (as in the following example).
-
- - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  >>> dataset = dataset.batch(batch_size=4,
-  ...                         input_columns=["column_a"],
-  ...                         output_columns=["column_b", "column_c"],
-  ...                         column_order=["column_c", "column_b"])
-  
-
-  >>> dataset = dataset.batch(batch_size=4, input_columns=["column_a"],
-  ...                         output_columns=["column_b", "column_c"])
-  >>> dataset = dataset.project(["column_c", "column_b"])
-  
-
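The `column_order` to `project` migration for `map` and `batch` amounts to reordering columns in a separate step. The sketch below models rows as dictionaries; `project` here is a hypothetical stand-in for `Dataset.project` that only illustrates the reordering, not the real dataset pipeline.

```python
def project(rows, columns):
    """Keep only the listed columns, in the listed order (illustrative only)."""
    return [{name: row[name] for name in columns} for row in rows]


# Rows as they might look after map/batch produced column_b and column_c.
rows = [
    {"column_b": 1, "column_c": 2},
    {"column_b": 3, "column_c": 4},
]

# The former column_order=["column_c", "column_b"] becomes a follow-up step.
reordered = project(rows, ["column_c", "column_b"])
```

If the desired order already equals `output_columns`, this extra step is unnecessary and `column_order` can simply be dropped.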
- -- Interface: mindspore.dataset.Dataset.batch - - Change: Split the batch method into two methods: batch and padded_batch. The pad_info parameter is moved from the batch method to the padded_batch method. - - Description: To use the pad_info parameter, use the padded_batch method instead. - - - - - - - - - -
Original Interface Interface v2.0.0-rc1
-  >>> dataset = dataset.batch(batch_size=4,
-  ...                         drop_remainder=True, pad_info=...)
-  
-
-  >>> dataset = dataset.padded_batch(batch_size=4,
-  ...                                drop_remainder=True, pad_info=...)
-  
-
- -### Bug fixes - -- [I66PE6] fix AssignSub primitive abnormal input leads to coredump. - -- [I6F5E6] fix data_sink function timeout on Ascend. - -### Others - -- Windows support is still being optimized,this version does not support now.It will be available for download in version 2.0. - -### Contributors - -Thanks goes to these wonderful people: - -alashkari,anzhengqi,archer2049,B.L.LAN,baihuawei,bichaoyang,BJ-WANG,Bokai Li,Brian-K,caifubi,caiyimeng,cathwong,changzherui,ChenDonYY,chenfei_mindspore,chengang,chengbin,chenhaozhe,chenjianping,chenkang,chenweifeng,chuht,chujinjin,davidanugraha,DavidFFFan,DeshiChen,douzhixing,emmmmtang,Erpim,Ethan,fangwenyi,fangzehua,fangzhou0329,fary86,fengyixing,gaoshuanglong,Gaoxiong,gaoyong10,gengdongjie,gongdaguo1,Greatpan,GuoZhibin,guozhijian,hangq,hanhuifeng,haozhang,hedongdong,Henry Shi,heterogeneous_to_backoff_2_0,huangbingjian,huanghui,huangxinjing,hujiahui8,hujingsong,huoxinyou,jachua,jiahongQian,jianghui58,jiangzhenguang,jiaorui,jiaoy1224,jijiarong,jjfeing,JoeyLin,json,JuiceZ,jxl,kairui_kou,KevinYi,kisnwang,KXiong,laiyongqiang,lanzhineng,liangchenghui,liangzelang,LiangZhibo,lianliguang,lichen,ligan,lijunbin,limingqi107,ling,linqingke,liubuyu,liuchao,liuchuting,liujunzhu,liuluobin,liutongtong9,liuyang811,lixiao,liyan2022,liyejun,liyuxia,looop5,luochao60,luojianing,luoyang,luoyuan,lyqlola,maning202007,maoyaomin,Margaret_wangrui,mayadong,MaZhiming,melody,mengyuanli,michaelzhu_70ab,Mohammad Motallebi,moran,NaCN,nomindcarry,OwenSec,panfengfeng,panshaowu,panzhihui,pkuliuliu,qinzheng,qiuzhongya,qujianwei,r1chardf1d0,Renyuan 
Zhang,RobinGrosman,shaojunsong,shenwei41,Soaringfish,tangdezhi_123,tanghuikang,tan-wei-cheng,TinaMengtingZhang,TronZhang,TuDouNi,VectorSL,wang_ziqi,wanghenchang,wangnan39,wangpingan,wangshaocong,wangshengnan123,wangtongyu6,weichaoran,wind-zyx,wqx,wtcheng,wujueying,wYann,XianglongZeng,xiaohanzhang,xiaotianci,xiaoyao,XinDu,xulei,xumengjuan1,xupan,xwkgch,yanghaoran,yangluhang,yangruoqi713,yangshuo,yangsijia,yangzhenzhang,yanzhenxiang2020,Yanzhi_YI,yao_yf,yefeng,yeyunpeng2020,Yi_zhang95,yide12,YijieChen,YingLai Lin,YingtongHu,youshu,yuchaojie,yuedongli,YuJianfeng,zangqx,ZengZitao,zhangbuxue,zhangdanyang,zhangdong,zhangfanghe,zhangqi,zhangqinghua,zhangyanhui,zhangyinxia,zhangyongxian,zhangzhaoju,zhanzhan,zhengzuohe,ZhidanLiu,zhixinaa,zhoufeng,zhouyaqiang0,zhuguodong,zhupuxu,zhuyuxiao,zichun_ye,zjun,zlq2020,zong_shuai,ZPaC,zuochuanyong,zyli2020,陈宇,范吉斌,冯一航,胡彬,宦晓玲,黄勇,雷元哲,李良灿,李林杰,刘崇鸣,刘力力,刘勇琪,吕浩宇,吕昱峰(Nate.River),没有窗户的小巷,沈竞兴,十六夜,王程浩,王禹程,王振邦,徐安越,徐永飞,杨旭华,于振华,俞涵,张清华,张澍坤,张栩浩,张学同,赵英灼,周超,周洪叶,朱家兴 - -Contributions of any kind are welcome! - -## MindSpore Lite 2.0.0-rc1 Release Notes - -### Major Features and Improvements - -#### MindSpore Lite Cloud Inference - -The original MindSpore Lite is mainly used for edge devices such as mobile phones and head units. Cloud inference is added to support scenarios with multiple backend hardware resources on the cloud, supports Ascend and NVIDIA GPU inference cards, and efficiently utilizes multi-core resources on the cloud. - -The original cloud inference integrated through MindSpore training can be changed to MindSpore Lite. For details, see [Quick Start to Cloud-side Inference](https://mindspore.cn/lite/docs/en/r2.0/quick_start/one_hour_introduction_cloud.html). To retain the original integration method, see [Inference](https://mindspore.cn/docs/en/r2.0/faq/inference.html). - -- [STABLE] Support MindIR model files. 
-- [STABLE] Third-party Onnx, TensorFlow, and Caffe models can be converted to MindIR model files using the MindSpore Lite conversion tool. -- [STABLE] One release package supports multiple hardware backends: Ascend, NVIDIA GPU, CPU. -- [STABLE] Supports the `Model` interface and `ModelParallelRunner` concurrent inference interface. -- [STABLE] Supports C++, Python, and Java inference interfaces. - -#### API - -- Due to the defects of the original Python API that many configuration parameters and complex usage, the usability of The Python APIs are optimized in version 2.0. The optimizations include class construction methods and class attribute adjustment. In addition, the Python APIs in version 2.0 and later will be integrated into the cloud-side inference scenario, which are incompatible with Python APIs of the earlier versions. For details, see [Python API](https://www.mindspore.cn/lite/api/en/r2.0/mindspore_lite.html). - -## MindSpore 2.0.0-alpha Release Notes - -### Major Features and Improvements - -#### PyNative - -- The default mode of MindSpore is switched to PyNative. If you want to manually set the mode, please refer to [Computational Graph](https://www.mindspore.cn/tutorials/en/r2.0.0-alpha/advanced/compute_graph.html). -- Support dynamic shape without padding, three networks are supported as demos: Transformer-GPU, YOLOV5-GPU, ASR-Ascend. Transformer-GPU and YOLOV5-GPU can be downloaded from [models](https://gitee.com/mindspore/models/tree/dynamic_shape). 
Only the following operators are available on the Ascend backend: Add, Assign, BatchMatMul, BiasAdd, BiasAddGrad, Cast, Conv2D, Conv2DBackpropFilter, Conv2DBackpropInput, CTCLoss, Div, Dropout, DropoutDoMask, Equal, ExpandDims, Gather, GetNext, LayerNorm, LayerNormGrad, LessEqual, Load, Log, LogicalAnd, LogicalNot, LogicalOr, LogSoftmax, LogSoftmaxGrad, MatMul, Maximum, Mul, Neg, NotEqual, NPUAllocFloatStatus, NPUClearFloatStatus, OneHot, RealDiv, Reciprocal, ReduceMean, ReduceSum, ReLU, ReluGrad, Reshape, Select, Softmax, StridedSlice, Sub, Tile, Transpose, UnsortedSegmentSum, ZerosLike. The remaining operators have not been fully verified; please use them as appropriate. - -#### DataSet - -- The TFRecordDataset API can directly read TFRecord files compressed by GZIP or ZLIB. -- The NumpySlicesDataset API can process data of different dimensions at the same time. -- Optimize the structure of the error log to display clearer call stack information for debugging. -- Fixed an issue where `mindspore.dataset.config.set_seed` did not take effect for random seeds in distributed training scenarios. - -#### AutoParallel - -- Supports more operators with distributed implementations. - - Element Wise Operators: AddN, BitwiseAnd, BitwiseOr, BitwiseXor, CumProd, HShrink, HSigmoid, IsFinite, Mish, MulNoNan, Rint, SeLU, SoftShrink, TruncateDiv, TruncateMod, Xdivy, Xlogy, InplaceAdd, InplaceSub, InplaceUpdate, Cdist, L2Loss, Lerp. - - Math Operators: SquaredDifference, Erfinv, MaskedFill, SplitV, Gamma, KLDivLoss, LinSpace. - - Scatter Operators: ScatterAdd, ScatterDiv, ScatterMax, ScatterMul, ScatterNdAdd, ScatterNdSub, ScatterNdUpdate, ScatterSub, TensorScatterAdd, TensorScatterDiv, TensorScatterMax, TensorScatterMax, TensorScatterMul, TensorScatterAdd, TensorScatterUpdate. - -- Add new APIs `transform_checkpoints` and `transform_checkpoint_by_rank` to transform the distributed checkpoint files by strategy files. 
Please refer to [Distributed Resilience Training and Inference](https://www.mindspore.cn/tutorials/experts/en/r2.0.0-alpha/parallel/resilience_train_and_predict.html). - -### API Change - -#### operator - -- [STABLE] Add operator primitive for `mindspore.ops.AdaptiveMaxPool3D`. -- [STABLE] Add operator primitive for `mindspore.ops.AdjustHue`. -- [STABLE] Add operator primitive for `mindspore.ops.BartlettWindow`. -- [STABLE] Add operator primitive for `mindspore.ops.BesselJ0`. -- [STABLE] Add operator primitive for `mindspore.ops.BesselJ1`. -- [STABLE] Add operator primitive for `mindspore.ops.BesselK0`. -- [STABLE] Add operator primitive for `mindspore.ops.BesselK0e`. -- [STABLE] Add operator primitive for `mindspore.ops.BesselK1`. -- [STABLE] Add operator primitive for `mindspore.ops.BesselK1e`. -- [STABLE] Add operator primitive for `mindspore.ops.BesselY0`. -- [STABLE] Add operator primitive for `mindspore.ops.BesselY1`. -- [STABLE] Add operator primitive for `mindspore.ops.Betainc`. -- [STABLE] Add operator primitive for `mindspore.ops.Bincount`. -- [STABLE] Add operator primitive for `mindspore.ops.BlackmanWindow`. -- [STABLE] Add operator primitive for `mindspore.ops.Bucketize`. -- [STABLE] Add operator primitive for `mindspore.ops.CombinedNonMaxSuppression`. -- [STABLE] Add operator primitive for `mindspore.ops.CompareAndBitpack`. -- [STABLE] Add operator primitive for `mindspore.ops.Complex`. -- [STABLE] Add operator primitive for `mindspore.ops.DataFormatVecPermute`. -- [STABLE] Add operator primitive for `mindspore.ops.EuclideanNorm`. -- [STABLE] Add operator primitive for `mindspore.ops.Expand`. -- [STABLE] Add operator primitive for `mindspore.ops.ExtractGlimpse`. -- [STABLE] Add operator primitive for `mindspore.ops.FillDiagonal`. -- [STABLE] Add operator primitive for `mindspore.ops.FractionalAvgPool`. -- [STABLE] Add operator primitive for `mindspore.ops.FractionalMaxPool`. -- [STABLE] Add operator primitive for `mindspore.ops.Gcd`. 
-- [STABLE] Add operator primitive for `mindspore.ops.HammingWindow`. -- [STABLE] Add operator primitive for `mindspore.ops.Histogram`. -- [STABLE] Add operator primitive for `mindspore.ops.HSVToRGB`. -- [STABLE] Add operator primitive for `mindspore.ops.Lcm`. -- [STABLE] Add operator primitive for `mindspore.ops.LeftShift`. -- [STABLE] Add operator primitive for `mindspore.ops.ListDiff`. -- [STABLE] Add operator primitive for `mindspore.ops.LogSpace`. -- [STABLE] Add operator primitive for `mindspore.ops.Lstsq`. -- [STABLE] Add operator primitive for `mindspore.ops.MatrixDiagPartV3`. -- [STABLE] Add operator primitive for `mindspore.ops.MatrixDiagV3`. -- [STABLE] Add operator primitive for `mindspore.ops.MatrixExp`. -- [STABLE] Add operator primitive for `mindspore.ops.MatrixPower`. -- [STABLE] Add operator primitive for `mindspore.ops.MaxPool3DWithArgmax`. -- [STABLE] Add operator primitive for `mindspore.ops.MaxUnpool2D`. -- [STABLE] Add operator primitive for `mindspore.ops.MultilabelMarginLoss`. -- [STABLE] Add operator primitive for `mindspore.ops.NextAfter`. -- [STABLE] Add operator primitive for `mindspore.ops.Orgqr`. -- [STABLE] Add operator primitive for `mindspore.ops.ReduceStd`. -- [STABLE] Add operator primitive for `mindspore.ops.RGBToHSV`. -- [STABLE] Add operator primitive for `mindspore.ops.RightShift`. -- [STABLE] Add operator primitive for `mindspore.ops.SampleDistortedBoundingBoxV2`. -- [STABLE] Add operator primitive for `mindspore.ops.ScaleAndTranslate`. -- [STABLE] Add operator primitive for `mindspore.ops.ScatterAddWithAxis`. -- [STABLE] Add operator primitive for `mindspore.ops.ScatterNdDiv`. -- [STABLE] Add operator primitive for `mindspore.ops.ScatterNdMax`. -- [STABLE] Add operator primitive for `mindspore.ops.ScatterNdMul`. -- [STABLE] Add operator primitive for `mindspore.ops.STFT`. -- [STABLE] Add operator primitive for `mindspore.ops.Trace`. -- [STABLE] Add operator primitive for `mindspore.ops.UpsampleNearest3D`. 
-- [STABLE] Add operator primitive for `mindspore.ops.UpsampleTrilinear3D`. -- [STABLE] Add distributed weight conversion interface `mindspore.parallel.transform_checkpoints`. -- [STABLE] Add distributed weight conversion interface `mindspore.parallel.transform_checkpoint_by_rank`. - -#### Backwards Incompatible Change - -##### Python API - -- The `mindspore.ms_function` interface is renamed to `mindspore.jit`, and `mindspore.ms_function` will be deprecated and removed in a future version. -- The `mindspore.ms_class` interface is renamed to `mindspore.jit_class`, and `mindspore.ms_class` will be deprecated and removed in a future version. -- The `mindspore.ops.ms_kernel` interface is renamed to `mindspore.ops.kernel`, and `mindspore.ops.ms_kernel` will be deprecated and removed in a future version. -- The `column_order` parameter of the `mindspore.dataset.map` interface does not take effect; use `mindspore.dataset.project` instead. -- The `mindspore.dataset.close_pool`, `mindspore.dataset.to_device`, and `mindspore.dataset.set_dynamic_columns` interfaces are deprecated and removed in this version. 
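Code that has to run on releases both before and after the `ms_function` to `jit` rename can look the decorator up by name. The following is a minimal sketch, not part of MindSpore: `resolve_jit` and its pass-through fallback are hypothetical names introduced for illustration, and only `mindspore.jit` and `mindspore.ms_function` come from the notes above.

```python
import importlib


def resolve_jit(module_name="mindspore"):
    """Return the JIT decorator under its new name, else under its old one.

    Hypothetical helper for illustration; it is not a MindSpore API.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        module = None

    # Prefer the new name (`jit`), then the deprecated one (`ms_function`).
    for name in ("jit", "ms_function"):
        decorator = getattr(module, name, None)
        if callable(decorator):
            return decorator

    # Assumption: if neither name is available, leave the function undecorated.
    def passthrough(fn):
        return fn

    return passthrough


jit = resolve_jit()
```

Decorating with `@jit` then picks up whichever name the installed release provides. The pass-through fallback is a design choice of this sketch, so a script still runs where MindSpore is absent; drop it if a missing decorator should be an error.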
- -### Bug fixes - -- Fixed an issue where the mixed precision functional interface could not modify the backend driver in graph mode -- Fixed the problem that users can automatically transfer device_id in the single-P scenario for the following networks:(mobilenetv1/fasterrcnn/yolov3/yolov4/yolov5/unet/openpose/simplepose/crnn/gnmtv2/faceattribute/facequality/facedetection) - -### Contributors - -Thanks goes to these wonderful people: - -AGroupofProbiotocs, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, dong-li001, fary86, fuzhiye, Gaoxiong, GAO_HYP_XYJ, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, heleiwang, hesham, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Jiabin Liu, jianghui58, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, liuyongqi, laiyongqiang, leonwanghui, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, Lin Xh, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, Lixia, lixian, liyanliu, liyong, lizhenyu, luopengting, lvchangquan, lvliang, lz, maning202007, Margaret_wangrui, mengyuanli, Ming_blue, ms_yan, ougongchang, panfengfeng, panyifeng, Payne, Peilin, peixu_ren, Pengyongrong, qianlong, qianjiahong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, , Wan, wandongdong, wangdongxu, wangmin, wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wudenggang, wukesong, wuweikang, wuxuejian, Xiao Tianci, Xiaoda, xiefangqi, xinyunfan, xuanyue, xuyongfei, yanghaitao, 
yanghaitao1, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, yuximiao, zengzitao, Zhang, zhanghuiyao, zhanghui_china, zhangxinfeng3, zhangyihui, zhangz0911gm, zhanke, zhanyuan, zhaodezan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, zhiqwang, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, Zichun, Ziyan, zjun, ZPaC, wangfengwfwf, zymaa, gerayking, shu-kun-zhang. - -Contributions of any kind are welcome! - -## MindSpore 1.10.1 Release Notes - -### Bug fixes - -- Fixed an issue where the specified axis was not considered in logsumexp anti-overflow processing -- Fixed the compilation dependency of the proto file -- Fixed an issue where the output of the print operator was abnormal -- Fixed an issue where the equal operator was out of range -- Fixed an issue where the cell id was incorrect when a function is wrapped by @jit -- Fixed the data type verification error in the GNN scenario -- Fixed an issue where dataset.map multi-processing degenerated into threads - -### Contributors - -Thanks goes to these wonderful people: - -archer2049, caifubi, chenfei_mindspore, gaoshuanglong, Greatpan, guozhijian, huoxinyou, Kxiong, lanzhineng, lijunbin, liubuyu, liuchuting, luochao60, lyqlola, nomindcarry, TuDouNi, xiaotianci, xupan, yangshuo, yefeng, YingtongHu, yuchaojie, zhoufeng, ZPaC, 刘勇琪, 吕昱峰, 王禹程, 于振华. - -Contributions of any kind are welcome! - -## MindSpore 1.10.0 Release Notes - -### Major Features and Improvements - -#### DataSet - -- [STABLE] The timeout waiting time is adjusted in data sinking mode. The default value is 600s after the adjustment. This solves the issue that the GetNext operator may time out due to environment resource competition and large computing workload when training in sink mode. - -### Bug fixes - -- Fixed an issue where some Primitive operators in AMP cannot be instantiated in graph mode and the interface is unavailable. 
-- Fixed an issue of DynamicRNN execution failure in LSTM network under the scenario of computational force segmentation on Ascend platform. -- Fixed DEVICE_ID cannot be set by single card train scripts parameters in mobilenet, fasterrcnn, yolo, etc. - -### Contributors - -Thanks goes to these wonderful people: - -AGroupofProbiotocs, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, dong-li001, fary86, fuzhiye, Gaoxiong, GAO_HYP_XYJ, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, heleiwang, hesham, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Jiabin Liu, jianghui58, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, liuyongqi, laiyongqiang, leonwanghui, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, Lin Xh, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, Lixia, lixian, liyanliu, liyong, lizhenyu, luopengting, lvchangquan, lvliang, lz, maning202007, Margaret_wangrui, mengyuanli, Ming_blue, ms_yan, ougongchang, panfengfeng, panyifeng, Payne, Peilin, peixu_ren, Pengyongrong, qianlong, qianjiahong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, , Wan, wandongdong, wangdongxu, wangmin, wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wudenggang, wukesong, wuweikang, wuxuejian, Xiao Tianci, Xiaoda, xiefangqi, xinyunfan, xuanyue, xuyongfei, yanghaitao, yanghaitao1, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, 
yujianfeng, yuximiao, zengzitao, Zhang, zhanghuiyao, zhanghui_china, zhangxinfeng3, zhangyihui, zhangz0911gm, zhanke, zhanyuan, zhaodezan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, zhiqwang, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, Zichun, Ziyan, zjun, ZPaC, wangfengwfwf, zymaa, gerayking, shu-kun-zhang. - -Contributions of any kind are welcome! - -## MindSpore Lite 1.10.0 Release Notes - -### Bug fixes - -- Fixed a potential accuracy problem of arithmetic-type CPU kernels in dynamic shape cases. -- Fixed the incorrect write address of the Deconv quantization operator. - -## MindSpore 1.9.0 Release Notes - -### Major Features and Improvements - -#### FrontEnd - -- [STABLE] Add the object-oriented and functional combination programming paradigm, and add mixed-precision APIs for combination programming paradigms, such as `mindspore.amp.LossScaler`, `mindspore.amp.DynamicLossScaler`, `mindspore.amp.StaticLossScaler`, `mindspore.amp.auto_mixed_precision` and `mindspore.amp.all_finite`. - -### API Change - -#### operator - -- [STABLE] Add nn interface for `nn.AdaptiveAvgPool3d`. -- [STABLE] Add functional interface for `ops.adaptive_avg_pool3d`. -- [STABLE] Add functional interface for `ops.addcdiv`. -- [STABLE] Add functional interface for `ops.addcmul`. -- [STABLE] Add GPU and CPU support for `ops.approximate_equal`. -- [STABLE] Add GPU support for `ops.atanh`. -- [STABLE] Add GPU support for `ops.bessel_i0`. -- [STABLE] Add Ascend support for `ops.bessel_i0e`. -- [STABLE] Add GPU support for `ops.bessel_i1`. -- [STABLE] Add Ascend and GPU support for `ops.bessel_i1e`. -- [STABLE] Add GPU support for `ops.bessel_j0`. -- [STABLE] Add GPU support for `ops.bessel_j1`. -- [STABLE] Add GPU support for `ops.bessel_k0`. -- [STABLE] Add GPU support for `ops.bessel_k0e`. -- [STABLE] Add GPU support for `ops.bessel_k1`. -- [STABLE] Add GPU support for `ops.bessel_k1e`. -- [STABLE] Add GPU support for `ops.bessel_y0`. -- [STABLE] Add GPU support for `ops.bessel_y1`. 
-- [STABLE] Add functional interface for `ops.bias_add`. -- [STABLE] Add GPU support for `ops.bitwise_and`. -- [STABLE] Add GPU support for `ops.bitwise_or`. -- [STABLE] Add GPU support for `ops.bitwise_xor`. -- [STABLE] Add Ascend support for `ops.grid_sample`. -- [STABLE] Add CPU support for `ops.inplace_update`. -- [STABLE] Add Ascend and GPU support for `ops.isclose`. -- [STABLE] Add Ascend support for `ops.isnan`. -- [STABLE] Add GPU support for `ops.lerp`. -- [STABLE] Add functional interface for `ops.random_poisson`. -- [STABLE] Add functional interface for `ops.reverse_sequence`. -- [STABLE] Add GPU support for `ops.scatter_mul`. -- [STABLE] Add functional interface for `ops.scatter_nd_max`. -- [STABLE] Add functional interface for `ops.scatter_nd_min`. -- [STABLE] Add GPU support for `ops.SparseToDense`. -- [STABLE] Add functional interface for `ops.square`. -- [STABLE] Add GPU support for `ops.standard_laplace`. -- [STABLE] Add functional interface for `ops.std`. -- [STABLE] Add Ascend and GPU support for `ops.trunc`. -- [STABLE] Add functional interface for `ops.unsorted_segment_sum`. -- [STABLE] Add functional interface for `ops.xdivy`. -- [STABLE] Add GPU support for `ops.xlogy`. -- Deprecate `ops.poisson` and use `ops.random_poisson` instead. -- Deprecate `ops.SparseApplyAdagrad` and use `ops.SparseApplyAdagradV2` instead. - -### Bug fixes - -- [BUGFIX] The logic of the auto mixed precision (amp) O2 level is revised. In addition to the `BatchNorm1d` and `BatchNorm2d` operators, the other two operators `BatchNorm3d` and `LayerNorm` are added. The four operators still use the float32 data type when calculating. - -- [BUGFIX] Fix the problem that when processing string type data, if `output_numpy=True` is specified when calling the `create_dict_iterator` or `create_tuple_iterator` interface, the obtained data will be of type `numpy.bytes_`. 
After this fixing, these interfaces will directly return `numpy.str_` type data, and users do not need to perform string decoding operations on it. Likewise, when performing user defined processing functions, the received data will also be of type `numpy.str_` directly, matching the original source data type. - -### Contributors - -Thanks goes to these wonderful people: - -AGroupofProbiotocs, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, dong-li001, fary86, fuzhiye, Gaoxiong, GAO_HYP_XYJ, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, hesham, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Jiabin Liu, jianghui58, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, liuyongqi, laiyongqiang, leonwanghui, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, Lin Xh, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, liyanliu, lizhenyu, lvchangquan, lvliang, lz, maning202007, Margaret_wangrui, mengyuanli, Ming_blue, ms_yan, panfengfeng, panyifeng, Payne, peixu_ren, Pengyongrong, qianjiahong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, Wan, wandongdong, wangdongxu, wangmin, wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wudenggang, wukesong, wuweikang, Xiao Tianci, Xiaoda, xiefangqi, xinyunfan, xuanyue, xuyongfei, yanghaitao, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, yuximiao, zengzitao, Zhang, 
zhanghuiyao, zhanghui_china, zhangxinfeng3, zhangyihui, zhangz0911gm, zhanyuan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, zhiqwang, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, Zichun, Ziyan, zjun, ZPaC, wangfengwfwf, zymaa, gerayking, shu-kun-zhang. - -Contributions of any kind are welcome! - -## MindSpore 1.8.1 Release Notes - -### API Change - -#### operator - -- [STABLE] Add GPU and CPU support for ops.ApplyAdagradDA. -- [STABLE] Add CPU support for ops.ApplyAdagradV2. -- [STABLE] Add Ascend dynamic shape support for ops.ApplyCenteredRmsProp. -- [STABLE] Add CPU support for ops.ApplyFtrl. -- [STABLE] Add CPU support for ops.ApplyGradientDescent. -- [STABLE] Add CPU support for ops.ApplyPowerSign. -- [STABLE] Add GPU and CPU support for ops.ApplyProximalAdagrad. -- [STABLE] Add Ascend dynamic shape support for ops.ApplyRmsProp. -- [STABLE] Add functional interface for ops.max. -- [STABLE] Add functional interface for ops.atan2. -- [STABLE] Add GPU support for ops.cummax. -- [STABLE] Add GPU and CPU support for ops.cummin. -- [STABLE] Add GPU support for ops.diag. -- [STABLE] Add functional interface for ops.expand_dims. -- [STABLE] Add functional interface for ops.gather_elements. -- [STABLE] Add GPU support for ops.grid_sample. -- [STABLE] Add Ascend support for ops.hardswish. -- [BETA] Add GPU support for ops.index_fill. -- [BETA] Add CPU support for ops.inplace_update. -- [BETA] Add GPU support for nn.InstanceNorm1d. -- [BETA] Add GPU support for nn.InstanceNorm2d. -- [BETA] Add GPU support for nn.InstanceNorm3d. -- [STABLE] Add functional interface for ops.log1p. -- [STABLE] Add GPU and CPU support for ops.masked_fill. -- [BETA] Add GPU support for ops.matrix_diag_part. -- [BETA] Add GPU support for ops.matrix_diag. -- [BETA] Add GPU support for ops.matrix_set_diag. -- [STABLE] Add GPU support for ops.max_pool3d. -- [STABLE] Add functional interface for ops.nll_loss. -- [STABLE] Add functional interface for ops.one_hot. 
-- [STABLE] Add functional interface for ops.pad. -- [STABLE] Add CPU support for ops.random_gamma. -- [STABLE] Add functional interface for ops.amax. -- [STABLE] Add functional interface for ops.mean. -- [STABLE] Add functional interface for ops.amin. -- [STABLE] Add functional interface for ops.prod. -- [STABLE] Add Ascend, GPU, and CPU support for ops.renorm. -- [BETA] Add Ascend, GPU, and CPU support for ops.tensor_scatter_elements. -- [STABLE] Add GPU support for ops.scatter_max. -- [STABLE] Add GPU support for ops.scatter_min. -- [STABLE] Add functional interface for ops.scatter_nd. -- [STABLE] Add GPU support for ops.scatter_nd_max. -- [STABLE] Add functional interface for ops.scatter_update. -- [STABLE] Add CPU support for ops.binary_cross_entropy_with_logits. -- [STABLE] Add functional interface for ops.smooth_l1_loss. -- [STABLE] Add CPU support for ops.space_to_batch_nd. -- [STABLE] Add GPU and CPU support for ops.SparseApplyAdagrad. -- [STABLE] Add GPU and CPU support for ops.sparse_segment_mean. -- [STABLE] Add functional interface for ops.squeeze. -- [STABLE] Add CPU support for ops.standard_laplace. -- [BETA] Add Ascend, GPU, and CPU support for nn.ReflectionPad1d. -- [BETA] Add Ascend, GPU, and CPU support for nn.ReflectionPad2d. -- [STABLE] Add Ascend, GPU, and CPU support for nn.SiLU. -- [STABLE] Add functional interface for ops.transpose. -- [STABLE] Add CPU support for ops.uniform_candidate_sampler. -- [STABLE] Add functional interface for ops.uniform. -- [STABLE] Add GPU support for ops.unique_with_pad. -- [STABLE] Add functional interface for ops.unstack. -- [BETA] Add GPU and CPU support for ops.interpolate. -- [STABLE] Add CPU support for ops.xdivy. -- [STABLE] Add CPU support for ops.xlogy. - -## MindSpore 1.8.0 Release Notes - -### Major Features and Improvements - -#### FrontEnd - -- [BETA] Add `mindspore.train.Model.fit` API, add `mindspore.train.callback.EarlyStopping` and `mindspore.train.callback.ReduceLROnPlateau` in Callback. 
-- [BETA] Support custom operator implemented by Julia. -- [BETA] Support custom operator implemented by MindSpore Hybrid DSL. -- [STABLE] The export() interface supports the export of a model using a custom encryption algorithm, and the load() interface supports the import of a model using a custom decryption algorithm. -- [BETA] [Unified_Dynamic_and_Static_Graphs] [Usability] Constant-type data (tuple/list/dict is supported in Version 1.8) can be set to be variable during graph compiling. -- [BETA] [Unified_Dynamic_and_Static_Graphs] JIT fallback is used to support the control flow capability in the constant scenario. -- [STABLE] [Unified_Dynamic_and_Static_Graphs] The Python raise statement is supported in the graph mode constant scenario. -- [STABLE] [Unified_Dynamic_and_Static_Graphs] The Python assert statement is supported in the graph mode constant scenario. -- [STABLE] [Unified_Dynamic_and_Static_Graphs] The Python print statement is supported in the graph mode constant scenario. -- [STABLE] [Unified_Dynamic_and_Static_Graphs] The str.format() method is supported in the graph mode. -- [STABLE] [Unified_Dynamic_and_Static_Graphs] The slice method can be used to assign a value to the list in the graph mode. -- [STABLE] [Unified_Dynamic_and_Static_Graphs] The instances of custom classes can be created and invoked in the graph mode. -- [STABLE] [Unified_Dynamic_and_Static_Graphs] Obtaining the properties of a class from the Cell array and the custom class array is supported. -- [STABLE] [Unified_Dynamic_and_Static_Graphs] isinstance supports scenario expanding in the graph mode. -- [STABLE] Rename the custom operator decorator 'ms_hybrid' to 'ms_kernel'. -- [BETA] Custom operator Hybrid DSL is supported on the backend of CPU. -- [BETA] Custom operator Ascend backend adds custom scheduling primitive syntax support. - -#### PyNative - -- [STABLE] Implement the AdamWeightDecay operator to replace the original small operator combination mode. 
-- [STABLE] In PyNative mode, execute the optimizer by unifying the dynamic and static graphs. -- [STABLE] Optimize the execution performance of PyNative bprop graph and ms_function. - -#### Auto Parallel - -- [STABLE] Support the AllToAll single-operator mode. The AllToAll operator is supported at graph compilation level O0. -- [STABLE] Whole-graph offloading supports launching with MPI. -- [STABLE] Seeds of model weights provide parallel interface configuration. If you do not set the random seed through mindspore.set_seed, the weights initialized for each parameter are determined by the current shard index. If the random seed is configured, weights with the same shape and the same sharding strategy are initialized to the same values. -- [STABLE] The HCCL shields internal full-mesh and non-full-mesh connections. Both fully-connected AllToAllv and hierarchical AllToAllv are allowed in one training session. -- [BETA] CPU optimizer fusion. Multiple optimizer operators are combined according to data types through cross-parameter fusion, improving performance. Currently, it has been verified on the CPU AdamWeightDecay optimizer. You can use the flatten_weights method in the network cell class to enable this function. - -#### Executor - -- [STABLE] Provide southbound API. -- [STABLE] Multi-actor fusion execution to optimize the execution performance during runtime. -- [STABLE] Eliminate the execution of nop operators (e.g. reshape). -- [STABLE] Embedded cache architecture switches to the unified distributed runtime. -- [STABLE] Parameter Server switches to the unified distributed runtime. -- [STABLE] Support Parameter Server mode training on CPU. 
- -#### DataSet - -- [STABLE] When the map operation is used on dataset objects with num_parallel_workers > 1 and python_multiprocessing=True, the multi-process mechanism is optimized so that data channels and child processes are mapped one to one, avoiding excessive file handle usage. The closing_pool interface is also removed. -- [STABLE] Add a batch of Vision, Text and Audio data augmentation operations. -- [STABLE] Fix a bug where the flat_map method of the Dataset class does not flatten the result. -- [STABLE] Unify import paths of dataset augmentation APIs to make them easier to use. Refer to [latest api usages](https://www.mindspore.cn/docs/en/r1.8/api_python/mindspore.dataset.vision.html). - -### API Change - -#### operator - -- [STABLE] Add GPU support for ops.adaptive_avg_pool2d. -- [BETA] Add Ascend, GPU, and CPU support for ops.adaptive_max_pool2d. -- [BETA] Add CPU support for ops.approximate_equal. -- [STABLE] Add CPU support for ops.argmin. -- [BETA] Add CPU support for ops.assign_sub. -- [STABLE] Add GPU support for ops.bernoulli. -- [BETA] Add CPU support for ops.bessel_i0. -- [BETA] Add CPU support for ops.bessel_i0e. -- [BETA] Add CPU support for ops.bessel_i1. -- [BETA] Add CPU support for ops.bessel_i1e. -- [STABLE] Add CPU support for ops.bessel_j0. -- [STABLE] Add CPU support for ops.bessel_j1. -- [STABLE] Add CPU support for ops.bessel_k0. -- [STABLE] Add CPU support for ops.bessel_k0e. -- [BETA] Add CPU support for ops.bessel_k1. -- [BETA] Add CPU support for ops.bessel_k1e. -- [STABLE] Add CPU support for ops.bessel_y0. -- [STABLE] Add CPU support for ops.bessel_y1. -- [STABLE] Add CPU support for ops.bitwise_and. -- [STABLE] Add CPU support for ops.bitwise_or. -- [STABLE] Add CPU support for ops.bitwise_xor. -- [STABLE] Add functional interface for ops.broadcast_to. -- [BETA] Add GPU and CPU support for ops.ceil. -- [BETA] Add GPU support for ops.col2im. 
-- [BETA] Add functional interface for ops.concat. -- [STABLE] Add GPU support for ops.cosh. -- [STABLE] Add Ascend and CPU support for ops.ctc_greedy_decoder. -- [BETA] Add GPU and CPU support for ops.DataFormatDimMap. -- [BETA] Add GPU and CPU support for ops.dropout2d. -- [BETA] Add CPU support for ops.dropout3d. -- [BETA] Add CPU support for ops.erf. -- [BETA] Add CPU support for ops.erfc. -- [STABLE] Add functional interface for ops.expand_dims. -- [STABLE] Add GPU and CPU support for ops.fast_gelu. -- [STABLE] Add Ascend dynamic shape support for ops.flatten. -- [BETA] Add GPU and CPU support for ops.ger. -- [STABLE] Add Ascend, GPU, and CPU support for ops.gumbel_softmax. -- [BETA] Add GPU and CPU support for ops.hardshrink. -- [BETA] Add CPU support for ops.index_add. -- [BETA] Add CPU support for ops.inplace_add. -- [BETA] Add CPU support for ops.inplace_sub. -- [STABLE] Add CPU support for ops.intopk. -- [STABLE] Add GPU and CPU support for ops.inv. -- [STABLE] Add GPU and CPU support for ops.invert. -- [BETA] Add CPU support for ops.isclose. -- [STABLE] Add CPU support for ops.lerp. -- [BETA] Add CPU support for ops.linspace. -- [BETA] Add functional interface for ops.log_softmax. -- [BETA] Add Ascend, GPU, and CPU support for ops.norm. -- [BETA] Add CPU support for ops.lrn. -- [BETA] Add GPU support for ops.masked_select. -- [BETA] Add GPU and CPU support for ops.matrix_band_part. -- [BETA] Add GPU and CPU support for ops.matrix_solve. -- [BETA] Add CPU support for ops.meshgrid. -- [STABLE] Add CPU support for ops.mish. -- [BETA] Add GPU support for ops.nonzero. -- [STABLE] Add GPU and CPU support for ops.padding. -- [BETA] Add Ascend dynamic shape support for ops.pow. -- [BETA] Add functional interface for ops.range. -- [BETA] Add Ascend dynamic shape support for ops.round. -- [STABLE] Add Ascend dynamic shape support for ops.scatter_add. -- [STABLE] Add Ascend dynamic shape support for ops.scatter_div. -- [BETA] Add GPU support for ops.scatter_max. 
-- [BETA] Add GPU support for ops.scatter_min. -- [BETA] Add CPU support for ops.scatter_nd_add. -- [STABLE] Add GPU and CPU support for ops.scatter_nd_div. -- [STABLE] Add GPU and CPU support for ops.scatter_nd_min. -- [STABLE] Add GPU and CPU support for ops.scatter_nd_mul. -- [BETA] Add CPU support for ops.scatter_nd_sub. -- [STABLE] Add Ascend dynamic shape support for ops.scatter_update. -- [BETA] Add Ascend dynamic shape support for ops.select. -- [BETA] Add GPU and CPU support for ops.selu. -- [BETA] Add GPU and CPU support for ops.soft_shrink. -- [BETA] Add CPU support for ops.softsign. -- [STABLE] Add GPU support for ops.tan. -- [BETA] Add Ascend and CPU support for ops.tensor_scatter_add. -- [STABLE] Add GPU and CPU support for ops.tensor_scatter_div. -- [STABLE] Add GPU and CPU support for ops.tensor_scatter_mul. -- [BETA] Add Ascend and CPU support for ops.tensor_scatter_sub. -- [STABLE] Add Ascend, GPU, and CPU support for nn.AdaptiveAvgPool1d. -- [STABLE] Add Ascend, GPU, and CPU support for nn.AdaptiveMaxPool1d. -- [BETA] Add Ascend, GPU, and CPU support for nn.BiDense. -- [STABLE] Add Ascend, GPU, and CPU support for nn.ConstantPad1d. -- [STABLE] Add Ascend, GPU, and CPU support for nn.ConstantPad2d. -- [STABLE] Add Ascend, GPU, and CPU support for nn.ConstantPad3d. -- [STABLE] Add Ascend, GPU, and CPU support for nn.Hardtanh. -- [STABLE] Add Ascend, GPU, and CPU support for nn.HuberLoss. -- [STABLE] Add Ascend, GPU, and CPU support for nn.RReLU. -- [STABLE] Add Ascend, GPU, and CPU support for nn.Tanhshrink. -- [STABLE] Add Ascend, GPU, and CPU support for nn.Threshold. -- [STABLE] Add Ascend, GPU, and CPU support for nn.ZeroPad2d. -- [BETA] Add GPU support for ops.unique_consecutive. -- [STABLE] Add CPU support for ops.unsorted_segment_max. -- [STABLE] Add CPU support for ops.unsorted_segment_min. -- [STABLE] Add GPU support for ops.unsorted_segment_prod. 
-
-#### Backwards Incompatible Change
-
-##### Python API
-
-- The DVPP simulation algorithm is no longer supported. The `mindspore.dataset.vision.c_transforms.SoftDvppDecodeRandomCropResizeJpeg` and `mindspore.dataset.vision.c_transforms.SoftDvppDecodeResizeJpeg` interfaces are removed.
-- Add the `on_train_epoch_end` method to LossMonitor, which prints metric information at the epoch level when used with `mindspore.train.Model.fit`.
-- The content printed by TimeMonitor changes: "train" or "eval" is added to the output to distinguish the training and inference phases.
-- `filter_prefix` of the `mindspore.load_checkpoint` interface: an empty string ("") is no longer supported, and the matching rule changes from strong matching to fuzzy matching.
-
-#### Import Optimization
-
-APIs in `mindspore.context`, `mindspore.parallel`, `mindspore.profiler` and `mindspore.train` can be used directly from `mindspore`. The original usage is still supported.
-
-For example:
-
-- `mindspore.context.set_context` can be simplified to `mindspore.set_context`.
-- `mindspore.parallel.set_algo_parameters` can be simplified to `mindspore.set_algo_parameters`.
-- `mindspore.profiler.Profiler` can be simplified to `mindspore.Profiler`.
-- `mindspore.train.callback.Callback` can be simplified to `mindspore.train.Callback`.
-
-The API pages are aggregated to .
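The import shortening described above can be pictured as a package re-exporting names from its submodules. The following is a minimal sketch under stated assumptions: `pkg`, `pkg.context`, and the stand-in `set_context` are hypothetical objects built with the standard library, not MindSpore code, and the sketch does not claim to show how MindSpore implements the change.

```python
import types

# Hypothetical stand-in for a package that has a `context` submodule.
pkg = types.ModuleType("pkg")
pkg.context = types.ModuleType("pkg.context")


def set_context(**kwargs):
    """Stand-in configuration function; returns the settings it received."""
    return dict(kwargs)


# The function lives in the submodule ...
pkg.context.set_context = set_context
# ... and is re-exported at the top level, so the short path also works.
pkg.set_context = pkg.context.set_context

# Both spellings resolve to the same function object.
same_object = pkg.set_context is pkg.context.set_context
settings = pkg.set_context(mode=0)
```

With a re-export of this kind, existing code that uses the long path keeps working, which matches the statement that the original usage is still supported.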
- -### Contributors - -Thanks goes to these wonderful people: - -AGroupofProbiotocs, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, dong-li001, fary86, fuzhiye, Gaoxiong, GAO_HYP_XYJ, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, heleiwang, hesham, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Jiabin Liu, jianghui58, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, liuyongqi, laiyongqiang, leonwanghui, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, Lin Xh, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, Lixia, lixian, liyanliu, liyong, lizhenyu, luopengting, lvchangquan, lvliang, lz, maning202007, Margaret_wangrui, mengyuanli, Ming_blue, ms_yan, ougongchang, panfengfeng, panyifeng, Payne, Peilin, peixu_ren, Pengyongrong, qianlong, qianjiahong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, , Wan, wandongdong, wangdongxu, wangmin, wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wudenggang, wukesong, wuweikang, wuxuejian, Xiao Tianci, Xiaoda, xiefangqi, xinyunfan, xuanyue, xuyongfei, yanghaitao, yanghaitao1, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, yuximiao, zengzitao, Zhang, zhanghuiyao, zhanghui_china, zhangxinfeng3, zhangyihui, zhangz0911gm, zhanke, zhanyuan, zhaodezan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, zhiqwang, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, 
Zichun, Ziyan, zjun, ZPaC, wangfengwfwf, zymaa, gerayking, shu-kun-zhang.
-
-Contributions of any kind are welcome!
-
-## MindSpore Lite 1.8.0 Release Notes
-
-### Major Features and Improvements
-
-#### API
-
-- [STABLE] Add C++ and Python APIs for model conversion.
-- [STABLE] Add Python APIs for model inference.
-
-#### Post-Training Quantization
-
-- [STABLE] Support per-layer quantization, with built-in CLE to improve per-layer quantization accuracy.
-
-## MindSpore 1.7.0 Release Notes
-
-### Major Features and Improvements
-
-#### OS
-
-- [STABLE] Support Python 3.8 (Linux/Windows/Mac).
-- [STABLE] Improve installation with a more detailed install guide and automated shell scripts.
-- [STABLE] Support multi-threaded operator computing on Windows.
-- [STABLE] Compatible with GCC from version 7.3 to 9.x.
-
-#### FrontEnd
-
-- [STABLE] Support dynamic weight decay for optimizers, that is, the weight decay value changes as the training step increases.
-- [STABLE] Add four methods to create Tensor, which are `mindspore.numpy.rand()`, `mindspore.numpy.randn()`, `mindspore.numpy.randint()`, and `mindspore.ops.arange()`.
-- [STABLE] Add `mindspore.train.callback.History` in Callback.
-- [BETA] Support custom operators implemented in Julia.
-- [STABLE] Support accessing attributes and methods of user-defined classes through the `mindspore.ms_class` class decorator.
-- [STABLE] Support training when a network has side effect operations and control flow statements at the same time.
-- [STABLE] Support more complex control flow syntax, such as a for loop statement in the body of a while loop.
-- [STABLE] Improve the performance of networks with complex control flow statements by reducing the number of subgraphs.
-
-#### PyNative
-
-- [STABLE] Add hook functions in PyNative mode: the forward hook interfaces register_forward_pre_hook and register_forward_hook, and the backward hook interface register_backward_hook.
-- [STABLE] Optimize the execution performance of PyNative mode by executing the front-end Python and the back-end C++ in parallel.
-
-#### Auto Parallel
-
-- [STABLE] Support TopK routing, data parallel and optimizer state parallel when MoE is enabled.
-- [STABLE] Support AllGather/ReduceScatter communication operator fusion. Support AllReduce fusion by data volume size in DATA_PARALLEL mode.
-- [STABLE] Support ops.clip_by_global_norm in the parallel mode.
-- [STABLE] Support AdaSum optimizer in the parallel mode.
-- [STABLE] Support automatic optimizer state parallel.
-- [STABLE] Support configurable AlltoAll. Support automatically adding the virtualdataset cell.
-- [STABLE] Support automatically inferring trainable parameters in pipeline parallel training.
-- [STABLE] Support clusters where the device number is not a power of 2.
-- [STABLE] Support sharding propagation in auto-parallel mode.
-- [STABLE] Support optimizer offload under the unified runtime.
-- [STABLE] Support Adafactor operator on CPU.
-- [STABLE] Support sharding at the H/W axis for the Conv2d/Conv2DTranspose operators. Support operators such as ResizeBilinear, ROIAlign, CropAndResize, BoundingBoxEncode, IOU and RandomChoiceWithMask.
-
-#### Executor
-
-- [BETA] [Failure Recovery Under Data Parallel Training](https://www.mindspore.cn/tutorials/experts/en/r1.7/parallel/train_gpu.html) Support auto failure recovery under data parallel training mode.
-- [BETA] Support searching for the optimal number of CPU threads for execution. The search takes 50 steps, after which the overall performance reaches a stable state. When testing performance, use the data collected after 50 steps as the baseline.
-
-#### DataSet
-
-- [STABLE] Add dataset operations mapping between TensorFlow.data module and MindSpore.dataset module, [check list](https://www.mindspore.cn/docs/en/r1.7/note/api_mapping/tensorflow_api_mapping.html#tf-data).
-- [STABLE] Optimize Python multiprocessing so that processes exit normally.
-- [STABLE] Support [Dataset Autotune](https://www.mindspore.cn/tutorials/experts/en/r1.7/debug/dataset_autotune.html) for tuning the speed of the dataset pipeline automatically.
-- [BETA] [Dataset Offload](https://www.mindspore.cn/docs/en/r1.7/design/dataset_offload.html) supports new data augmentation operations: RandomColorAdjust, RandomSharpness, TypeCast.
-- Output a single data column when the `__getitem__/__next__` methods of GeneratorDataset return a single NumPy object.
-- If specifying too many processes or threads for dataset loading causes `RuntimeError: can't start new thread`, use `ulimit -u 10240` to increase the number of threads/processes available to the current user.
-
-### API Change
-
-#### Backwards Incompatible Change
-
-##### Python API
-
-- Change the type of the gradient returned by hooks registered with register_backward_hook: the gradient is now always returned as a tuple. ([!31876](https://gitee.com/mindspore/mindspore/pulls/31876))
-- Deprecated usage: `import mindspore.dataset.engine.datasets as ds`. Use `import mindspore.dataset as ds` instead as recommended in [mindspore doc](https://www.mindspore.cn/docs/en/r1.7/api_python/mindspore.dataset.html).
-- Add the `mindspore.ms_class` interface as a class decorator for user-defined classes. It allows MindSpore to identify user-defined classes and access their attributes and methods. ([!30855](https://gitee.com/mindspore/mindspore/pulls/30855))
-- Deprecate `mindspore.SparseTensor` and use `mindspore.COOTensor` instead. ([!28505](https://gitee.com/mindspore/mindspore/pulls/28505))
-- Add Tensor init arg `internal` for internal use.
- -### Contributors - -Thanks goes to these wonderful people: - -AGroupofProbiotocs, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, dong-li001, fary86, fuzhiye, Gaoxiong, GAO_HYP_XYJ, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, heleiwang, hesham, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Jiabin Liu, jianghui58, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, liuyongqi, laiyongqiang, leonwanghui, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, Lin Xh, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, Lixia, lixian, liyanliu, liyong, lizhenyu, luopengting, lvchangquan, lvliang, lz, maning202007, Margaret_wangrui, mengyuanli, Ming_blue, ms_yan, ougongchang, panfengfeng, panyifeng, Payne, Peilin, peixu_ren, Pengyongrong, qianlong, qianjiahong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, , Wan, wandongdong, wangdongxu, wangmin, wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wudenggang, wukesong, wuweikang, wuxuejian, Xiao Tianci, Xiaoda, xiefangqi, xinyunfan, xuanyue, xuyongfei, yanghaitao, yanghaitao1, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, yuximiao, zengzitao, Zhang, zhanghuiyao, zhanghui_china, zhangxinfeng3, zhangyihui, zhangz0911gm, zhanke, zhanyuan, zhaodezan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, zhiqwang, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, 
Zichun, Ziyan, zjun, ZPaC, wangfengwfwf, zymaa, gerayking.
-
-Contributions of any kind are welcome!
-
-## MindSpore Lite 1.7.0 Release Notes
-
-### Major Features and Improvements
-
-#### Post quantization
-
-- [STABLE] Support post quantization to run dynamic quantization algorithm.
-- [BETA] Support post quantized model to run on NVIDIA GPU.
-
-# MindSpore 1.6.0
-
-## MindSpore 1.6.0 Release Notes
-
-### Major Features and Improvements
-
-#### OS
-
-- [STABLE] Support macOS with CPU(X86)
-- [BETA] Support macOS with CPU(M1)
-
-#### FrontEnd
-
-- [STABLE] Support JIT Fallback feature in Graph mode.
-- [STABLE] Support compile cache feature in Graph mode.
-- [STABLE] Add new optimizers, including ASGD and Rprop.
-- [STABLE] Add new initializers, including Identity, Orthogonal, Dirac, Sparse and VarianceScaling.
-- [STABLE] Support resuming training when an exception occurs in the process.
-- [STABLE] Change `mindspore.nn.LSTMCell` from single-layer LSTM to single-cell LSTM.
-- [BETA] Introduce `mindspore.ops.Custom` to customize your own operators for Ascend(AICore, AICPU), GPU, CPU backends. The custom type can be one of TBE, AKG, pure Python function or prebuilt binary (called an AOT operator).
-
-#### PyNative
-
-- [STABLE] Support heterogeneous feature in PyNative mode.
-- [STABLE] Optimize memory allocation in PyNative mode.
-
-#### Auto Parallel
-
-- [STABLE] Support configuring the output shard strategy of the MatMul distributed operator.
-- [STABLE] Support multi-instances parallel.
-- [STABLE] Support activation slice communication and calculation overlap in Transformer.
-- [STABLE] Support heterogeneous parallel tensor swap.
-- [STABLE] Add implementations of distributed operator of ResizeNearestNeighbor.
-- [STABLE] Add a communication operator named NeighborExchangeV2 that supports data exchange between adjacent 8 rank ids.
-- [STABLE] Pipeline parallel supports the GPU platform.
-- [STABLE] Add cell-level data parallel interface.
-- [STABLE] Support gradient AllReduce fusion according to the amount of data. -- [STABLE] Support a sharding strategy search algorithm called sharding propagation. - -#### Executor - -- [STABLE] Support multigraph sink and subgraph sink of MindRT. -- [STABLE] Support memory swap to break the device memory size limit on Ascend platform. -- [STABLE] Support dynamic deployment of distributed training cluster(GPU). -- [BETA] Support automatic failover of parameter server. - -#### DataSet - -- [STABLE] Support overwrite feature in MindRecord. -- [STABLE] Log improvement and more friendly to users. -- [BETA] Support new feature [Dataset Offload](https://www.mindspore.cn/docs/programming_guide/en/r1.6/enable_dataset_offload.html) to speed up data processing by heterogeneous computing. -- [BETA] Support new feature [Dataset Autotune](https://www.mindspore.cn/docs/programming_guide/en/r1.6/enable_auto_tune.html) to adjust parallelism of dataset pipeline automatically. - -#### GraphKernel Fusion - -- [STABLE] Support kernel fusion and generation for CPU backend. - -#### Federated Learning - -- [STABLE] FL-Client framework and model decoupling. -- [BETA] Support Cross-silo federated learning framework. - -#### Debug - -- [STABLE] Support dump in cell level(Ascend). -- [STABLE] Support dump Tensor statistics(Ascend/GPU). -- [STABLE] Support displaying corresponding code lines for fusion nodes. -- [STABLE] Support passing dump flag in Ascend backend in order to dump correct operators after fusion transformation. - -### API Change - -#### Backwards Incompatible Change - -##### Python API - -###### `mindspore.dataset.MindDataset` interface changes input parameter dataset_file([!27542](https://gitee.com/mindspore/mindspore/pulls/27542)) - -`MindDataset` contains the input parameter `dataset_file`, which is in the singular format. It can receive a single file path or a list that stores multiple file paths. 
Thus, the plural form is more appropriate for this parameter. In addition, the input parameters of most dataset APIs, such as `TFRecordDataset`, use the plural form (`dataset_files`). To ensure consistency, the input parameter `dataset_file` of MindDataset is renamed to the plural form `dataset_files`. See the updated API of [mindspore.dataset.MindDataset](https://www.mindspore.cn/docs/en/master/api_python/dataset/mindspore.dataset.MindDataset.html#mindspore.dataset.MindDataset).
-
-###### Delete `mindspore.Tensor`'s property `virtual_flag`([!26989](https://gitee.com/mindspore/mindspore/pulls/26989))
-
-###### Delete `mindspore.Parameter`'s property `is_init`([!26989](https://gitee.com/mindspore/mindspore/pulls/26989))
-
-###### Delete `mindspore.nn.ROC`'s interface `roc`([!25713](https://gitee.com/mindspore/mindspore/pulls/25713))
-
-###### The `shard()` interface of primitive is changed from `shard(strategy)` to `shard(in_strategy=None, out_strategy=None)`
-
-###### The `set_auto_parallel_context()` interface of context is changed from `set_auto_parallel_context(parallel_mode=AUTO_PARALLEL, auto_parallel_search_mode="dynamic_programming")` to `set_auto_parallel_context(parallel_mode=AUTO_PARALLEL, search_mode="dynamic_programming")`
-
-#### Collect Data and Create Landscape
-
-##### Python API
-
-###### `mindspore.train.callback.SummaryCollector` interface's parameter `collect_specified_data` adds a new option `collect_landscape` ([!26229](https://gitee.com/mindspore/mindspore/pulls/26229))
-
-`collect_landscape` can collect the parameters needed to create the loss landscape. See the updated API of [mindspore.train.callback.SummaryCollector](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.SummaryCollector.html#mindspore.SummaryCollector).
- -###### `mindspore.train.callback` add new interface `SummaryLandscape` ([!26229](https://gitee.com/mindspore/mindspore/pulls/26229)) - -`SummaryLandscape` can help you to collect loss landscape information. It can create landscape in PCA direction or random direction by calculating loss. We can see the updated version in api of [mindspore.train.callback.SummaryLandscape](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.SummaryLandscape.html#mindspore.SummaryLandscape). - -### Bug fixes - -#### Executor - -- Fix process hanging while calling MPI_comm_create in asymmetric pipeline split scenario. ([!28707](https://gitee.com/mindspore/mindspore/pulls/28707)) -- Fix the execution error when the weights are shared between graph mode and PyNative mode.([!26635](https://gitee.com/mindspore/mindspore/pulls/26635)) -- Fixed the probability coredump when free memory under PyNative mode.([!25472](https://gitee.com/mindspore/mindspore/pulls/25472)) - -#### Dataset - -- Fix memory increase abnormally when running dataset for a long time. ([!26237](https://gitee.com/mindspore/mindspore/pulls/26237)) -- Fix saving MindRecord files with Chinese path on Windows. ([!28378](https://gitee.com/mindspore/mindspore/pulls/28378)) - -## MindSpore Lite - -### Major Features and Improvements - -#### Converter and runtime - -- [STABLE] Add more fusion patterns in the converter tool to improve runtime performance. -- [STABLE] Support take OpenGL texture as input and output of inference. -- [STABLE] Refactor the JAVA API. -- [BETA] Support inference on Ascend310. - -#### x86 backend optimization - -- [STABLE] Optimize kernels for x86 using Advanced Vector Extensions(AVX512). - -#### ARM backend optimization - -- [STABLE] Support heterogeneous parallel inference, including splitting operators, constructing heterogeneous subgraphs, and heterogeneous parallel scheduling between CPUs and GPUs. -- [STABLE] Add more FP16 operators. 
- -#### Post quantization - -- [STABLE] Post quantization supports debugging. -- [STABLE] Full quantization supports choosing non-quantized nodes. -- [STABLE] Mixed bit quantization supports auto-tune. - -#### Training on Device - -- [STABLE] Support user-defined algorithm models to access the federated learning framework. - -### Contributors - -Thanks goes to these wonderful people: - -AGroupofProbiotocs, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, dong-li001, fary86, fuzhiye, Gaoxiong, GAO_HYP_XYJ, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, heleiwang, hesham, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Jiabin Liu, jianghui58, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, liuyongqi, laiyongqiang, leonwanghui, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, Lin Xh, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, Lixia, lixian, liyanliu, liyong, lizhenyu, luopengting, lvchangquan, lvliang, lz, maning202007, Margaret_wangrui, mengyuanli, Ming_blue, ms_yan, ougongchang, panfengfeng, panyifeng, Payne, Peilin, peixu_ren, Pengyongrong, qianlong, qianjiahong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, , Wan, wandongdong, wangdongxu, wangmin, [wangnan39@huawei.com](mailto:wangnan39@huawei.com), wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wudenggang, wukesong, wuweikang, wuxuejian, Xiao Tianci, Xiaoda, xiefangqi, xinyunfan, xuanyue, xuyongfei, yanghaitao, 
yanghaitao1, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, yuximiao, zengzitao, Zhang, [zhanghaibo5@huawei.com](mailto:zhanghaibo5@huawei.com), zhanghuiyao, zhanghui_china, zhangxinfeng3, zhangyihui, zhangz0911gm, zhanke, zhanyuan, zhaodezan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, zhiqwang, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, Zichun, Ziyan, zjun, ZPaC, wangfengwfwf, zymaa, gerayking. - -Contributions of any kind are welcome! - -# MindSpore 1.5.2 - -## MindSpore 1.5.2 Release Notes - -### Bug fixes - -- Fix code specification, pclint, codedex alarm. -- Repair NN Abnormal output of graphnorm operator. -- Fixed the problem of poor performance in scenes with dynamic rnngrad batch size of 16 times. - -### Contributors - -Thanks goes to these wonderful people: - -Adel, AGroupofProbiotocs, anthonyaje, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, dong-li001, eric, Eric, fary86, fuzhiye, Gaoxiong, GAO_HYP_XYJ, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, heleiwang, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Islam Amin, Jesse, , Jiabin Liu, jianghui58, jiangzhiwen, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, Jonathan, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, laiyongqiang, leonwanghui, Li, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, Lin Xh, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, Lixia, lixian, liyanliu, liyong, lizhenyu, luopengting, luoyang, lvchangquan, lvliang, lz, mahdi, Mahdi, 
maning202007, Margaret_wangrui, mayang, mengyuanli, Ming_blue, nhussain, ougongchang, panfengfeng, panyifeng, Payne, Peilin, peixu_ren, Pengyongrong, qianlong, qianjiahong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, , Wan, wandongdong, wangdongxu, wangmin, wangnan39@huawei.com, wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wudenggang, wukesong, wuweikang, wuxuejian, Xiao Tianci, Xiaoda, xiefangqi, xinyunfan, xuanyue, xulei2020, Xun, xuyongfei, yanghaitao, yanghaitao1, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, yuximiao, zengzitao, Zhang, zhanghaibo5@huawei.com, zhanghuiyao, zhanghui_china, zhangxinfeng3, zhangyihui, zhangz0911gm, zhanke, zhanyuan, zhaodezan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, Zhenglong Li, zhiqwang, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, Zichun, Zirui, Ziyan, zjun, ZPaC, wangfengwfwf, zymaa, gerayking. - -Contributions of any kind are welcome! - -# MindSpore 1.5.1 - -## MindSpore 1.5.1 Release Notes - -### Bug fixes - -- Fix code specification, pclint, codedex alarm. -- Fix yolov4 network probabilistic segment error. 
- -### Contributors - -Thanks goes to these wonderful people: - -Adel, AGroupofProbiotocs, anthonyaje, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, dong-li001, eric, Eric, fary86, fuzhiye, Gaoxiong, GAO_HYP_XYJ, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, heleiwang, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Islam Amin, Jesse, , Jiabin Liu, jianghui58, jiangzhiwen, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, Jonathan, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, laiyongqiang, leonwanghui, Li, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, Lin Xh, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, Lixia, lixian, liyanliu, liyong, lizhenyu, luopengting, luoyang, lvchangquan, lvliang, lz, mahdi, Mahdi, maning202007, Margaret_wangrui, mayang, mengyuanli, Ming_blue, nhussain, ougongchang, panfengfeng, panyifeng, Payne, Peilin, peixu_ren, Pengyongrong, qianlong, qianjiahong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, , Wan, wandongdong, wangdongxu, wangmin, wangnan39@huawei.com, wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wudenggang, wukesong, wuweikang, wuxuejian, Xiao Tianci, Xiaoda, xiefangqi, xinyunfan, xuanyue, xulei2020, Xun, xuyongfei, yanghaitao, yanghaitao1, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, yuximiao, zengzitao, Zhang, zhanghaibo5@huawei.com, zhanghuiyao, zhanghui_china, zhangxinfeng3, zhangyihui, 
zhangz0911gm, zhanke, zhanyuan, zhaodezan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, Zhenglong Li, zhiqwang, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, Zichun, Zirui, Ziyan, zjun, ZPaC, wangfengwfwf, zymaa, gerayking.
-
-Contributions of any kind are welcome!
-
-# MindSpore 1.5.0
-
-## MindSpore 1.5.0 Release Notes
-
-### Major Features and Improvements
-
-#### NewModels
-
-- [STABLE] Add CV model on Ascend: Fast-SCNN
-- [BETA] Add CV models on Ascend: midas_V2, attgan, FairMOT, CenterNet_resnet101, SEResNext, YOLOV3-tiny, RetinaFace
-- [STABLE] Add CV models on GPU: ssd_mobilenetv1_fpn, shufflenetv1, tinyDarkNet, CNN-CTC, unet++, DeepText, SqueezeNet
-- [STABLE] Add NLP models on GPU: GRU, GNMT2, Bert-Squad
-- [STABLE] Add recommend models on GPU: NCF
-- [BETA] Add CV models on GPU: FaceAttribute, FaceDetection, FaceRecognition, SENet
-- [BETA] Add Audio models on GPU: DeepSpeech2
-- [STABLE] `model_zoo` has been separated into an individual repository `models`
-
-#### FrontEnd
-
-- [STABLE] Support `while`, `break` and `continue` statements of training network in `GRAPH_MODE`.
-- [BETA] Support exporting a MindIR file after model training on the cloud side and evaluating on the edge side by importing the MindIR file.
-- [STABLE] Support forward mode auto-diff interface Jvp(Jacobian-Vector-Product).
-- [STABLE] Support backward mode auto-diff interface Vjp(Vector-Jacobian-Product).
-
-#### Auto Parallel
-
-- [STABLE] Support distributed pipeline inference.
-- [STABLE] Add implementation of the sparse attention and its distributed operator.
-- [STABLE] Add implementations of distributed operator of Conv2d/Conv2dTranspose/Conv2dBackpropInput/Maxpool/Avgpool/Batchnorm/Gatherd.
-- [STABLE] Support configuring the dataset strategy on distributed training and inference mode.
-- [STABLE] Add high level API of the Transformer module.
-
-#### Executor
-
-- [STABLE] Support AlltoAll operator.
-- [STABLE] CPU operator (Adam) performance optimization increased by 50%.
-- [BETA] Support Adam offload feature, reducing the static memory usage of the Pangu large model by 50%.
-- [STABLE] MindSpore Ascend backend supports configuring the operator generation and loading cache path.
-- [STABLE] MindSpore Ascend backend supports lazy build in PyNative mode, and compilation performance is improved by 10 times.
-- [STABLE] The function or Cell decorated by ms_function supports gradient calculation in PyNative mode.
-- [STABLE] The outermost network supports parameters of non-tensor type in PyNative mode.
-
-#### DataSet
-
-- [BETA] Add a new method for class Model to support auto data preprocessing in scenario of Ascend 310 inference.
-- [STABLE] Add a new drawing tool to visualize detection/segmentation datasets.
-- [STABLE] Support a new tensor operation named ConvertColor to support color space transform of images.
-- [STABLE] Enhance the following tensor operations to handle multiple columns simultaneously: RandomCrop, RandomHorizontalFlip, RandomResize, RandomResizedCrop, RandomVerticalFlip.
-- [STABLE] Support electromagnetic simulation dataset loading and data augmentation.
-- [STABLE] Optimize the error logs of Dataset to make them more friendly to users.
-
-#### Federated Learning
-
-- [STABLE] Change the deployment environment of FL-Client.
-
-#### Running Data Recorder
-
-- [STABLE] RDR saves collected data files within directories named by Rank ID on distributed training on Ascend, GPU and CPU.
-
-#### GraphKernel Fusion
-
-### API Change
-
-#### Backwards Incompatible Change
-
-##### Python API
-
-###### New Recomputation Configuration for AutoParallel and SemiAutoParallel Scenarios
-
-Recomputation of the communication operations generated by model parallel and optimizer parallel can be configured to save memory on the
-devices. Users can pass `mp_comm_recompute` and `parallel_optimizer_comm_recompute` to enable the recomputation of the communication operations.
-
-### Bug fixes
-
-#### FrontEnd
-
-- Fix bug of too many subgraphs when the network includes a `for` statement. ([!23669](https://gitee.com/mindspore/mindspore/pulls/23669))
-
-#### Executor
-
-- Fix RunTask failure when parameter_broadcast is enabled in PyNative mode. ([!23255](https://gitee.com/mindspore/mindspore/pulls/23255))
-- Fix an illegal memory access in the dynamic shape net on GPU.
-- Fix tune failure for DynamicRnn. ([!21081](https://gitee.com/mindspore/mindspore/pulls/21081))
-
-#### Dataset
-
-- Optimize thread monitoring to solve the problem of running multiple multiprocessing on Windows. ([!23232](https://gitee.com/mindspore/mindspore/pulls/23232))
-- Fix bugs of Dataset tensor operations in lite mode. ([!21999](https://gitee.com/mindspore/mindspore/pulls/21999))
-- Fix memory increasing when using create_dict_iterator in a for loop. ([!22529](https://gitee.com/mindspore/mindspore/pulls/22529))
-
-## MindSpore Lite
-
-### Major Features and Improvements
-
-#### Converter and runtime
-
-1. Optimize TDNN-like streaming models by reusing the result of the last inference.
-2. Support dynamic filter Convolution.
-3. Support serializing float32 weights into float16 weights to reduce the size of the model file.
-4. Provide a unified runtime API so that developers can reuse their code between the cloud side and the end side.
-5. Developers can now configure built-in passes as custom passes.
-6. Users can now specify the format and shape of model inputs while converting a model.
-7. Support multi-device inference, including CPU, NPU and GPU. Users can set devices in mindspore::Context.
-8. Support mixed precision inference. Users can set the inference precision with the LoadConfig API.
-9. Support custom operator registration and enable inference on third-party hardware.
-
-#### ARM backend optimization
-
-1. Support the nchw data format of some Operators, such as Conv, InstanceNorm, etc.
The performance of some models converted from ONNX and Caffe is greatly improved.
-2. Fix bugs of memory leak on NPU.
-
-#### Post quantization
-
-1. Weight quantization supports mixed bit quantization.
-2. Full quantization supports data pre-processing.
-3. Move the quantization parameters from the command line to the configuration file.
-
-#### Training on Device
-
-1. Unify lite external api with MindSpore.
-2. Implement a static memory allocator and a common workspace for TOD, saving 10-20% of memory.
-3. Provide getgradients and setgradients interfaces, and interfaces to get and set optimizer params, to support MOE Model.
-4. Support user specified output node when export IOD Model.
-5. Support more text networks (tinybert, albert) and operators.
-
-#### Codegen
-
-1. Support kernel register for custom op. Third-party hardware like NNIE can be accessed through it.
-
-### API Change
-
-#### API Incompatible Change
-
-##### C++ API
-
-### Contributors
-
-Thanks goes to these wonderful people:
-
-Adel, AGroupofProbiotocs, anthonyaje, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, dong-li001, eric, Eric, fary86, fuzhiye, Gaoxiong, GAO_HYP_XYJ, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, heleiwang, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Islam Amin, Jesse, , Jiabin Liu, jianghui58, jiangzhiwen, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, Jonathan, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, laiyongqiang, leonwanghui, Li, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, Lin Xh, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655,
liuzhongkai, Lixia, lixian, liyanliu, liyong, lizhenyu, luopengting, luoyang, lvchangquan, lvliang, lz, mahdi, Mahdi, maning202007, Margaret_wangrui, mayang, mengyuanli, Ming_blue, nhussain, ougongchang, panfengfeng, panyifeng, Payne, Peilin, peixu_ren, Pengyongrong, qianlong, qianjiahong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, , Wan, wandongdong, wangdongxu, wangmin, wangnan39@huawei.com, wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wudenggang, wukesong, wuweikang, wuxuejian, Xiao Tianci, Xiaoda, xiefangqi, xinyunfan, xuanyue, xulei2020, Xun, xuyongfei, yanghaitao, yanghaitao1, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, yuximiao, zengzitao, Zhang, zhanghaibo5@huawei.com, zhanghuiyao, zhanghui_china, zhangxinfeng3, zhangyihui, zhangz0911gm, zhanke, zhanyuan, zhaodezan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, Zhenglong Li, zhiqwang, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, Zichun, Zirui, Ziyan, zjun, ZPaC, wangfengwfwf, zymaa, gerayking. - -Contributions of any kind are welcome! 
- -# MindSpore 1.4.0 - -## MindSpore 1.4.0 Release Notes - -### Major Features and Improvements - -#### NewModels - -#### FrontEnd - -#### Auto Parallel - -- Add distributed operators: Conv2D/Conv2DTranspose/Conv2DBackpropInput/MaxPool/AvgPool/BatchNorm/GatherD -- Support to configure shard strategy for dataset - -#### Executor - -#### DataSet - -- Add SlicePatchesOperation for Remote Sensing feature([!18179](https://e.gitee.com/mind_spore/repos/mindspore/mindspore/pulls/18179)) - -#### FederatedLearning - -#### Running Data Recorder - -#### GraphKernel Fusion - -#### Profiler - -- [STABLE] Support MS_DIAGNOSTIC_DATA_PATH for profiler feature.(Ascend/GPU) - -#### Dump - -- [STABLE] Support MS_DIAGNOSTIC_DATA_PATH for dump feature.(Ascend/GPU/CPU) - -### API Change - -#### Backwards Incompatible Change - -##### Python API - -##### Command Line Interface - -###### Dump Config - -Previously, we need to set the dump path in dump config file. To make the dump feature easier to use on cloud, we support new environment parameter `MS_DIAGNOSTIC_DATA_PATH`. - -| 1.3.0 | 1.4.0 | -| ------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------- | -| `path` is a mandatory field. | `path` field is optional. If `path` field is not provided or is empty string, `MS_DIAGNOSTIC_DATA_PATH` should be set in environment. | - -### Bug fixes - -#### FrontEnd - -#### Executor - -#### Dataset - -- Fix module 'signal' has no attribute 'SIGCHLD' problem under windows platform. 
([!21232](https://gitee.com/mindspore/mindspore/pulls/21232)) - -## MindSpore Lite - -### Major Features and Improvements - -#### Converter and runtime - -#### x86 backend optimization - -#### ARM backend optimization - -#### Cuda backend optimization - -#### OpenCL backend - -#### Post quantization - -#### Training on Device - -#### Codegen - -### API Change - -#### API Incompatible Change - -##### C++ API - -#### New features - -##### Java API - -### Bug fixes - -#### Deprecations - -### Contributors - -Thanks goes to these wonderful people: - -Adel, AGroupofProbiotocs, anthonyaje, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, dong-li001, eric, Eric, fary86, fuzhiye, Gaoxiong, GAO_HYP_XYJ, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, heleiwang, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Islam Amin, Jesse, , Jiabin Liu, jianghui58, jiangzhiwen, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, Jonathan, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, laiyongqiang, leonwanghui, Li, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, Lin Xh, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, Lixia, lixian, liyanliu, liyong, lizhenyu, luopengting, luoyang, lvchangquan, lvliang, lz, mahdi, Mahdi, maning202007, Margaret_wangrui, mayang, mengyuanli, Ming_blue, nhussain, ougongchang, panfengfeng, panyifeng, Payne, Peilin, peixu_ren, Pengyongrong, qianlong, qianjiahong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, , Wan, wandongdong, wangdongxu, wangmin, 
wangnan39@huawei.com, wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wudenggang, wukesong, wuweikang, wuxuejian, Xiao Tianci, Xiaoda, xiefangqi, xinyunfan, xuanyue, xulei2020, Xun, xuyongfei, yanghaitao, yanghaitao1, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, yuximiao, zengzitao, Zhang, zhanghaibo5@huawei.com, zhanghuiyao, zhanghui_china, zhangxinfeng3, zhangyihui, zhangz0911gm, zhanke, zhanyuan, zhaodezan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, Zhenglong Li, zhiqwang, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, Zichun, Zirui, Ziyan, zjun, ZPaC, wangfengwfwf, zymaa, gerayking. - -Contributions of any kind are welcome! - -# MindSpore 1.3.0 - -## MindSpore 1.3.0 Release Notes - -### Major Features and Improvements - -#### NewModels - -- [STABLE] Add CV models on Ascend: CPM, FCN8s, SSD-ResNet50-FPN, EAST, AdvancedEast. -- [STABLE] Add NLP models on Ascend: DGU, TextCNN, SentimentNet(LSTM). -- [STABLE] Add CV models on GPU: Faster-RCNN, FCN8s, CycleGAN, AdvancedEast. -- [BETA] Add CV models on Ascend: CycleGAN, PoseNet, SimCLR. -- [BETA] Add NLP models on Ascend: DGU, EmoTect, Senta, KT-Net. -- [BETA] Add NLP models on GPU: DGU, EmoTect. -- [BETA] Add EPP-MVSNet: a novel deep learning network for 3D reconstruction from multi-view stereo, which has won the first place in Tanks & Temples leaderboard(until April 1, 2021)(GPU). - -#### FrontEnd - -- [STABLE] The default running mode of MindSpore is changed to Graph mode. -- [STABLE] Support interface `run_check` to check whether MindSpore is working properly or not. -- [STABLE] Support saving custom information in the checkpoint file. -- [STABLE] Normal class adds mean parameter. -- [STABLE] Support export YOLOv3-DarkNet53 and YOLOv4 ONNX model. -- [STABLE] Support 40+ operator export ONNX model. 
-- [STABLE] The Metric module supports `set_indexes` to select the inputs of `update` in the specified order. -- [STABLE] Switch `_Loss` to an external API `LossBase` as the base class of losses. - -#### Auto Parallel - -- [STABLE] Add distributed operators: Select/GatherNd/ScatterUpdate/TopK. -- [STABLE] Support basic pipeline parallelism. -- [STABLE] Optimize sharding strategy setting of `Gather`. -- [STABLE] Optimize mix precision and shared parameter scenarios. -- [STABLE] Optimize distributed prediction scenarios. - -#### Executor - -- [STABLE] Support unified runtime in GPU and CPU backend. -- [STABLE] MindSpore GPU support CUDA11 with cuDNN8. -- [STABLE] MindSpore GPU inference performance optimization by integrating TensorRT. -- [STABLE] MindSpore built on one Linux distribution can now be used on multiple Linux distributions with the same CPU architecture (e.g. EulerOS, Ubuntu, CentOS). -- [STABLE] MindSpore Ascend support group convolution. - -#### DataSet - -- [STABLE] Support caching over MindRecord dataset. -- [STABLE] Support new shuffle mode for MindRecord dataset. -- [STABLE] Support a cropper tool for MindSpore Lite to allow the user to customize MindData binary file according to their script. -- [STABLE] Support share memory mechanism to optimize the multi-processing efficiency of GeneratorDataset/Map/Batch. -- [STABLE] Add features for the GNN dataset to support molecular dynamics simulation scenarios. - -#### FederatedLearning - -- [STABLE] Support Cross-device federated learning framework. -- [STABLE] Support FL-Server distributed networking including TCP and HTTP communication. -- [STABLE] Support FL-Server distributed federated aggregation,support autoscaling and fault tolerance. -- [STABLE] Develop FL-Client framework. -- [STABLE] Supports local differential privacy algorithms. -- [STABLE] MPC-based security aggregation algorithm. -- [STABLE] MindSpore Lite Device-side Inference & Training Interconnection with FL-Client. 
- -#### Running Data Recorder - -- [STABLE] Provide records of multi-stage computational graphs, memory allocation information and graph execution order when a "Launch kernel failed" occurs. (CPU) - -#### GraphKernel Fusion - -- [STABLE] Add options to control the optimization level. -- [STABLE] Enhance the generalization ability on GPU. GraphKernel is enabled by default in 40+ networks which cover the fields of NLP, CV, Recommender, NAS and Audio. The results show that their throughput is significantly improved, and we recommend enabling GraphKernel in your network. - -#### Debug - -- [STABLE] Unified dump function. - -### API Change - -#### Backwards Incompatible Change - -##### Python API - -###### `mindspore.dataset.Dataset.device_que` interface removes unused parameter `prefetch_size`([!18973](https://gitee.com/mindspore/mindspore/pulls/18973)) - -Previously, `device_que` had a parameter `prefetch_size` to define the number of records prefetched ahead of the user's request. However, this parameter was never used, so it had no effect. Therefore, we remove this parameter in 1.3.0, and users can set this configuration through [mindspore.dataset.config.set_prefetch_size](https://www.mindspore.cn/docs/api/en/r1.3/api_python/mindspore.dataset.config.html#mindspore.dataset.config.set_prefetch_size). - - - - - - - - - -
1.2.1 1.3.0
- -```python -device_que(prefetch_size=None, send_epoch_end=True, create_data_info_queue=False) -``` - - - -```python -device_que(send_epoch_end=True, create_data_info_queue=False) -``` - -
- -###### `mindspore.nn.optim.thor` interface changes to lowercase `thor` and adds two parameters `enable_clip_grad` and `frequency`([!17212](https://gitee.com/mindspore/mindspore/pulls/17212)) - -The parameter `enable_clip_grad` is used for gradient clipping and another parameter `frequency` is used to control the update interval of second order information matrix. - - - - - - - - - -
1.2.1 1.3.0
- -```python -THOR(net, learning_rate, damping, momentum, weight_decay=0.0, loss_scale=1.0, batch_size=32, - use_nesterov=False, decay_filter=lambda x: x.name not in [], split_indices=None) -``` - - - -```python -thor(net, learning_rate, damping, momentum, weight_decay=0.0, loss_scale=1.0, batch_size=32, - use_nesterov=False, decay_filter=lambda x: x.name not in [], split_indices=None, enable_clip_grad=False, - frequency=100) -``` - -
- -##### Dump Config - -Previously, we could only dump tensor data for one or all steps. To make the dump feature easier to use, we changed the dump configuration format and dump structure. View the [New Dump Tutorial](https://www.mindspore.cn/docs/programming_guide/en/r1.3/dump_in_graph_mode.html). - -| 1.2.1 | 1.3.0 | -| ------------------------------------------------------ | ------------------------------------------------------------------------------------------- | -| `iteration` is an int. | `iteration` is a string. | -| `op_debug_mode` is in `async_dump_settings` field. | `op_debug_mode` is in `common_dump_settings` field. `async_dump_settings` is removed. | - -### Bug fixes - -#### FrontEnd - -- Fix exception when use import module in while body such as 'F.xxx'.([!17635](https://e.gitee.com/mind_spore/repos/mindspore/mindspore/pulls/17635)) -- Fix the exception of 'exceeding limit call depth' in compile graph process when using while expression with grad operation. ([!18662](https://e.gitee.com/mind_spore/repos/mindspore/mindspore/pulls/18662)) - -#### Executor - -- Fix reallocate memory bug for communication op.([!14492](https://gitee.com/mindspore/mindspore/pulls/14492)) -- Replace memcpy_async op with tensor_move op.([!15204](https://gitee.com/mindspore/mindspore/pulls/15204)) -- Fix the build error when multiple python versions are installed in the environment. ([!19165](https://gitee.com/mindspore/mindspore/pulls/19165)) -- The warning when the te/topi/hccl version does not match is optimized, and fix the repeated warning. ([!18704](https://gitee.com/mindspore/mindspore/pulls/18704)) -- Fix the error in a cluster with more than 8 pcs in pynative mode. ([!16376](https://gitee.com/mindspore/mindspore/pulls/16376)) -- Fix graph ring problem in UB fusion.([!16109](https://gitee.com/mindspore/mindspore/pulls/16109)) -- Fix AllGather op select problem when the shape is not divisible by 16. 
([!18878](https://gitee.com/mindspore/mindspore/pulls/18878)) - -#### Dataset - -- Fix an out-of-memory error when ImagefolderDataset gets an illegal directory. ([!16196](https://gitee.com/mindspore/mindspore/pulls/16196)) -- Fix bugs of vision transformations in lite mode. ([!14722](https://gitee.com/mindspore/mindspore/pulls/14722),[!14774](https://gitee.com/mindspore/mindspore/pulls/14774),[!15050](https://gitee.com/mindspore/mindspore/pulls/15050)) -- Fix default numbers of parallel workers of MindData for those CPUs with fewer cores. ([!15921](https://gitee.com/mindspore/mindspore/pulls/15921)) -- Fix MindRecord writing failed probabilistically in multiprocessing. ([!15242](https://gitee.com/mindspore/mindspore/pulls/15242)) - -## MindSpore Lite - -### Major Features and Improvements - -#### Converter and runtime - -1. Support Caffe model running on Hi3516D. -2. Support delegate mechanism to run your models(part or whole) on user specified executor. -3. Support control flow models. -4. Support cross-compiling for iOS, so that we can inference models on iOS devices. - -#### x86 backend optimization - -1. Optimize kernels for x86 using Advanced Vector Extensions(AVX). - -#### ARM backend optimization - -1. Optimize fp16 kernels. -2. Support arm32 fp16 instruction acceleration on ARMv8.2. - -#### Cuda backend optimization - -1. Support NV GPU backend base on delegate mechanism(use TensorRT as delegate). - -#### OpenCL backend - -1. Optimize the strategy of workgroup and blocksize to improve performance. -2. Support OpenCL dynamic infershape. -3. Support INT32 type ops. - -#### Post quantization - -1. Support fp32 training model converts to quantization training model. - -#### Training on Device - -1. Support fp32 training model export to quantization model after training process end. -2. Unify APIs and output package name of training and inference. -3. Simplify implementation of Train Session. -4. 
Optimize train and infer compile, reduce libmindspore-lite-train.so memory. -5. Training memory optimization: memory is reduced by 10-50% compared with r1.2. -6. Training performance optimization: optimize the Conv2DGradInput and SparseSoftmaxCrossEntropyWithLogits operators for the special 1*1 input shape, improving performance by 10%-20%. -7. Support more networks (transformer, albert). - -#### Codegen - -1. Support deployment on HarmonyOS for device. - -### API Change - -#### API Incompatible Change - -##### C++ API - -###### Unify LiteSession and TrainSession, Merge LiteSession And TrainSession.([!17356](https://gitee.com/mindspore/mindspore/pulls/17356)) - -Previously, Training on Device used TrainSession while Inference on Device used LiteSession. To simplify implementation, we move the TrainSession functions to LiteSession as virtual functions, and move the APIs previously defined in train_session.h to lite_session.h. - -```cpp -class MS_API LiteSession { -... -static LiteSession *CreateTrainSession(const std::string &filename, const lite::Context *context, - bool train_mode = false, const lite::TrainCfg *cfg = nullptr); - static LiteSession *CreateTransferSession(const std::string &filename_backbone, const std::string &filename_head, - const lite::Context *context, bool train_mode = false, - const lite::TrainCfg *cfg = nullptr); -virtual int Train() { return mindspore::lite::RET_ERROR; } -virtual int Eval() { return mindspore::lite::RET_OK; } -virtual int SetupVirtualBatch(int virtual_batch_multiplier, float lr = -1.0f, float momentum = -1.0f) { - return mindspore::lite::RET_ERROR; - } -virtual std::vector<tensor::MSTensor *> GetPredictions() const { - std::vector<tensor::MSTensor *> outputs; - return outputs; - } -... -``` - -###### Add Export API for Training on device, obsolete SaveToFile API.([!17356](https://gitee.com/mindspore/mindspore/pulls/17356)) - -Previously, Training on Device used the SaveToFile API to save the training model to file. 
The Export API was added in this release to support more formats and more model types (the train or inference part of the model), and to save a weight-quantized training model. - -```cpp -virtual int Export(const std::string &file_name, lite::ModelType model_type = lite::MT_TRAIN, - lite::QuantizationType quant_type = lite::QT_DEFAULT, lite::FormatType = lite::FT_FLATBUFFERS) { - return mindspore::lite::RET_ERROR; - } -``` - -###### Add GetFeatureMaps and UpdateFeatureMaps interface for Training on device.([!18344](https://gitee.com/mindspore/mindspore/pulls/18344)) - -When training on the device, we may need to get and update the model feature maps, particularly in the MindSpore Federated scenario. - -```cpp -virtual std::vector<tensor::MSTensor *> GetFeatureMaps() const { - std::vector<tensor::MSTensor *> features; - return features; - } - virtual int UpdateFeatureMaps(const std::vector<tensor::MSTensor *> &features) { return mindspore::lite::RET_ERROR; } -``` - -#### New features - -##### Java API - -###### new static method for creating LiteSession by MSConfig in LiteSession.class - -Previously, if we wanted to create a LiteSession object, we needed to call two APIs: - -```java -MSConfig config; -// config options ... -LiteSession liteSession = new LiteSession(); -boolean ret = liteSession.init(config); -if (!ret) { - // handle init LiteSession failed ... -} -``` - -Now we can create a LiteSession object with the new API: - -```java -MSConfig config; -// config options ... -LiteSession liteSession = createSession(config); -if (liteSession == null) { - // handle create LiteSession failed ... -} -``` - -###### new static method for creating LiteSession by ModelBuffer and MSConfig in LiteSession.class - -Previously, to run inference on a model, we needed to call APIs like: - -```java -MSConfig config; -// config options ... -LiteSession liteSession = new LiteSession(); -boolean initSessionRet = liteSession.init(config); -if (!initSessionRet) { - // handle init LiteSession failed and return ... 
-} -Model model = new Model(); -boolean loadModelRet = model.loadModel(modelMappedByteBuffer); -if (!loadModelRet) { - // handle load model failed and return ... -} -boolean compileModelRet = liteSession.compileGraph(model); -if (!compileModelRet) { - // handle compile model failed and return ... -} -model.free(); -// liteSession is ready to inference model, call runGraph in LiteSession.class ... -``` - -Now we can use the new API: - -```java -MSConfig config; -// config options ... -LiteSession liteSession = createSession(modelMappedByteBuffer, config); -if (liteSession == null) { - // handle init LiteSession failed and return ... -} -// liteSession is ready to inference model, call runGraph in LiteSession.class ... -``` - -The new createSession method integrates four old APIs: LiteSession.init, Model.loadModel, LiteSession.compileGraph and model.free. It is simpler and more efficient, as it saves one modelBuffer copy operation. - -###### new methods getFeaturesMap and updateFeatures in LiteSession.class - -Recently, we added a new C++ API in the LiteSession class. Correspondingly, we add a new Java API in LiteSession.java. - -```java -public List<MSTensor> getFeaturesMap() { - List<Long> ret = this.getFeaturesMap(this.sessionPtr); - ArrayList<MSTensor> tensors = new ArrayList<MSTensor>(); - for (Long msTensorAddr : ret) { - MSTensor msTensor = new MSTensor(msTensorAddr); - tensors.add(msTensor); - } - return tensors; - } - public boolean updateFeatures(List<MSTensor> features) { - long[] inputsArray = new long[features.size()]; - for (int i = 0; i < features.size(); i++) { - inputsArray[i] = features.get(i).getMSTensorPtr(); - } - return this.updateFeatures(this.sessionPtr, inputsArray); - } -``` - -###### new method export to replace the saveToFile API in LiteSession.class - -Recently, we added a new C++ API in the LiteSession class. Correspondingly, we add a new Java API in LiteSession.java. 
- -```java -public boolean export(String modelFileName, int modelType, int quantizationType) { - return this.export(this.sessionPtr, modelFileName, modelType, quantizationType); - } -``` - -###### new train related API moved to LiteSession.class from TrainSession.class - -Align with update of C++ api in LiteSession class, add new java API to LiteSession.java Correspondingly. - -```java -public class LiteSession { -... -public static LiteSession createTrainSession(String modelName, final MSConfig config, boolean trainMode){...} -public boolean train() {...} -public boolean eval() {...} -... -``` - -### Bug fixes - -1. Fix the bug that the train session does not release memory cause of refcount bug. - -#### Deprecations - -### Contributors - -Thanks goes to these wonderful people: - -Adel, AGroupofProbiotocs, anthonyaje, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, dong-li001, eric, Eric, fary86, fuzhiye, Gaoxiong, GAO_HYP_XYJ, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, heleiwang, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Islam Amin, Jesse, , Jiabin Liu, jianghui58, jiangzhiwen, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, Jonathan, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, laiyongqiang, leonwanghui, Li, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, Lin Xh, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, Lixia, lixian, liyanliu, liyong, lizhenyu, luopengting, luoyang, lvchangquan, lvliang, lz, mahdi, Mahdi, maning202007, Margaret_wangrui, mayang, mengyuanli, Ming_blue, nhussain, ougongchang, 
panfengfeng, panyifeng, Payne, Peilin, peixu_ren, Pengyongrong, qianlong, qianjiahong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, , Wan, wandongdong, wangdongxu, wangmin, wangnan39@huawei.com, wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wudenggang, wukesong, wuweikang, wuxuejian, Xiao Tianci, Xiaoda, xiefangqi, xinyunfan, xuanyue, xulei2020, Xun, xuyongfei, yanghaitao, yanghaitao1, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, yuximiao, zengzitao, Zhang, zhanghaibo5@huawei.com, zhanghuiyao, zhanghui_china, zhangxinfeng3, zhangyihui, zhangz0911gm, zhanke, zhanyuan, zhaodezan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, Zhenglong Li, zhiqwang, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, Zichun, Zirui, Ziyan, zjun, ZPaC, wangfengwfwf, zymaa, gerayking. - -Contributions of any kind are welcome! 
- -# MindSpore 1.2.1 - -## MindSpore 1.2.1 Release Notes - -### Major Features and Improvements - -#### FrontEnd - -- [STABLE] Add MaskedSelect aicpu operation.(Ascend) - -#### Auto Parallel - -- [STABLE] Support distributed checkpoint loading.(Ascend/GPU) - -# MindSpore 1.2.0 - -## MindSpore 1.2.0 Release Notes - -### Major Features and Improvements - -#### NewModels - -- [STABLE] Add CV models on Ascend: 3D Unet, Unet++, SSD-Resnet50-fpn, SSD-VGG16, crnn_seq2seq_ocr for BSI, CTPN, resnet18, DPN -- [STABLE] Add CV models on GPU: Faster-RCNN -- [STABLE] Add NLP models on Ascend: NAML, Fasttext, GRU, LSTM -- [BETA] Add TPRR: Thinking Path Re-Ranker, an original rank-based framework for Multi-Hop Question Answering which has won the first place in HotpotQA leaderboard.(Ascend) - -#### FrontEnd - -- [STABLE] Support side effects expression to ensure that the execution order of the user's semantics is correct.(Ascend/GPU/CPU) -- [STABLE] Support calculating the gradient for networks that contain non-Tensor input parameters (int, float, bool, mstype.int, mstype.float, mstype.uint, mstype.bool_, tuple, list, dict).(Ascend/GPU/CPU) -- [STABLE] Support the inverse of a bool Tensor.(Ascend/GPU/CPU) -- [STABLE] Unify the interface `isinstance`.(Ascend/GPU/CPU) -- [STABLE] Support negative indexes.(Ascend/GPU/CPU) -- [STABLE] Support 110+ Numpy-like interfaces in mindspore.numpy.(Ascend/GPU/CPU) -- [STABLE] Support export/load mindir model with a size greater than 2 GB. -- [STABLE] The optimizer supports gradient centralization.(Ascend) -- [STABLE] Support auc metric, roc metric, bleu score metric, confusion matrix metric, cosine similarity metric, dice metric, hausdorff distance metric, occlusion sensitivity metric, perplexity metric, mean surface distance metric, root mean surface distance metric. 
-- [STABLE] Support using EmbeddingLookup with cache.(Ascend) -- [STABLE] Add MaskedSelect aicpu operation.(Ascend) - -#### Auto Parallel - -- [STABLE] Support AllGather and ReduceScatter fusion.(Ascend) -- [STABLE] Support gradient accumulation feature in auto parallel mode.(Ascend/GPU) -- [STABLE] Support running parallel optimizer with gradient accumulation.(Ascend) -- [STABLE] Add the configuration of communication operators' fusion.(Ascend) -- [STABLE] Support distributed checkpoint loading.(Ascend/GPU) - -#### Executor - -- [STABLE] Support inference with Nvidia GPU. -- [STABLE] Support data parallelism in PyNative mode.(Ascend/GPU) -- [STABLE] Optimize LSTM inference memory consumption in Graph mode with CPU. - -#### Sponge - -- [STABLE] Add SPONGE modules for molecular dynamics simulation, including Bond, Angle, Dihedral, Non Bond 14, NeighborList, Particle Mesh Ewald, Langevin MD and LIUJIAN MD.(GPU) - -#### DataSet - -- [STABLE] If the libnuma library is installed in the environment, you can run `export DATASET_ENABLE_NUMA=True` or `export MS_ENABLE_NUMA=True` to configure NUMA binding. In multi-card training scenarios, the training data processing speed can be improved, thereby improving the network training efficiency. -- [STABLE] Unify API Tensor structure of Training/Inference interfaces in C++ SDK. -- [STABLE] Optimize duplicated Decode in data preprocessing using cache to improve preprocessing efficiency. -- [STABLE] Support eager mode to run data augmentation in Python & C++. -- [STABLE] Support more data augmentation operators (e.g. Affine, Perspective) in MindSpore-Lite. -- [STABLE] Support light pipeline to process MindData in MindSpore-Lite training. -- [STABLE] Support more data preprocessing operators that are based on the DVPP hardware module and can be used on the Ascend310 platform. -- [STABLE] Support copy-free property for data in Ascend310 inference process scenarios. 
- -#### Running Data Recorder - -- [STABLE] Support running data recorder (RDR) for exception demarcation. -- [STABLE] Provide records of multi-stage computational graphs, memory allocation information, graph execution order, stream execution order and task debug information when a "run task error" or "distribute task failed" occurs. (Ascend) -- [STABLE] Provide records of multi-stage computational graphs, memory allocation information and graph execution order when a "SyncStream error" occurs. (GPU) - -#### 3D Feature - -- [STABLE] Support 3D ops: Conv3D, Conv3DBackpropInput, Conv3DBackpropFilter, Conv3DTranspose, BiasAdd, BiasAddGrad, PReLU, Transpose, Reshape, transdata, StrideSlice, MaxPool3D, MaxPool3DGrad, BinaryCrossEntropy, SigmoidCrossEntropyWithLogits, SigmoidCrossEntropyWithLogitsGrad, SoftmaxCrossEntropyWithLogits, SigmoidCrossEntropyWithLogits, SigmoidCrossEntropyWithLogitsGrad, BatchNorm3d, BatchNorm3dGrad, Dropout3d. -- [STABLE] Support RMSELoss loss function, MAELoss loss function, FocalLoss loss function, DiceLoss binary loss function, and MultiClassDiceLoss multi-type loss function for 2D/3D network. -- [STABLE] Add optimizer: AdamApplyOne(3D), ApplyMomentum(3D), SGD(3D). - -### API Change - -#### Backwards Incompatible Change - -##### Python API - -###### `mindspore.numpy.array()`, `mindspore.numpy.asarray()`, `mindspore.numpy.asfarray()`, `mindspore.numpy.copy()` now support GRAPH mode, but cannot accept `numpy.ndarray` as input arguments anymore([!12726](https://gitee.com/mindspore/mindspore/pulls/12726)) - -Previously, these interfaces can accept numpy.ndarray as arguments and convert numpy.ndarray to Tensor, but cannot be used in GRAPH mode. -However, currently MindSpore Parser cannot parse numpy.ndarray in JIT-graph. To support these interfaces in graph mode, we have to remove `numpy.ndarray` support. With that being said, users can still use `Tensor` to convert `numpy.ndarray` to tensors. - - - - - - - - - -
1.1.1 1.2.0
- -```python ->>> import mindspore.numpy as mnp ->>> import numpy ->>> ->>> nd_array = numpy.array([1,2,3]) ->>> tensor = mnp.asarray(nd_array) # this line cannot be parsed in GRAPH mode -``` - - - -```python ->>> import mindspore.numpy as mnp ->>> import numpy ->>> ->>> tensor = mnp.asarray([1,2,3]) # this line can be parsed in GRAPH mode -``` - -
- -###### mindspore.numpy interfaces remove support for keyword arguments `out` and `where`([!12726](https://gitee.com/mindspore/mindspore/pulls/12726)) - -Previously, we have incomplete support for keyword arguments `out` and `where` in mindspore.numpy interfaces, however, the `out` argument is only functional when `where` argument is also provided, and `out` cannot be used to pass reference to numpy functions. Therefore, we have removed these two arguments to avoid any confusion users may have. Their original functionality can be found in [np.where](https://www.mindspore.cn/docs/en/master/api_python/numpy/mindspore.numpy.where.html#mindspore.numpy.where) - - - - - - - - - -
1.1.1 1.2.0
- -```python ->>> import mindspore.numpy as np ->>> ->>> a = np.ones((3,3)) ->>> b = np.ones((3,3)) ->>> out = np.zeros((3,3)) ->>> where = np.asarray([[True, False, True],[False, False, True],[True, True, True]]) ->>> res = np.add(a, b, out=out, where=where) # `out` cannot be used as a reference, therefore it is misleading -``` - - - -```python ->>> import mindspore.numpy as np ->>> ->>> a = np.ones((3,3)) ->>> b = np.ones((3,3)) ->>> out = np.zeros((3,3)) ->>> where = np.asarray([[True, False, True],[False, False, True],[True, True, True]]) ->>> res = np.add(a, b) ->>> out = np.where(where, x=res, y=out) # instead of np.add(a, b, out=out, where=where) -``` - -
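The equivalence behind this replacement can be checked without MindSpore. The following is a minimal sketch in plain Python, not MindSpore or NumPy code: the helpers `masked_add` and `select` are hypothetical names that mimic, on nested lists, the NumPy-style semantics of `add(a, b, out=out, where=where)` and of `where(condition, x, y)`.

```python
# Illustrative only: plain-Python stand-ins, not MindSpore or NumPy APIs.

def masked_add(a, b, out, where):
    """Mimic add(a, b, out=out, where=where): use a+b where the mask is True,
    keep the existing value of `out` elsewhere."""
    return [
        [x + y if m else o for x, y, o, m in zip(ra, rb, ro, rm)]
        for ra, rb, ro, rm in zip(a, b, out, where)
    ]


def select(condition, x, y):
    """Mimic where(condition, x, y): take x where condition is True, else y."""
    return [
        [xv if c else yv for c, xv, yv in zip(rc, rx, ry)]
        for rc, rx, ry in zip(condition, x, y)
    ]


a = [[1.0] * 3 for _ in range(3)]
b = [[1.0] * 3 for _ in range(3)]
out = [[0.0] * 3 for _ in range(3)]
where = [[True, False, True], [False, False, True], [True, True, True]]

# 1.1.1 style: masked add that falls back to `out` where the mask is False.
old_style = masked_add(a, b, out, where)

# 1.2.0 style: unmasked add, then select between the sum and `out`.
res = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
new_style = select(where, res, out)

print(old_style == new_style)
```

Both paths produce the same matrix, which is why the two-step form (`np.add` followed by `np.where`) is a drop-in substitute for the removed keyword arguments.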
- -###### Turn `ops.MakeRefKey` into an internal interface ([!12010](https://gitee.com/mindspore/mindspore/pulls/12010)) - -Previously MakeRefKey is an external interface that is not used, now make it an internal interface with the same usage. We do not recommend users to use this interface, and we will remove the relevant introduction of this interface from the official website. - -###### `ops.ApplyFtrl`, `ops.ApplyMomentum`, `ops.ApplyRMSProp`, `ops.ApplyCenteredRMSProp` change the output on Ascend backend from multiple to a single. ([!11895](https://gitee.com/mindspore/mindspore/pulls/11895)) - -Previously the number of outputs of these operator is different on different backends. To unify their definition we change their output on Ascend backend from multiple to a single. - -##### `P.FusedBatchNorm`, `P.FusedBatchNormEx` deleted ([!12115](https://gitee.com/mindspore/mindspore/pulls/12115)) - -The FusedBatchNorm and FusedBatchNormEx interface has been deleted. Please use the batchnorm operator to replace it. - -##### `MetaTensor` deleted ([!10325](https://gitee.com/mindspore/mindspore/pulls/10325)) - -The MetaTensor interface has been deleted. The function of MetaTensor has been integrated into tensor. - -###### `ControlDepend` is deleted, use `Depend` instead. The decorator `@C.add_flags(has_effect=True)` does not work. ([!13793](https://gitee.com/mindspore/mindspore/pulls/13793)) - -Previously, we used ControlDepend to control the execution order of multiple operators. In version 1.2.0, mindspore introduces the auto-monad side effects expression to ensure that the perform order of user's semantics is correct. Therefore, ControlDepend is deleted and Depend is recommended. - -In most scenarios, if operators have IO side effects (such as print) or memory side effects (such as assign), they will be executed according to the user's semantics. 
In some scenarios, if the two operators A and B have no order dependency, and A must be executed before B, we recommend using Depend to specify their execution order. See the API documentation of the Depend operator for specific usage. - - - - - - - - - -
1.1.1 1.2.0
- -```python - In some side-effect scenarios, we need to ensure the execution order of operators. - In order to ensure that operator A is executed before operator B, it is recommended - to insert the Depend operator between operators A and B. - - Previously, the ControlDepend operator was used to control the execution order. - Since the ControlDepend operator is deprecated from version 1.1, it is recommended - to use the Depend operator instead. The replacement method is as follows:: - - a = A(x) ---> a = A(x) - b = B(y) ---> y = Depend(y, a) - ControlDepend(a, b) ---> b = B(y) -``` - - - -```python - In most scenarios, if operators have IO side effects or memory side effects, - they will be executed according to the user's semantics. In some scenarios, - if the two operators A and B have no order dependency, and A must be executed - before B, we recommend using Depend to specify their execution order. The - usage method is as follows:: - - a = A(x) ---> a = A(x) - b = B(y) ---> y = Depend(y, a) - ---> b = B(y) -``` - -
- -After the introduction of the auto-monad side effect expression feature, the decorator `@C.add_flags(has_effect=True)` does not work. If the decorator is used in the script, please modify. Take the overflow identification operator (without side effects) as an example, the modification method is as follows: - - - - - - - - - -
1.1.1 1.2.0
- -```python -@C.add_flags(has_effect=True) -def construct(self, *inputs): - ... - loss = self.network(*inputs) - init = self.allo_status() - self.clear_status(init) - ... -``` - - - -```python -def construct(self, *inputs): - ... - loss = self.network(*inputs) - init = self.allo_status() - init = F.depend(init, loss) - clear_status = self.clear_status(init) - ... -``` - -
- -##### C++ API - -###### C++ API supports dual ABI now.([!12432](https://gitee.com/mindspore/mindspore/pulls/12432)) - -Version 1.1.1 supports only the old ABI. Version 1.2.0 supports both the old and the new ABI. - - - - - - - - - -
1.1.1 1.2.0
- -```cmake -add_compile_definitions(_GLIBCXX_USE_CXX11_ABI=0) -``` - - - -```cmake -add_compile_definitions(_GLIBCXX_USE_CXX11_ABI=0) # old ABI is supported -add_compile_definitions(_GLIBCXX_USE_CXX11_ABI=1) # new ABI is supported, too - # write nothing, use new ABI as default -``` - -
- -###### Context refactor.([!13515](https://gitee.com/mindspore/mindspore/pulls/13515)) - -The `Context` class is refactored. For details, see the API docs. - - - - - - - - - -
1.1.1 1.2.0
- -```cpp -GlobalContext::SetGlobalDeviceTarget(kDeviceTypeAscend310); // set device target is ascend310 -GlobalContext::SetGlobalDeviceID(0); // set device id is 0 -auto model_context = std::make_shared<ModelContext>(); // create a model context -ModelContext::SetInsertOpConfigPath(model_context, "./aipp.cfg"); // set aipp config file is ./aipp.cfg -``` - - - -```cpp -auto model_context = std::make_shared<Context>(); // create a model context -auto ascend310_info = std::make_shared<Ascend310DeviceInfo>(); -model_context->MutableDeviceInfo().push_back(ascend310_info); // set device target is ascend310 -ascend310_info->SetDeviceID(0); // set device id is 0 -ascend310_info->SetInsertOpConfigPath("./aipp.cfg"); // set aipp config file is ./aipp.cfg -``` - -
- -###### LoadModel interface changes.([!13515](https://gitee.com/mindspore/mindspore/pulls/13515)) - -`LoadModel` is renamed `Load`. No exception is thrown now, but the return status should be checked. - - - - - - - - - -
1.1.1 1.2.0
- -```cpp -try { - auto graph = Serialization::LoadModel(model_file_path, kMindIR); -} catch (...) { ... } -``` - - - -```cpp -Graph graph; -auto ret = Serialization::Load(model_file_path, kMindIR, &graph); -if (ret != kSuccess) { ... } -``` - -
- -###### Model ctor changes.([!13515](https://gitee.com/mindspore/mindspore/pulls/13515)) - -`Model` uses a non-parameter ctor now, and arguments are passed in through `Build`. - - - - - - - - - -
1.1.1 1.2.0
- -```cpp -Model net(net_cell, model_context); -auto ret = net.Build(); -if (ret != kSuccess) { ... } -``` - - - -```cpp -Model net; -auto ret = net.Build(net_cell, model_context); -if (ret != kSuccess) { ... } -``` - -
- -###### MSTensor::CreateTensor returns a native pointer now.([!13515](https://gitee.com/mindspore/mindspore/pulls/13515)) - -`MSTensor::CreateTensor` and `MSTensor::CreateRefTensor` return a native pointer now, which needs to be destroyed with `DestroyTensorPtr`. - - - - - - - - - -
1.1.1 1.2.0
- -```cpp -auto tensor = MSTensor::CreateTensor(xxx, xxx, ...); -auto name = tensor.Name(); -``` - - - -```cpp -auto tensor = MSTensor::CreateTensor(xxx, xxx, ...); -auto name = tensor->Name(); -MSTensor::DestroyTensorPtr(tensor); -``` - -
- -#### New features - -##### Python API - -- Add SPONGE functions: `mindspore.ops.operations.BondForceWithAtomEnergy`, `mindspore.ops.operations.AngleForceWithAtomEnergy`, `mindspore.ops.operations.DihedralForceWithAtomEnergy`, `mindspore.ops.operations.Dihedral14LJCFForceWithAtomEnergy`, `mindspore.ops.operations.LJForceWithPMEDirectForce`, `mindspore.ops.operations.PMEExcludedForce`, `mindspore.ops.operations.PMEReciprocalForce`,`mindspore.ops.operations.BondEnergy`, `mindspore.ops.operations.AngleEnergy`,`mindspore.ops.operations.DihedralEnergy`, `mindspore.ops.operations.Dihedral14LJEnergy`, `mindspore.ops.operations.Dihedral14CFEnergy`,`mindspore.ops.operations.LJEnergy`, `mindspore.ops.operations.PMEEnergy`. All operators are supported in `GPU`. - -#### Deprecations - -##### Python API - -###### `nn.MatMul` is now deprecated in favor of `ops.matmul` ([!12817](https://gitee.com/mindspore/mindspore/pulls/12817)) - -[ops.matmul](https://www.mindspore.cn/docs/en/master/api_python/ops/mindspore.ops.matmul.html#mindspore.ops.matmul) follows the API of [numpy.matmul](https://numpy.org/doc/stable/reference/generated/numpy.matmul.html) as closely as possible. As a function interface, [ops.matmul](https://www.mindspore.cn/docs/en/master/api_python/ops/mindspore.ops.matmul.html#mindspore.ops.matmul) is applied without instantiation, as opposed to `nn.MatMul`, which should only be used as a class instance. - - - - - - - - - -
1.1.1 1.2.0
- -```python ->>> import numpy as np ->>> from mindspore import Tensor, nn ->>> ->>> x = Tensor(np.ones((2, 3)).astype(np.float32)) ->>> y = Tensor(np.ones((3, 4)).astype(np.float32)) ->>> nn.MatMul()(x, y) -``` - - - -```python ->>> import numpy as np ->>> from mindspore import Tensor, ops ->>> ->>> x = Tensor(np.ones((2, 3)).astype(np.float32)) ->>> y = Tensor(np.ones((3, 4)).astype(np.float32)) ->>> ops.matmul(x, y) -``` - -
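Since the notes state that `ops.matmul` follows `numpy.matmul` as closely as possible, the shape rules can be previewed with NumPy alone. This is a minimal sketch of the `numpy.matmul` behavior only; it does not call MindSpore, and whether every case carries over to `ops.matmul` is an assumption to verify against the API docs.

```python
import numpy as np

# 2-D case, same shapes as the example above: (2, 3) @ (3, 4) -> (2, 4).
x = np.ones((2, 3), dtype=np.float32)
y = np.ones((3, 4), dtype=np.float32)
product = np.matmul(x, y)

# Each entry is a sum over the shared dimension of length 3.
assert product.shape == (2, 4)
assert np.all(product == 3.0)

# Batched case: the leading dimension is treated as a batch and `y` is broadcast.
batch = np.ones((5, 2, 3), dtype=np.float32)
batched_product = np.matmul(batch, y)
assert batched_product.shape == (5, 2, 4)
```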
- -### Bug fixes - -#### FrontEnd - -- fix the null pointer problem of evaluator in control flow.([!13312](https://gitee.com/mindspore/mindspore/pulls/13312)) -- fix parameter naming conflict bug for CellList and SequentialCell. ([!13260](https://gitee.com/mindspore/mindspore/pulls/13260)) - -#### Executor - -- fix executor pending task not execute in some heterogeneous cases.([!13465](https://gitee.com/mindspore/mindspore/pulls/13465)) -- add passes to support frontend IR unification, including following operations: SliceGrad([!11783](https://gitee.com/mindspore/mindspore/pulls/11783)), ApplyFtrl, ApplyMomentum, ApplyRMSProp, CenteredRMSProp([!11895](https://gitee.com/mindspore/mindspore/pulls/11895)), AvgPoolGrad([!12813](https://gitee.com/mindspore/mindspore/pulls/12813)), BatchNorm([!12115](https://gitee.com/mindspore/mindspore/pulls/12115)) - -#### Dataset - -- Fix getter functions(e.g. GetDatasetSize) terminated abnormally when use python multi-processing. ([!13571](https://gitee.com/mindspore/mindspore/pulls/13571), [!13823](https://gitee.com/mindspore/mindspore/pulls/13823)) -- Fix unclear error log of data augmentation operators. ([!12398](https://gitee.com/mindspore/mindspore/pulls/12398), [!12883](https://gitee.com/mindspore/mindspore/pulls/12883), [!13176](https://gitee.com/mindspore/mindspore/pulls/13176)) -- Fix profiling performs abnormally when sink_size = False, as saving data is later than profiling analysis. ([!13944](https://gitee.com/mindspore/mindspore/pulls/13944)) - -## MindSpore Lite - -### Major Features and Improvements - -#### Converter and runtime - -1. Support TensorFlow model in Converter except aware-training model. -2. Add fusion pattern for same horizontal operators in Converter. -3. Support Jar in x86_64 system for integrating into server with Java backend conveniently. -4. Provide unified runtime API for developer reusing their code between cloud side and end side.[BETA] -5. 
Improve control-flow capabilities continually: Support GRU fusion in Converter; Support weight-quant for control-flow model; Support control-flow model inference with half precision; Support nested control-flow model.[BETA] - -#### ARM backend optimization - -1. Add NLP dependent float16 operators (like lstm) to enhance inference performance. -2. Optimize operators: lstm, gru, depthwise. -3. Add 6 NPU operators (like FullConnection), and fix some bugs about buildIR failed. - -#### OpenCL backend - -1. Add new ops: add 10+ ops, total 72 ops. -2. Performance optimization: through memory layout optimization and block tiling, performance improved by 30% compared to version 1.1 on Adreno GPU. -3. Initialization time optimization: initialization time improved by 100% compared to MSLITE version 1.1 by storing the kernel cache as binary. -4. Support Java call on Mali or Adreno GPU. - -#### Post quantization - -1. Support quantization of gather and lstm ops. -2. Support quantizing TF Lite models with sub-graph node. -3. Add quantization strategy to decide whether to quantize ops, for less accuracy loss and higher compression rate. - -#### Training on Device - -1. Virtual batching: use mini-batches to mimic a large batch in theory with little RAM consumption. -2. Converter unification: the tod and iod converters no longer need to be compiled separately. -3. Performance optimization to BWD ops. -4. TrainLoop with Off-The-Shelf Functionality blocks, like LR scheduler, Loss Monitor, Ckpt Saver, Accuracy Monitor. -5. Integration of code with Minddata lite. -6. Support more networks (googlenet, densenet, shufflenetv2, nin, vgg) and operators. - -#### Codegen - -1. Support 79 ops for the ARM platform and all CMSIS ops for Arm Cortex-M Series. -2. Multiplatform support, including Android, IoT Devices. -3. Support offline model weight preprocessing while compiling. -4. Support offline memory reuse computing for minimum runtime buffer size. -5. Support kernel register for custom op. Third-party hardware like NNIE can be accessed through it. 
- -### API Change - -#### API Incompatible Change - -##### C++ API - -###### Add header file named lite_types.h for some common data structs. ([!12262](https://gitee.com/mindspore/mindspore/pulls/12262)) - -Previously, some common data structs such as `CpuBindMode` and `DeviceType` are in context.h, this may cause cross-dependency between headers. So we create a new header named lite_types.h for some common data structs and move `CpuBindMode` and `DeviceType` from context.h into lite_types.h. - - - - - - - - -
lite_types.h
- -```cpp -namespace mindspore::lite { -/// \brief CpuBindMode defined for holding bind cpu strategy argument. -typedef enum { - NO_BIND, /**< no bind */ - HIGHER_CPU, /**< bind higher cpu first */ - MID_CPU /**< bind middle cpu first */ -} CpuBindMode; - -/// \brief DeviceType defined for holding user's preferred backend. -typedef enum { - DT_CPU, /**< CPU device type */ - DT_GPU, /**< GPU device type */ - DT_NPU /**< NPU device type */ -} DeviceType; -} // namespace mindspore::lite -``` - -
- -###### Add some new interfaces in ms_tensor.h for unified runtime API.([!13515](https://gitee.com/mindspore/mindspore/pulls/13515)) - -Previously, users could not create or modify an `MSTensor`; all `MSTensor` objects were created and managed by the framework. However, users sometimes need to create or modify an `MSTensor`, for example when pre-processing input data. So we provide two new interfaces in ms_tensor.h: `CreateTensor` for creating an `MSTensor` and `set_shape` for modifying the shape of an `MSTensor`. - - - - - - - - -
CreateTensor
- -```cpp -/// \brief Create a MSTensor. -/// -/// \return Pointer to an instance of MindSpore Lite MSTensor. -static MSTensor *CreateTensor(const std::string &name, TypeId type, const std::vector &shape, const void *data, - size_t data_len); -``` - -
- - - - - - - - -
set_shape
- -```cpp -/// \brief Set the shape of MSTensor. -virtual void set_shape(const std::vector &shape) = 0; -``` - -
- -Previously, users could access the data of an `MSTensor` through the interface named `MutableData`. However, `MutableData` not only returns the data of the tensor but also allocates data for the tensor if its data is nullptr. So we provide a new interface in ms_tensor.h named `data` that returns the data of the tensor without allocating automatically. - - - - - - - - -
data
- -```cpp -/// \brief Get the pointer of data in MSTensor. -/// -/// \note The data pointer can be used to both write and read data in MSTensor. No memory buffer will be -/// allocated. -/// -/// \return the pointer points to data in MSTensor. -virtual void *data() = 0; -``` - -
- -###### Delete `DimensionSize()` in ms_tensor.h.([!13515](https://gitee.com/mindspore/mindspore/pulls/13515)) - -The interface named `DimensionSize` functionally overlaps with the interface named `shape`. To keep the interface simple, we delete `DimensionSize` and recommend using the interface named `shape` instead. - - - - - - - - -
DimensionSize()
- -```cpp -/// \brief Get size of the dimension of the MindSpore Lite MSTensor index by the parameter index. -/// -/// \param[in] index Define index of dimension returned. -/// -/// \return Size of dimension of the MindSpore Lite MSTensor. -virtual int DimensionSize(size_t index) const = 0; -``` - -
- -###### Move allocator from namespace mindspore::lite to namespace mindspore for unified runtime API.([!13515](https://gitee.com/mindspore/mindspore/pulls/13515)) - -Previously, class `Allocator` was in namespace mindspore::lite. To provide a unified allocator interface for the unified runtime API, we move `Allocator` to namespace mindspore. - - - - - - - - - -
1.1.0 1.2.0
- -```cpp -namespace mindspore::lite { -/// \brief Allocator defined a memory pool for malloc memory and free memory dynamically. -/// -/// \note List public class and interface for reference. -class Allocator; -} -``` - - - -```cpp -namespace mindspore { -/// \brief Allocator defined a memory pool for malloc memory and free memory dynamically. -/// -/// \note List public class and interface for reference. -class Allocator; -} -``` - -
- -### Bug fixes - -1. Fix the bug that the array in kernel registrar is not initialized. -2. Fix segment fault caused by releasing of OpParameter in Crop kernel in mistake. -3. Fix the bug that the MINDIR aware-training model is finally interpreted as weight-quant model. - -## Contributors - -Thanks goes to these wonderful people: - -Adel, AGroupofProbiotocs, anthonyaje, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, dong-li001, eric, Eric, fary86, fuzhiye, Gaoxiong, GAO_HYP_XYJ, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, heleiwang, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Islam Amin, Jesse, , Jiabin Liu, jianghui58, jiangzhiwen, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, Jonathan, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, laiyongqiang, leonwanghui, Li, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, Lin Xh, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, Lixia, lixian, liyanliu, liyong, lizhenyu, luopengting, luoyang, lvchangquan, lvliang, lz, mahdi, Mahdi, maning202007, Margaret_wangrui, mayang, mengyuanli, Ming_blue, nhussain, ougongchang, panfengfeng, panyifeng, Payne, Peilin, peixu_ren, Pengyongrong, qianlong, qianjiahong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, , Wan, wandongdong, wangdongxu, wangmin, wangnan39@huawei.com, wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wudenggang, wukesong, wuweikang, wuxuejian, Xiaoda, xiefangqi, xinyunfan, xuanyue, xulei2020, Xun, xuyongfei, 
yanghaitao, yanghaitao1, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, yuximiao, zengzitao, Zhang, zhanghaibo5@huawei.com, zhanghuiyao, zhanghui_china, zhangxinfeng3, zhangyihui, zhangz0911gm, zhanke, zhanyuan, zhaodezan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, zhiqwang, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, Zichun, Zirui, Ziyan, zjun, ZPaC, zymaa. - -Contributions of any kind are welcome! - -# MindSpore 1.1.1 Release Notes - -## MindSpore - -### Major Features and Improvements - -#### NewModels - -- [STABLE] BGCF: a Bayesian Graph Collaborative Filtering(BGCF) framework used to model the uncertainty in the user-item interaction graph and thus recommend accurate and diverse items on Amazon recommendation dataset.(Ascend) -- [STABLE] GRU: a recurrent neural network architecture like the LSTM(Long-Short Term Memory) on Multi30K dataset.(Ascend) -- [STABLE] FastText: a simple and efficient text classification algorithm on AG's news topic classification dataset, DBPedia Ontology classification dataset and Yelp Review Polarity dataset.(Ascend) -- [STABLE] LSTM: a recurrent neural network architecture used to learn word vectors for sentiment analysis on aclImdb_v1 dataset.(Ascend) -- [STABLE] SimplePoseNet: a convolution-based neural network for the task of human pose estimation and tracking on COCO2017 dataset.(Ascend) - -#### FrontEnd - -- [BETA] Support Tensor Fancy Index Getitem with tuple and list. (Ascend/GPU/CPU) - -### Backwards Incompatible Change - -#### Python API - -##### `ops.AvgPool`, `ops.MaxPool`, `ops.MaxPoolWithArgmax` change attr name from 'ksize', 'padding' to 'kernel_size', 'pad_mode' ([!11350](https://gitee.com/mindspore/mindspore/pulls/11350)) - -Previously the kernel size and pad mode attrs of pooling ops are named "ksize" and "padding", which is a little puzzling and inconsistent with convolution ops. 
So they are renamed to "kernel_size" and "pad_mode". - - - - - - - - - -
1.1.0 1.1.1
- -```python ->>> import mindspore.ops as ops ->>> ->>> avg_pool = ops.AvgPool(ksize=2, padding='same') ->>> max_pool = ops.MaxPool(ksize=2, padding='same') ->>> max_pool_with_argmax = ops.MaxPoolWithArgmax(ksize=2, padding='same') -``` - - - -```python ->>> import mindspore.ops as ops ->>> ->>> avg_pool = ops.AvgPool(kernel_size=2, pad_mode='same') ->>> max_pool = ops.MaxPool(kernel_size=2, pad_mode='same') ->>> max_pool_with_argmax = ops.MaxPoolWithArgmax(kernel_size=2, pad_mode='same') -``` - -
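For scripts that build pooling operators from stored keyword dictionaries, the rename can be applied mechanically. The following is a minimal sketch in plain Python; `POOL_KWARG_RENAMES` and `migrate_pool_kwargs` are hypothetical helper names introduced here for illustration and are not part of MindSpore.

```python
# Old -> new keyword names for ops.AvgPool, ops.MaxPool and ops.MaxPoolWithArgmax,
# as listed in the table above.
POOL_KWARG_RENAMES = {"ksize": "kernel_size", "padding": "pad_mode"}


def migrate_pool_kwargs(kwargs: dict) -> dict:
    """Return a copy of `kwargs` with the 1.1.0 names replaced by the 1.1.1 names."""
    migrated = {}
    for key, value in kwargs.items():
        new_key = POOL_KWARG_RENAMES.get(key, key)
        if new_key in migrated:
            # Both the old and the new spelling were supplied; refuse to guess.
            raise ValueError(f"both old and new spelling given for {new_key!r}")
        migrated[new_key] = value
    return migrated


old_style = {"ksize": 2, "padding": "same"}
new_style = migrate_pool_kwargs(old_style)
assert new_style == {"kernel_size": 2, "pad_mode": "same"}
```

The result could then be passed as `ops.AvgPool(**new_style)`; unknown keys are passed through unchanged so the helper stays safe for other arguments.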
- -##### `ops.TensorAdd`, change API name to `ops.Add` ([!11568](https://gitee.com/mindspore/mindspore/pulls/11568)) - -The operator name TensorAdd is not standardized, it is changed to Add. The old interface can be used continuously, but will be deleted in subsequent versions, it is recommended to use and switch to the latest interface. - - - - - - - - - -
1.1.0 1.1.1
- -```python ->>> import mindspore.ops as ops ->>> ->>> add = ops.TensorAdd() -``` - - - -```python ->>> import mindspore.ops as ops ->>> ->>> add = ops.Add() -``` - -
- -##### `ops.Gelu`, `ops.GeluGrad`, `ops.FastGelu`, `ops.FastGeluGrad`, change API name to `ops.GeLU`, `ops.GeLUGrad`, `ops.FastGeLU`, `ops.FastGeLUGrad` ([!11603](https://gitee.com/mindspore/mindspore/pulls/11603)) - -Gelu, GeluGrad, FastGelu, and FastGeluGrad names are unified into ReLU naming rules, "lu" is changed to the uppercase "LU". The old interface can be used continuously, but will be deleted in subsequent versions, it is recommended to use and switch to the latest interface. - - - - - - - - - -
1.1.0 1.1.1
- -```python ->>> import mindspore.ops as ops ->>> ->>> gelu = ops.Gelu() ->>> gelu_grad = ops.GeluGrad() ->>> fast_gelu = ops.FastGelu() ->>> fast_gelu_grad = ops.FastGeluGrad() -``` - - - -```python ->>> import mindspore.ops as ops ->>> ->>> gelu = ops.GeLU() ->>> gelu_grad = ops.GeLUGrad() ->>> fast_gelu = ops.FastGeLU() ->>> fast_gelu_grad = ops.FastGeLUGrad() -``` - -
- -##### `ops.GatherV2`, change API name to `ops.Gather` ([!11713](https://gitee.com/mindspore/mindspore/pulls/11713)) - -GatherV2 is changed to Gather. The old interface can be used continuously, but will be deleted in subsequent versions, it is recommended to use and switch to the latest interface. - - - - - - - - - -
1.1.0 1.1.1
- -```python ->>> import mindspore.ops as ops ->>> ->>> gather = ops.GatherV2() -``` - - - -```python ->>> import mindspore.ops as ops ->>> ->>> gather = ops.Gather() -``` - -
- -##### `ops.Pack`、`ops.Unpack`, change API name to `ops.Stack`、`ops.Unstack` ([!11828](https://gitee.com/mindspore/mindspore/pulls/11828)) - -Pack is changed to Stack, and Unpack is changed to Unstack. The old interface can be used continuously, but will be deleted in subsequent versions, it is recommended to use and switch to the latest interface. - - - - - - - - - -
1.1.0 1.1.1
- -```python ->>> import mindspore.ops as ops ->>> ->>> pack= ops.Pack() ->>> unpack= ops.Unpack() -``` - - - -```python ->>> import mindspore.ops as ops ->>> ->>> stack= ops.Stack() ->>> unstack= ops.Unstack() -``` - -
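The operator renames in this release (TensorAdd, the GeLU family, GatherV2, Pack and Unpack) are all one-to-one, so a script can be updated with a simple textual rewrite. Below is a minimal sketch using only the standard library; `RENAMED_OPS` and `migrate_op_names` are hypothetical names, and the sketch assumes the operators are referenced through the `ops.` prefix as in the examples above.

```python
import re

# Old -> new operator names, taken from the 1.1.1 renames listed above.
RENAMED_OPS = {
    "TensorAdd": "Add",
    "Gelu": "GeLU",
    "GeluGrad": "GeLUGrad",
    "FastGelu": "FastGeLU",
    "FastGeluGrad": "FastGeLUGrad",
    "GatherV2": "Gather",
    "Pack": "Stack",
    "Unpack": "Unstack",
}

# Match `ops.<OldName>` as a whole identifier so that names such as
# `ops.Packer` or `myops.Pack` are left alone.
_PATTERN = re.compile(r"\bops\.(" + "|".join(map(re.escape, RENAMED_OPS)) + r")\b")


def migrate_op_names(source: str) -> str:
    """Rewrite deprecated `ops.<OldName>` references to their new names."""
    return _PATTERN.sub(lambda match: "ops." + RENAMED_OPS[match.group(1)], source)


assert migrate_op_names("add = ops.TensorAdd()") == "add = ops.Add()"
assert migrate_op_names("unpack = ops.Unpack()") == "unpack = ops.Unstack()"
```

A textual rewrite like this does not understand aliases such as `import mindspore.ops as P`, so the output is best reviewed by hand.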
- -##### `ops.ControlDepend`, add deprecated to ControlDepend ([!11844](https://gitee.com/mindspore/mindspore/pulls/11844)) - -ControlDepend is deprecated and will be removed in a future version, use Depend instead. - - - - - - - - - -
1.1.0 1.1.1
- -```python -Note: - This operation does not work in `PYNATIVE_MODE`. -``` - - - -```python -Note: - This operation does not work in `PYNATIVE_MODE`. - `ControlDepend` is deprecated from version 1.1 and will be removed in a future version, use `Depend` instead. -``` - -
- -##### `ops.Depend`, add operator description and use case ([!11815](https://gitee.com/mindspore/mindspore/pulls/11815)), ([!11879](https://gitee.com/mindspore/mindspore/pulls/11879)) - -Since the ControlDepend operator will be deprecated from version 1.2, it is recommended to use the Depend operator instead. - - - - - - - - - -
1.1.0 1.1.1
- -```python -Depend is used for processing side-effect operations. - -Inputs: - - **value** (Tensor) - the real value to return for depend operator. - - **expr** (Expression) - the expression to execute with no outputs. - -Outputs: - Tensor, the value passed by last operator. - -Supported Platforms: - ``Ascend`` ``GPU`` ``CPU`` -``` - - - -```python -Depend is used for processing dependency operations. - -In some side-effect scenarios, we need to ensure the execution order of operators. -In order to ensure that operator A is executed before operator B, it is recommended -to insert the Depend operator between operators A and B. - -Previously, the ControlDepend operator was used to control the execution order. -Since the ControlDepend operator will be deprecated from version 1.2, it is -recommended to use the Depend operator instead. The replacement method is as follows:: - - a = A(x) ---> a = A(x) - b = B(y) ---> y = Depend(y, a) - ControlDepend(a, b) ---> b = B(y) - -Inputs: - - **value** (Tensor) - the real value to return for depend operator. - - **expr** (Expression) - the expression to execute with no outputs. - -Outputs: - Tensor, the value passed by last operator. - -Supported Platforms: - ``Ascend`` ``GPU`` ``CPU`` - -Examples: - >>> import numpy as np - >>> import mindspore - >>> import mindspore.nn as nn - >>> import mindspore.ops.operations as P - >>> from mindspore import Tensor - >>> class Net(nn.Cell): - ... def __init__(self): - ... super(Net, self).__init__() - ... self.softmax = P.Softmax() - ... self.depend = P.Depend() - ... - ... def construct(self, x, y): - ... mul = x - y - ... y = self.depend(y, mul) - ... ret = self.softmax(y) - ... return ret - ... - >>> x = Tensor(np.ones([4, 5]), dtype=mindspore.float32) - >>> y = Tensor(np.ones([4, 5]), dtype=mindspore.float32) - >>> net = Net() - >>> output = net(x, y) - >>> print(output) - [[0.2 0.2 0.2 0.2 0.2] - [0.2 0.2 0.2 0.2 0.2] - [0.2 0.2 0.2 0.2 0.2] - [0.2 0.2 0.2 0.2 0.2]] -``` - -
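The printed output in the docstring example above is 0.2 everywhere because `Depend` returns `y` unchanged and the softmax of a row of five equal values is uniform. A minimal sketch in plain Python (no MindSpore; the `softmax` helper here is written for illustration only) reproduces that value for one row:

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [value / total for value in exps]


# One row of the all-ones input `y` from the example above.
probabilities = softmax([1.0] * 5)

# Five equal inputs give five equal probabilities of 1/5.
assert all(abs(p - 0.2) < 1e-12 for p in probabilities)
```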
- -#### C++ API - -##### change namespace from `mindspore::api` to `mindspore` ([!11574](https://gitee.com/mindspore/mindspore/pulls/11574)) - - - - - - - - - -
1.1.0 1.1.1
- -```c++ -namespace ms = mindspore::api; -``` - - - -```c++ -namespace ms = mindspore; -``` - -
- -##### `Context` ([!11574](https://gitee.com/mindspore/mindspore/pulls/11574)) - - - - - - - - - -
1.1.0 1.1.1
- -```c++ -ms::Context::Instance().SetDeviceTarget(ms::kDeviceTypeAscend310).SetDeviceID(0); -``` - - - -```c++ -ms::GlobalContext::SetGlobalDeviceTarget(ms::kDeviceTypeAscend310); -ms::GlobalContext::SetGlobalDeviceID(0); -``` - -
- -##### rename `Tensor` to `MSTensor` ([!11574](https://gitee.com/mindspore/mindspore/pulls/11574)) - - - - - - - - - -
1.1.0 1.1.1
- -```c++ -ms::Tensor a; -``` - - - -```c++ -ms::MSTensor a; -``` - -
- -##### `Model` move setting of model options from `Build` to ctor `Model` ([!11574](https://gitee.com/mindspore/mindspore/pulls/11574)) +
- + -
1.1.0 1.1.1 set_shape
-```c++ -ms::Model model(graph_cell); -model.Build(model_options); -``` - - - -```c++ -ms::Model model(graph_cell, model_context); -model.Build(); +```cpp +/// \brief Set the shape of MSTensor. +virtual void set_shape(const std::vector &shape) = 0; ```
-##### `Model` modify `GetInputsInfo`, `GetOutputsInfo` to `GetInputs`, `GetOutputs` ([!11574](https://gitee.com/mindspore/mindspore/pulls/11574)) +Previously, users could access the data of an `MSTensor` through the interface named `MutableData`. However, `MutableData` not only returns the data of the tensor but also allocates data for the tensor if its data is nullptr. So we provide a new interface in ms_tensor.h named `data` that returns the data of the tensor without allocating automatically. - + + +
1.1.0 1.1.1 data
-```c++ -std::vector names; -std::vector types; -std::vector> shapes; -std::vector mem_sizes; -model.GetInputsInfo(&names, &types, &shapes, &mem_sizes); -std::cout << "Input 0 name: " << names[0] << std::endl; +```cpp +/// \brief Get the pointer of data in MSTensor. +/// +/// \note The data pointer can be used to both write and read data in MSTensor. No memory buffer will be +/// allocated. +/// +/// \return the pointer points to data in MSTensor. +virtual void *data() = 0; ```
+ +###### Delete `DimensionSize()` in ms_tensor.h.([!13515](https://gitee.com/mindspore/mindspore/pulls/13515)) + +The interface named `DimensionSize` functionally overlaps with the interface named `shape`. To keep the interface simple, we delete `DimensionSize` and recommend using the interface named `shape` instead. + + + + + +
DimensionSize()
-```c++ -auto inputs = model.GetInputs(); -std::cout << "Input 0 name: " << inputs[0].Name() << std::endl; +```cpp +/// \brief Get size of the dimension of the MindSpore Lite MSTensor index by the parameter index. +/// +/// \param[in] index Define index of dimension returned. +/// +/// \return Size of dimension of the MindSpore Lite MSTensor. +virtual int DimensionSize(size_t index) const = 0; ```
-##### `Model` modify `Predict` parameters type from `Buffer` to `MSTensor` ([!11574](https://gitee.com/mindspore/mindspore/pulls/11574)) +###### Move allocator from namespace mindspore::lite to namespace mindspore for unified runtime API.([!13515](https://gitee.com/mindspore/mindspore/pulls/13515)) + +Previously, class `Allocator` was in namespace mindspore::lite. To provide a unified allocator interface for the unified runtime API, we move `Allocator` to namespace mindspore. - +
1.1.0 1.1.1 1.1.0 1.2.0
-```c++ -std::vector inputs; -std::vector outputs; -model.Predict(inputs, &outputs); +```cpp +namespace mindspore::lite { +/// \brief Allocator defined a memory pool for malloc memory and free memory dynamically. +/// +/// \note List public class and interface for reference. +class Allocator; +} ``` -```c++ -std::vector inputs; -std::vector outputs; -model.Predict(inputs, &outputs); +```cpp +namespace mindspore { +/// \brief Allocator defined a memory pool for malloc memory and free memory dynamically. +/// +/// \note List public class and interface for reference. +class Allocator; +} ```
-### Deprecations - -#### Python API - -##### `ops.SpaceToBatch`, `ops.BatchToSpace` are deprecated in favor of `ops.SpaceToBatchND`, `ops.BatchToSpaceND`([!11527](https://gitee.com/mindspore/mindspore/pulls/11527)) - -The `ops.SpaceToBatchND`, `ops.BatchToSpaceND` are more general and have same behavior as `ops.SpaceToBatch`, `ops.BatchToSpace` when `block_shape` is a int. - -##### `ops.DepthwiseConv2dNative` is deprecated in favor of `nn.Conv2D`([!11702](https://gitee.com/mindspore/mindspore/pulls/11702)) +### Bug fixes -The `ops.DepthwiseConv2dNative` is only supported by Ascend, it is recommended to directly use `nn.Conv2D`. If `group` is equal to `in_ channels` and `out_channels`, the 2D convolution layer is also a 2D depthwise convolution layer. +1. Fix the bug that the array in kernel registrar is not initialized. +2. Fix segment fault caused by releasing of OpParameter in Crop kernel in mistake. +3. Fix the bug that the MINDIR aware-training model is finally interpreted as weight-quant model. 
## Contributors Thanks goes to these wonderful people: -Adel, AGroupofProbiotocs, anthonyaje, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, eric, Eric, fary86, fuzhiye, Gaoxiong, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, heleiwang, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Jesse, jianghui58, jiangzhiwen, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, Jonathan, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, laiyongqiang, leonwanghui, Li, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, Lixia, lixian, liyanliu, liyong, lizhenyu, luoyang, lvchangquan, lvliang, lz, mahdi, Mahdi, maning202007, Margaret_wangrui, mayang, mengyuanli, nhussain, ougongchang, panfengfeng, panyifeng, Payne, Peilin, peixu_ren, Pengyongrong, qianlong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, , Wan, wandongdong, wangdongxu, wangmin, wangnan39@huawei.com, wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wukesong, wuweikang, wuxuejian, Xiaoda, xiefangqi, xinyunfan, xuanyue, xulei2020, Xun, xuyongfei, yanghaitao, yanghaitao1, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, yuximiao, zengzitao, Zhang, zhanghaibo5@huawei.com, zhanghuiyao, zhangyihui, zhangz0911gm, zhanke, zhanyuan, zhaodezan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, Zichun, Zirui, Ziyan, 
zjun, ZPaC, zymaa +Adel, AGroupofProbiotocs, anthonyaje, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, dong-li001, eric, Eric, fary86, fuzhiye, Gaoxiong, GAO_HYP_XYJ, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, heleiwang, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Islam Amin, Jesse, , Jiabin Liu, jianghui58, jiangzhiwen, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, Jonathan, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, laiyongqiang, leonwanghui, Li, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, Lin Xh, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, Lixia, lixian, liyanliu, liyong, lizhenyu, luopengting, luoyang, lvchangquan, lvliang, lz, mahdi, Mahdi, maning202007, Margaret_wangrui, mayang, mengyuanli, Ming_blue, nhussain, ougongchang, panfengfeng, panyifeng, Payne, Peilin, peixu_ren, Pengyongrong, qianlong, qianjiahong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, , Wan, wandongdong, wangdongxu, wangmin, wangnan39@huawei.com, wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wudenggang, wukesong, wuweikang, wuxuejian, Xiaoda, xiefangqi, xinyunfan, xuanyue, xulei2020, Xun, xuyongfei, yanghaitao, yanghaitao1, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, yuximiao, zengzitao, Zhang, zhanghaibo5@huawei.com, zhanghuiyao, zhanghui_china, zhangxinfeng3, zhangyihui, zhangz0911gm, zhanke, zhanyuan, zhaodezan, zhaojichen, 
zhaoting, zhaozhenlong, zhengjun10, zhiqwang, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, Zichun, Zirui, Ziyan, zjun, ZPaC, zymaa. Contributions of any kind are welcome! -# MindSpore 1.1.0 Release Notes +# MindSpore 1.1.1 Release Notes ## MindSpore @@ -6474,524 +749,477 @@ Contributions of any kind are welcome! #### NewModels -- [STABLE] GNMT v2: similar to the model described in Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, which is mainly used for corpus translation, on WMT Englis-German dataset.(Ascend) -- [STABLE] MaskRCNN: a conceptually simple, flexible, and general framework for object instance segmentation on COCO2017 dataset.(Ascend) -- [STABLE] YOLOv4: a state-of-the-art detector which is faster and more accurate than all available alternative detectors on MS COCO dataset.(Ascend) -- [STABLE] Openpose: proposes a bottom-up human attitude estimation algorithm using Part Affinity Fields on COCO2017 dataset.(Ascend) -- [STABLE] CNN-CTC: proposes three major contributions to addresses scene text recognition (STR) on MJSynth and SynthText dataset.(Ascend) -- [STABLE] CenterFace: a practical anchor-free face detection and alignment method for edge devices on WiderFace dataset.(Ascend) -- [STABLE] ShuffleNetV2: a much faster and more accurate network than the previous networks on ImageNet 2012 dataset.(GPU) -- [STABLE] EfficientNet-B0: a new scaling method that uniformly scales all dimensions of depth/width/resolution using a simple yet highly effective compound coefficient on ImageNet 2012 dataset.(GPU) -- [BETA] SSD-GhostNet: based on an Ghost module structure which generate more features from cheap operations on Oxford-IIIT Pet dataset.(Ascend) -- [BETA] DS-CNN: Depthwise separable convolutional neural network on Speech commands dataset.(Ascend) -- [BETA] DeepPotentialH2O: A neural network model for molecular dynamics simulations. 
(Ascend) -- [BETA] GOMO: A classical numerical method called GOMO for ocean simulation. (GPU) +- [STABLE] BGCF: a Bayesian Graph Collaborative Filtering(BGCF) framework used to model the uncertainty in the user-item interaction graph and thus recommend accurate and diverse items on Amazon recommendation dataset.(Ascend) +- [STABLE] GRU: a recurrent neural network architecture like the LSTM(Long-Short Term Memory) on Multi30K dataset.(Ascend) +- [STABLE] FastText: a simple and efficient text classification algorithm on AG's news topic classification dataset, DBPedia Ontology classification dataset and Yelp Review Polarity dataset.(Ascend) +- [STABLE] LSTM: a recurrent neural network architecture used to learn word vectors for sentiment analysis on aclImdb_v1 dataset.(Ascend) +- [STABLE] SimplePoseNet: a convolution-based neural network for the task of human pose estimation and tracking on COCO2017 dataset.(Ascend) #### FrontEnd -- [STABLE] Refactor the MINDIR to support 310 inference(Ascend). -- [STABLE] The execution backend of sparse operations in optimizer can be set through 'target'. (Ascend/GPU/CPU) -- [STABLE] Support saving specified network to checkpoint and filtering parameters according to prefix when load checkpoint. (Ascend/GPU/CPU) -- [STABLE] Allow users choose whether to load parameter into network strictly.(Ascend/GPU/CPU) -- [STABLE] Before training, in graph mode, in order to have the same network initialization parameter values ​​for all devices, broadcast the parameters on device 0 to other devices. (Ascend/GPU) -- [STABLE] Support if by if of control flow subgraph. (Ascend/GPU) -- [STABLE] Support the judgment that whether a tensor is in a list. (Ascend/GPU/CPU) -- [STABLE] Support to get a value by using the corresponding key in a dictionary in the network; Support to get keys and values of a dictionary in the network. (Ascend/GPU/CPU) -- [STABLE] Support Tensor in enumerate. (Ascend/GPU/CPU) -- [STABLE] Support multilevel index assignment. 
(Ascend/GPU/CPU) -- [STABLE] Support the 'expand_as','view','abs','mean' method of Tensor. (Ascend/GPU/CPU) -- [STABLE] Support ResizeBilinear operation transfer ratio. (Ascend) -- [STABLE] nn.Matmul supports matrix-vector product and batched matrix multiply. (Ascend/GPU) -- [STABLE] nn.Dense supports input tensor whose dimension can be greater than 2. (Ascend/GPU) -- [BETA] Support higher order differentiation for partial operators.(CPU/GPU/Ascend) -- [STABLE] Support Tensor Augassign.(Ascend/GPU) -- [BETA] Support 22 numpy native interfaces. - -#### Auto Parallel - -- [STABLE] Support parallel optimizer with weight shard. (Ascend/GPU) -- [STABLE] Support distributed operators: element-wise series, UnsortedSegmentSum, UnsortedSegmentMin, Split, BroadcastTo and Unique etc. (Ascend/GPU) -- [STABLE] Support distributed model prediction. (Ascend/GPU) -- [STABLE] Support auto mixed precision level "O2" in auto and semi auto parallel mode. (Ascend/GPU) -- [STABLE] Add MultiFieldEmbeddingLookup high-level interface. (Ascend/GPU) - -#### Executor - -- [STABLE] ResNet50 performance optimize. 
(GPU) -- [STABLE] Support modelzoo net in PyNative mode(Ascend 29, GPU 23, CPU 2).(Ascend/GPU/CPU) -- [STABLE] Support PyNative mode on CPU.(CPU) -- [STABLE] Optimize performance in PyNative mode.(Ascend/GPU/CPU) -- [STABLE] Support Safe Optimized Memory Allocation Solver (SOMAS) on Ascend to improve the memory-reuse, the batch size of Bert large model (128 sequence length) is increased from 160 to 208.(Ascend) -- [BETA] Support second order differentiation in PyNative mode.(Ascend/GPU) -- [DEMO] Add distributed trainning in PyNative mode.(Ascend/GPU) - -#### MDP - -- [STABLE] Add new operators for Ascend and GPU: IGamma, LGamma, DiGamma; -- [STABLE] Add new distributions for Ascend and GPU: LogNormal, and Logistic; -- [BETA] Add new distributions for Ascend only: Gumbel, Cauchy, Gamma, Beta, and Poisson; Add Categorical distribution for GPU; -- [STABLE] Add new bijectors for Ascend and GPU: GumbelCDF, Invert; -- [STABLE] Add Bayesian layer realized by local reparameterization method for Ascend and GPU; -- [STABLE] Add Anomaly Detection Toolbox based on VAE for Ascend and GPU. +- [BETA] Support Tensor Fancy Index Getitem with tuple and list. (Ascend/GPU/CPU) -#### DataSet +### Backwards Incompatible Change -- [STABLE] Support single node multi-p distributed cache data sharing -- [STABLE] Support GPU profiling with data processing -- [STABLE] Support YOLOV3 dynamic shape in sink mode with dataset -- [STABLE] Support unique processing in the data processing pipeline -- [STABLE] Python layer parameter verification error information unified +#### Python API -### API Change +##### `ops.AvgPool`, `ops.MaxPool`, `ops.MaxPoolWithArgmax` change attr name from 'ksize', 'padding' to 'kernel_size', 'pad_mode' ([!11350](https://gitee.com/mindspore/mindspore/pulls/11350)) -#### Backwards Incompatible Change +Previously the kernel size and pad mode attrs of pooling ops are named "ksize" and "padding", which is a little puzzling and inconsistent with convolution ops. 
So they are renamed to "kernel_size" and "pad_mode". + + + + + + + + +
1.1.0 1.1.1
-###### Delete shape and dtype of class Initializer ([!7373](https://gitee.com/mindspore/mindspore/pulls/7373/files)) +```python +>>> import mindspore.ops as ops +>>> +>>> avg_pool = ops.AvgPool(ksize=2, padding='same') +>>> max_pool = ops.MaxPool(ksize=2, padding='same') +>>> max_pool_with_argmax = ops.MaxPoolWithArgmax(ksize=2, padding='same') +``` -Delete shape and dtype attributes of Initializer class. + -###### Modify the return type of initializer ([!7373](https://gitee.com/mindspore/mindspore/pulls/7373/files)) +```python +>>> import mindspore.ops as ops +>>> +>>> avg_pool = ops.AvgPool(kernel_size=2, pad_mode='same') +>>> max_pool = ops.MaxPool(kernel_size=2, pad_mode='same') +>>> max_pool_with_argmax = ops.MaxPoolWithArgmax(kernel_size=2, pad_mode='same') +``` -Previously, the return type of initializer function may be string, number, instance of class Tensor or subclass of class Initializer. +
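For scripts that still pass the 1.1.0 attribute names, the pooling rename above can be sketched as a small keyword-translation step. This is a minimal sketch under stated assumptions: `migrate_pool_kwargs` is a hypothetical helper, not part of MindSpore, and it only rewrites the two attribute names mentioned in this note.

```python
# Hypothetical helper (not a MindSpore API): translate the 1.1.0 pooling
# attribute names to the 1.1.1 names described in this release note.
_POOL_ATTR_RENAMES = {"ksize": "kernel_size", "padding": "pad_mode"}


def migrate_pool_kwargs(kwargs):
    """Return a copy of kwargs with old pooling attribute names replaced."""
    migrated = {}
    for name, value in kwargs.items():
        new_name = _POOL_ATTR_RENAMES.get(name, name)
        if new_name in migrated:
            # Both the old and the new spelling were supplied for one attribute.
            raise ValueError(f"duplicate pooling attribute: {new_name}")
        migrated[new_name] = value
    return migrated


old_style = {"ksize": 2, "padding": "same"}
new_style = migrate_pool_kwargs(old_style)
print(new_style)
```

The resulting dictionary could then be unpacked into a call such as `ops.AvgPool(**new_style)`, which matches the 1.1.1 form shown in the table above.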
-After modification, initializer function will return instance of class MetaTensor, class Tensor or subclass of class Initializer. +##### `ops.TensorAdd`, change API name to `ops.Add` ([!11568](https://gitee.com/mindspore/mindspore/pulls/11568)) -Noted that the MetaTensor is forbidden to initialize parameters, so we recommend that use str, number or subclass of Initializer for parameters initialization rather than the initializer functions. +The operator name TensorAdd is not standardized, so it is changed to Add. The old interface still works for now but will be removed in a later version; switching to the new interface is recommended. - +
1.0.1 1.1.0 1.1.0 1.1.1
```python ->>> import mindspore.nn as nn ->>> from mindspore.common import initializer ->>> from mindspore import dtype as mstype +>>> import mindspore.ops as ops >>> ->>> def conv3x3(in_channels, out_channels) ->>> weight = initializer('XavierUniform', shape=(3, 2, 32, 32), dtype=mstype.float32) ->>> return nn.Conv2d(in_channels, out_channels, weight_init=weight, has_bias=False, pad_mode="same") +>>> add = ops.TensorAdd() ``` ```python ->>> import mindspore.nn as nn ->>> from mindspore.common.initializer import XavierUniform ->>> ->>> #1) using string ->>> def conv3x3(in_channels, out_channels) ->>> return nn.Conv2d(in_channels, out_channels, weight_init='XavierUniform', has_bias=False, pad_mode="same") +>>> import mindspore.ops as ops >>> ->>> #2) using subclass of class Initializer ->>> def conv3x3(in_channels, out_channels) ->>> return nn.Conv2d(in_channels, out_channels, weight_init=XavierUniform(), has_bias=False, pad_mode="same") +>>> add = ops.Add() ```
-Advantages: -After modification, we can use the same instance of Initializer to initialize parameters of different shapes, which was not allowed before. +##### `ops.Gelu`, `ops.GeluGrad`, `ops.FastGelu`, `ops.FastGeluGrad`, change API name to `ops.GeLU`, `ops.GeLUGrad`, `ops.FastGeLU`, `ops.FastGeLUGrad` ([!11603](https://gitee.com/mindspore/mindspore/pulls/11603)) + +The names Gelu, GeluGrad, FastGelu, and FastGeluGrad are aligned with the ReLU naming convention: "lu" is changed to the uppercase "LU". The old interfaces still work for now but will be removed in a later version; switching to the new interfaces is recommended. - +
1.0.1 1.1.0 1.1.0 1.1.1
```python ->>> import mindspore.nn as nn ->>> from mindspore.common import initializer ->>> from mindspore.common.initializer import XavierUniform +>>> import mindspore.ops as ops >>> ->>> weight_init_1 = XavierUniform(gain=1.1) ->>> conv1 = nn.Conv2d(3, 6, weight_init=weight_init_1) ->>> weight_init_2 = XavierUniform(gain=1.1) ->>> conv2 = nn.Conv2d(6, 10, weight_init=weight_init_2) +>>> gelu = ops.Gelu() +>>> gelu_grad = ops.GeluGrad() +>>> fast_gelu = ops.FastGelu() +>>> fast_gelu_grad = ops.FastGeluGrad() ``` ```python ->>> import mindspore.nn as nn ->>> from mindspore.common import initializer ->>> from mindspore.common.initializer import XavierUniform +>>> import mindspore.ops as ops >>> ->>> weight_init = XavierUniform(gain=1.1) ->>> conv1 = nn.Conv2d(3, 6, weight_init=weight_init) ->>> conv2 = nn.Conv2d(6, 10, weight_init=weight_init) +>>> gelu = ops.GeLU() +>>> gelu_grad = ops.GeLUGrad() +>>> fast_gelu = ops.FastGeLU() +>>> fast_gelu_grad = ops.FastGeLUGrad() ```
-###### Modify get_seed function ([!7429](https://gitee.com/mindspore/mindspore/pulls/7429/files)) +##### `ops.GatherV2`, change API name to `ops.Gather` ([!11713](https://gitee.com/mindspore/mindspore/pulls/11713)) -Modify get_seed function implementation +GatherV2 is changed to Gather. The old interface still works for now but will be removed in a later version; switching to the new interface is recommended. -Previously, if seed is not set, the value of seed is default, parameters initialized by the normal function are the same every time. + + + + + + + + +
1.1.0 1.1.1
-After modification, if seed is not set, the value of seed is generated randomly, the initialized parameters change according to the random seed. +```python +>>> import mindspore.ops as ops +>>> +>>> gather = ops.GatherV2() +``` -If you want to fix the initial value of parameters, we suggest to set seed. + ```python ->>> from mindspore.common import set_seed ->>> set_seed(1) +>>> import mindspore.ops as ops +>>> +>>> gather = ops.Gather() ``` -###### `nn.LinSpace` ([!9494](https://gitee.com/mindspore/mindspore/pulls/9494)) has been removed and modify `ops.LinSpace` ([!8920](https://gitee.com/mindspore/mindspore/pulls/8920)) +
+ +##### `ops.Pack`, `ops.Unpack`, change API name to `ops.Stack`, `ops.Unstack` ([!11828](https://gitee.com/mindspore/mindspore/pulls/11828)) -The `nn.LinSpace` interface only support passing the value by args previously. For the convenience, we provided enhancive `ops.LinSpace` interface, which support passing the value by the inputs at the latest version. So there is no need for `nn.LinSpace`. +Pack is changed to Stack, and Unpack is changed to Unstack. The old interfaces still work for now but will be removed in a later version; switching to the new interfaces is recommended. - +
1.0.1 1.1.0 1.1.0 1.1.1
```python ->>> from mindspore import nn +>>> import mindspore.ops as ops >>> ->>> start = 1 ->>> stop = 10 ->>> num = 5 ->>> linspace = nn.LinSpace(start, stop, num) ->>> output = linspace() +>>> pack = ops.Pack() +>>> unpack = ops.Unpack() ``` ```python ->>> import mindspore ->>> from mindspore import Tensor ->>> from mindspore import ops +>>> import mindspore.ops as ops >>> ->>> linspace = ops.LinSpace() ->>> start = Tensor(1, mindspore.float32) ->>> stop = Tensor(10, mindspore.float32) ->>> num = 5 ->>> output = linspace(start, stop, num) +>>> stack = ops.Stack() +>>> unstack = ops.Unstack() ```
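Code that has to run against both 1.1.0 and 1.1.1 during the transition could look up an operator by its new name and fall back to the old one. The following is a minimal sketch, assuming only the renames listed in these notes (TensorAdd, the Gelu family, GatherV2, Pack, Unpack); `resolve_op` and `RENAMED_OPS` are hypothetical names, not MindSpore APIs, and a `SimpleNamespace` stands in for `mindspore.ops` so the sketch runs without MindSpore installed.

```python
from types import SimpleNamespace

# Old operator name -> new operator name, as listed in the 1.1.1 notes.
RENAMED_OPS = {
    "TensorAdd": "Add",
    "Gelu": "GeLU",
    "GeluGrad": "GeLUGrad",
    "FastGelu": "FastGeLU",
    "FastGeluGrad": "FastGeLUGrad",
    "GatherV2": "Gather",
    "Pack": "Stack",
    "Unpack": "Unstack",
}


def resolve_op(ops_module, old_name):
    """Hypothetical helper: prefer the new op name, fall back to the old one."""
    new_name = RENAMED_OPS.get(old_name, old_name)
    if hasattr(ops_module, new_name):
        return getattr(ops_module, new_name)
    # Raises AttributeError if neither spelling exists on the module.
    return getattr(ops_module, old_name)


# Stand-in for `mindspore.ops`: it exposes the new `Add` but only the old `Pack`.
fake_ops = SimpleNamespace(Add="Add-class", Pack="Pack-class")
print(resolve_op(fake_ops, "TensorAdd"))
print(resolve_op(fake_ops, "Pack"))
```

With a real installation, passing `mindspore.ops` as `ops_module` would be the intended use; the fallback only matters while the old names are still shipped.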
-###### Parts of `Optimizer` add target interface ([!6760](https://gitee.com/mindspore/mindspore/pulls/6760/files)) - -The usage of the sparse optimizer is changed. - -The target interface is used to set the execution backend of the sparse operator. - -The add_primitive_attr interface is no longer allowed. +##### `ops.ControlDepend`, mark ControlDepend as deprecated ([!11844](https://gitee.com/mindspore/mindspore/pulls/11844)) -The following optimizers add the target interface: Adam, FTRL, LazyAdam, ProximalAdagrad +ControlDepend is deprecated and will be removed in a future version; use Depend instead. - +
1.0.1 1.1.0 1.1.0 1.1.1
-```python ->>> from mindspore.nn import Adam ->>> ->>> net = LeNet5() ->>> optimizer = Adam(filter(lambda x: x.requires_grad, net.get_parameters())) ->>> optimizer.sparse_opt.set_device("CPU") +```python +Note: + This operation does not work in `PYNATIVE_MODE`. ``` ```python ->>> from mindspore.nn import Adam ->>> ->>> net = LeNet5() ->>> optimizer = Adam(filter(lambda x: x.requires_grad, net.get_parameters())) ->>> optimizer.target = 'CPU' +Note: + This operation does not work in `PYNATIVE_MODE`. + `ControlDepend` is deprecated from version 1.1 and will be removed in a future version; use `Depend` instead. ```
-###### `export` Modify the input parameters and export's file name ([!7385](https://gitee.com/mindspore/mindspore/pulls/7385), [!9057](https://gitee.com/mindspore/mindspore/pulls/9057/files)) - -Export the MindSpore prediction model to a file in the specified format. - -The reference includes: `net`, `*inputs`, `file_name`, `file_format`, `**kwargs`. - -Input parameters can be input according to specific export requirements. +##### `ops.Depend`, add operator description and use case ([!11815](https://gitee.com/mindspore/mindspore/pulls/11815)), ([!11879](https://gitee.com/mindspore/mindspore/pulls/11879)) -Add the file name extension based on the format. +Since the ControlDepend operator will be deprecated from version 1.2, it is recommended to use the Depend operator instead. - +
1.0.1 1.1.0 1.1.0 1.1.1
```python ->>> from mindspore.train.quant import quant ->>> ->>> network = LeNetQuant() ->>> inputs = Tensor(np.ones([1, 1, 32, 32]), mindspore.float32) ->>> quant.export(network, inputs, file_name="lenet_quant.mindir", file_format='MINDIR') -lenet_quant.mindir +Depend is used for processing side-effect operations. + +Inputs: + - **value** (Tensor) - the real value to return for depend operator. + - **expr** (Expression) - the expression to execute with no outputs. + +Outputs: + Tensor, the value passed by last operator. + +Supported Platforms: + ``Ascend`` ``GPU`` ``CPU`` ``` ```python ->>> import mindspore as ms ->>> ->>> network = LeNetQuant() ->>> inputs = Tensor(np.ones([1, 1, 32, 32]), mindspore.float32) ->>> ms.export(network, inputs, file_name="lenet_quant", file_format='MINDIR', quant_mode='AUTO') -lenet_quant.mindir +Depend is used for processing dependency operations. + +In some side-effect scenarios, we need to ensure the execution order of operators. +In order to ensure that operator A is executed before operator B, it is recommended +to insert the Depend operator between operators A and B. + +Previously, the ControlDepend operator was used to control the execution order. +Since the ControlDepend operator will be deprecated from version 1.2, it is +recommended to use the Depend operator instead. The replacement method is as follows:: + + a = A(x) ---> a = A(x) + b = B(y) ---> y = Depend(y, a) + ControlDepend(a, b) ---> b = B(y) + +Inputs: + - **value** (Tensor) - the real value to return for depend operator. + - **expr** (Expression) - the expression to execute with no outputs. + +Outputs: + Tensor, the value passed by last operator. + +Supported Platforms: + ``Ascend`` ``GPU`` ``CPU`` + +Examples: + >>> import numpy as np + >>> import mindspore + >>> import mindspore.nn as nn + >>> import mindspore.ops.operations as P + >>> from mindspore import Tensor + >>> class Net(nn.Cell): + ... def __init__(self): + ... super(Net, self).__init__() + ... 
self.softmax = P.Softmax() + ... self.depend = P.Depend() + ... + ... def construct(self, x, y): + ... mul = x - y + ... y = self.depend(y, mul) + ... ret = self.softmax(y) + ... return ret + ... + >>> x = Tensor(np.ones([4, 5]), dtype=mindspore.float32) + >>> y = Tensor(np.ones([4, 5]), dtype=mindspore.float32) + >>> net = Net() + >>> output = net(x, y) + >>> print(output) + [[0.2 0.2 0.2 0.2 0.2] + [0.2 0.2 0.2 0.2 0.2] + [0.2 0.2 0.2 0.2 0.2] + [0.2 0.2 0.2 0.2 0.2]] ```
-###### `Dense`, `Conv2dBnAct`, `DenseBnAct`, `DenseQuant` support setting the activation attribute as an instance of a class derived from `nn.Cell` or `Primtive` ([!7581](https://gitee.com/mindspore/mindspore/pulls/7581)) +#### C++ API -activation (Union[str, Cell, Primitive]): activate function applied to the output of the fully connected layer +##### change namespace from `mindspore::api` to `mindspore` ([!11574](https://gitee.com/mindspore/mindspore/pulls/11574)) - +
1.0.1 1.1.0 1.1.0 1.1.1
-```python ->>> import mindspore.nn as nn ->>> ->>> dense = nn.Dense(1, 1, activation='relu') +```c++ +namespace ms = mindspore::api; ``` -```python ->>> import mindspore.nn as nn ->>> import mindspore.ops as ops ->>> ->>> dense = nn.Dense(1, 1, activation=nn.ReLU()) ->>> dense = nn.Dense(1, 1, activation=ops.ReLU()) +```c++ +namespace ms = mindspore; ```
-###### `tensor.dim()`, `tensor.size()` has been renamed to `tensor.ndim`, `tensor.size` ([!10175](https://gitee.com/mindspore/mindspore/pulls/10175)) - -Previously, tensor.size() and tensor.dim() were used for checking the total number of elements/dimensions in the tensor. -However, from a user's perspective, tensor.size and tensor.ndim (methods -> properties) are better choices, since they follow the numpy naming convention. +##### `Context` ([!11574](https://gitee.com/mindspore/mindspore/pulls/11574)) - +
1.0.1 1.1.0 1.1.0 1.1.1
-```python ->>> from mindspore import Tensor ->>> ->>> Tensor((1,2,3)).size() ->>> Tensor((1,2,3)).dim() +```c++ +ms::Context::Instance().SetDeviceTarget(ms::kDeviceTypeAscend310).SetDeviceID(0); ``` -```python ->>> from mindspore import Tensor ->>> ->>> Tensor((1,2,3)).size ->>> Tensor((1,2,3)).ndim +```c++ +ms::GlobalContext::SetGlobalDeviceTarget(ms::kDeviceTypeAscend310); +ms::GlobalContext::SetGlobalDeviceID(0); ```
-###### `EmbeddingLookup` add a config in the interface: sparse ([!8202](https://gitee.com/mindspore/mindspore/pulls/8202)) - -sparse (bool): Using sparse mode. When 'target' is set to 'CPU', 'sparse' has to be true. Default: True. +##### rename `Tensor` to `MSTensor` ([!11574](https://gitee.com/mindspore/mindspore/pulls/11574)) - +
1.0.1 1.1.0 1.1.0 1.1.1
-```python ->>> from mindspore.nn import EmbeddingLookup ->>> ->>> input_indices = Tensor(np.array([[1, 0], [3, 2]]), mindspore.int32) ->>> result = EmbeddingLookup(4,2)(input_indices) ->>> print(result.shape) -(2, 2, 2) +```c++ +ms::Tensor a; ``` -```python ->>> from mindspore.nn import EmbeddingLookup ->>> ->>> input_indices = Tensor(np.array([[1, 0], [3, 2]]), mindspore.int32) ->>> result = EmbeddingLookup(4,2)(input_indices, sparse=False) ->>> print(result.shape) -(2, 2, 2) +```c++ +ms::MSTensor a; ```
-###### `nn.probability.bijector` change types of attributes from (int, float) to (float, list, numpy.ndarray, Tensor) ([!8191](https://gitee.com/mindspore/mindspore/pulls/8191)) - -Attributes Type change: (int, float) -> (float, list, numpy.ndarray, Tensor). -Int type is not supported anymore. Parameters of all bijectors should be type float, list, numpy.ndarray or Tensor. +##### `Model` move setting of model options from `Build` to ctor `Model` ([!11574](https://gitee.com/mindspore/mindspore/pulls/11574)) - +
1.0.1 1.1.0 1.1.0 1.1.1
-```python ->>> import mindspore.nn.probability.bijector as msb ->>> ->>> power = 2 ->>> bijector = msb.PowerTransform(power=power) +```c++ +ms::Model model(graph_cell); +model.Build(model_options); ``` -```python ->>> import mindspore.nn.probability.bijector as msb ->>> ->>> power = 2.0 ->>> bijector = msb.PowerTransform(power=power) +```c++ +ms::Model model(graph_cell, model_context); +model.Build(); ```
-###### `nn.probability.bijector.GumbelCDF` remove a attribute in the interface: dtype ([!8191](https://gitee.com/mindspore/mindspore/pulls/8191)) - -dtype is removed from GumbelCDF and is no longer an argument of the class. +##### `Model` modify `GetInputsInfo`, `GetOutputsInfo` to `GetInputs`, `GetOutputs` ([!11574](https://gitee.com/mindspore/mindspore/pulls/11574)) - +
1.0.1 1.1.0 1.1.0 1.1.1
-```python ->>> import mindspore.nn.probability.bijector as msb ->>> from mindspore import dtype as mstype ->>> ->>> bijector = msb.GumbelCDF(loc=0.0, scale=1.0, dtype=mstype.float32) +```c++ +std::vector<std::string> names; +std::vector<ms::DataType> types; +std::vector<std::vector<int64_t>> shapes; +std::vector<size_t> mem_sizes; +model.GetInputsInfo(&names, &types, &shapes, &mem_sizes); +std::cout << "Input 0 name: " << names[0] << std::endl; ``` -```python ->>> import mindspore.nn.probability.bijector as msb ->>> ->>> bijector = msb.GumbelCDF(loc=0.0, scale=1.0) +```c++ +auto inputs = model.GetInputs(); +std::cout << "Input 0 name: " << inputs[0].Name() << std::endl; ```
-###### `nn.layer.combined.Conv2dBnAct`, `nn.layer.combined.DenseBnAct` move from nn.layer.quant to nn.layer.combined ([!8187](https://gitee.com/mindspore/mindspore/pulls/8187)) - -Previously Conv2dBnAct and DenseBnAct are in nn.layer.quant, since they are not quant cells, now they are moved to nn.layer.combined. If you import Conv2dBnAct, DenseBnAct from mindspore.nn, then your code doesn't need any change. +##### `Model` modify `Predict` parameters type from `Buffer` to `MSTensor` ([!11574](https://gitee.com/mindspore/mindspore/pulls/11574)) - +
1.0.1 1.1.0 1.1.0 1.1.1
-```python ->>> from mindspore.nn.layer.quant import Conv2dBnAct, DenseBnAct +```c++ +std::vector<ms::Buffer> inputs; +std::vector<ms::Buffer> outputs; +model.Predict(inputs, &outputs); ``` -```python ->>> from mindspore.nn import Conv2dBnAct, DenseBnAct +```c++ +std::vector<ms::MSTensor> inputs; +std::vector<ms::MSTensor> outputs; +model.Predict(inputs, &outputs); ```
-###### `nn.layer.conv.Conv2D`, `nn.layer.quant.Conv2dBnFoldQuant`, `nn.layer.quant.Conv2dBnWithoutFoldQuant` change weight shape when group > 1 in Ascend platform ([!9723](https://gitee.com/mindspore/mindspore/pulls/9723)) - -In Ascend platform, if group > 1, the weight shape of Conv2D change from [in_channels//group, out_channels, kernel_size, kernel_size] to [out_channels, in_channels//group, kernel_size, kernel_size]. Previously, checkpoints of the networks are used, which use Conv2D with group > 1, such as MobileNet, can not be directly used now, need to transpose the first and second axis of the weight. +### Deprecations -### Bug fixes +#### Python API -#### FrontEnd +##### `ops.SpaceToBatch`, `ops.BatchToSpace` are deprecated in favor of `ops.SpaceToBatchND`, `ops.BatchToSpaceND` ([!11527](https://gitee.com/mindspore/mindspore/pulls/11527)) -- [STABLE] Fix the problem of the cse optimization in the situation of control flow. (Ascend/GPU) +`ops.SpaceToBatchND` and `ops.BatchToSpaceND` are more general and behave the same as `ops.SpaceToBatch` and `ops.BatchToSpace` when `block_shape` is an int. -#### Auto Parallel +##### `ops.DepthwiseConv2dNative` is deprecated in favor of `nn.Conv2D` ([!11702](https://gitee.com/mindspore/mindspore/pulls/11702)) -- [STABLE] Resolve the restriction: input and output layouts of Reshape are restricted in tensor redistribution. (Ascend/GPU) -- [STABLE] Resolve the restriction: output strategy should be data parallel in model evaluation. (Ascend/GPU) +`ops.DepthwiseConv2dNative` is only supported on Ascend, so using `nn.Conv2D` directly is recommended. If `group` is equal to both `in_channels` and `out_channels`, the 2D convolution layer is also a 2D depthwise convolution layer. -#### Executor +## Contributors -- [STABLE] Fix fusion operator compilation cache. (Ascend) -- [STABLE] Fix compilation error of dynamic shape operator. 
(Ascend) -- [STABLE] Fix bug of pynative cannot insert transdata of node output when node should be spilted in the backend opt.(Ascend) -- [STABLE] Fix the bug of TensorMove and memcpy_async merge to one after backend cse pass (Ascend) +Thanks goes to these wonderful people: -#### DataSet +Adel, AGroupofProbiotocs, anthonyaje, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, eric, Eric, fary86, fuzhiye, Gaoxiong, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, heleiwang, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Jesse, jianghui58, jiangzhiwen, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, Jonathan, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, laiyongqiang, leonwanghui, Li, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, Lixia, lixian, liyanliu, liyong, lizhenyu, luoyang, lvchangquan, lvliang, lz, mahdi, Mahdi, maning202007, Margaret_wangrui, mayang, mengyuanli, nhussain, ougongchang, panfengfeng, panyifeng, Payne, Peilin, peixu_ren, Pengyongrong, qianlong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, , Wan, wandongdong, wangdongxu, wangmin, wangnan39@huawei.com, wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wukesong, wuweikang, wuxuejian, Xiaoda, xiefangqi, xinyunfan, xuanyue, xulei2020, Xun, xuyongfei, yanghaitao, yanghaitao1, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, 
yuximiao, zengzitao, Zhang, zhanghaibo5@huawei.com, zhanghuiyao, zhangyihui, zhangz0911gm, zhanke, zhanyuan, zhaodezan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, Zichun, Zirui, Ziyan, zjun, ZPaC, zymaa -- [STABLE] Fix cache server hang on RequestFreeTag. (Ascend/GPU/CPU) -- [STABLE] Fix hung when use pyfunc multi-processing. (Ascend/GPU/CPU) -- [STABLE] Fix add multiple parent nodes to tree node cause core dump. (Ascend/GPU/CPU) +Contributions of any kind are welcome! -## MindSpore Lite +## MindSpore Lite 1.1.0 Release Notes ### Major Features and Improvements @@ -7095,73 +1323,7 @@ Adel, AGroupofProbiotocs, anthonyaje, anzhengqi, askmiao, baihuawei, baiyangfan, Contributions of any kind are welcome! -# MindSpore 1.0.0 Release Notes - -## Major Features and Improvements - -### MindSpore Training and Inference Framework - -#### Ascend - -- New models - - DenseNet121: a dense convolutional neural network, which connects each layer to every other layer in a feed-forward fashion for object recognition on ImageNet dataset. - - UNet2D-Medical: Unet Medical model for 2D image segmentation, Convolutional Networks for Biomedical Image Segmentation on ISBI Challenge database. -- Frontend and user interface - - Second-Order Optimization - - Enable second-order optimization for Bert on Ascend, which can achieve a masked lm accuracy of 71.3% in 800 seconds using 8 Ascend (Bert-Large @MLPerf v0.7 dataset). - - New GNN model BGCF - - Bayesian Graph Convolutional Filtering network which naturally incorporate the uncertainty in the user-item interaction graph shows excellent recommendation performance on Amazon-Beauty dataset. - - Add append interface for SequentialCell. - - Add a level `auto` for AMP. -- Executor and performance optimization - - Support quantitative network (Resnet50 & YoloV3 & MobileNetV2). 
- - Project ease of use optimization: project compilation time optimization, CMakelist regularization, cudnn, cuda independent compilation and installation independent. -- Data processing, augmentation, and save format - - Support GeneratorDataset return string type - -#### Other Hardware Support - -- GPU platform - - Enable second-order optimization for resnet50 on GPU, which achieve 30% improvement on training time compared to SGD with Momentum (Resnet50 @ImageNet). - -#### User interfaces change log - -- Remove global object GradOperation in Autodiff([!5011](https://gitee.com/mindspore/mindspore/pulls/5011)) -- Remove useless attribute 'name' in Autodiff([!5172](https://gitee.com/mindspore/mindspore/pulls/5172)) -- Rectification distributed init([!5350](https://gitee.com/mindspore/mindspore/pulls/5350)) -- Move the setting of ParalleMode from train.parallel_utils to context([!5351](https://gitee.com/mindspore/mindspore/pulls/5351)) -- Modification of save_checkpoint([!5482](https://gitee.com/mindspore/mindspore/pulls/5482)) -- Wrap numpy random seed into an api([!5634](https://gitee.com/mindspore/mindspore/pulls/5634)) -- Delete enable_fused_layernorm in some modelzoo scripts([!5665](https://gitee.com/mindspore/mindspore/pulls/5665)) -- Move 'multi-subgraphs' interface to internal([!5696](https://gitee.com/mindspore/mindspore/pulls/5696)) -- Rename mirror_mean to gradient_mean([!5700](https://gitee.com/mindspore/mindspore/pulls/5700)) -- Remove default value of 'group' of DepthWiseConv2d([!5865](https://gitee.com/mindspore/mindspore/pulls/5865)) -- Modify interface for function and remove duplicated def([!5958](https://gitee.com/mindspore/mindspore/pulls/5958)) -- Unify Conv2d and DepthwiseConv2d([!5916](https://gitee.com/mindspore/mindspore/pulls/5916)) -- Modification of SoftmaxCrossEntropyWithLogits([!5502](https://gitee.com/mindspore/mindspore/pulls/5502)) -- Change API set_strategy() to shard()([!5991](https://gitee.com/mindspore/mindspore/pulls/5991)) -- 
Move batch_size from bert_cfg_cfg to cfg([!6233](https://gitee.com/mindspore/mindspore/pulls/6233)) -- Remove unused parameters from SummaryRecord __init__([!5548](https://gitee.com/mindspore/mindspore/pulls/5548)) -- remove sens parameter of TrainOneStepWithLossScaleCell([!5753](https://gitee.com/mindspore/mindspore/pulls/5753)) -- optimize the TrainOneStepCell for user's define([!6159](https://gitee.com/mindspore/mindspore/pulls/6159)) -- delete seed0 and seed1 of nn.Dropout([!5735](https://gitee.com/mindspore/mindspore/pulls/5735)) -- delete DataWrapper([!6101](https://gitee.com/mindspore/mindspore/pulls/6101)) -- LSTM API optimization([!6374](https://gitee.com/mindspore/mindspore/pulls/6374)) -- Merge P\C\F of ops([!5645](https://gitee.com/mindspore/mindspore/pulls/5645)) -- delete SoftmaxCrossEntropyExpand interface([!6607](https://gitee.com/mindspore/mindspore/pulls/6607)) -- Adjust GroupNorm interface([!6329](https://gitee.com/mindspore/mindspore/pulls/6329)) -- Modify init interface to internal interface([!6651](https://gitee.com/mindspore/mindspore/pulls/6651)) -- Log optimization([!5842](https://gitee.com/mindspore/mindspore/pulls/5842)) -- Remove useless API dataset.set_dataset_size([!5806](https://gitee.com/mindspore/mindspore/pulls/5806)) -- Some of Dataset API add usage parameter([!5605](https://gitee.com/mindspore/mindspore/pulls/5605)) -- Change the import path, such as from mindspore.dataset.transforms.vision to mindspore.dataset.vision.transforms([!5384](https://gitee.com/mindspore/mindspore/pulls/5384)) -- Rename ImageFolderDatasetV2 to ImageFolderDataset([!5384](https://gitee.com/mindspore/mindspore/pulls/5384)) -- Dataset.map parameter optimization([!5384](https://gitee.com/mindspore/mindspore/pulls/5384)) -- Add new api dataset.get_col_names([!5384](https://gitee.com/mindspore/mindspore/pulls/5384)) -- Add new api dataset.get_col_names([!5384](https://gitee.com/mindspore/mindspore/pulls/5384)) -- Remove useless API MindRecord 
finish([!5580](https://gitee.com/mindspore/mindspore/pulls/5580)) - -### MindSpore Lite +## MindSpore Lite 1.0.0 Release Notes - Converter - Add 6 TFLite op, 7 Caffe op, 1 ONNX op. @@ -7214,63 +1376,7 @@ Adel, AGroupofProbiotocs, anthonyaje, anzhengqi, askmiao, baihuawei, baiyangfan, Contributions of any kind are welcome! -# MindSpore 0.7.0-beta Release Notes - -## Major Features and Improvements - -### MindSpore Training and Inference Framework - -#### Ascend - -- New models - - TinyBert: a smaller and faster version of BERT using transformer distillation for natural language understanding on GLUE benchmark. - - SE-ResNet50: add Squeeze-and-Excitation blocks(SE-Blocks) to the resnet50 network to improve channel interdependencies for image classification on ImageNet 2012 dataset. - - Inception V3: the third version of Inception convolutional architectures for image classification on ImageNet 2012 dataset. -- Frontend and user interface - - Embedding operator high-level packaging to support segmented by field for Wide&Deep. - - Load multi-node checkpoint into single-process to support host-device hybrid inference. - - Support Concat/Tile/Strideslice distributed operators. - - Support cumulative gradient and batch training split. - - Support variable parameter input for Cell object. - - Parameter mixed calculation optimization for pynative mode. - - Deep Probabilistic Programming - - Support statistical distributions classes used to generate stochastic tensors. - - Support probabilistic inference algorithms. - - Support BNN layers used to construct BNN in Graph mode. - - Support interfaces for the transformation between BNN and DNN in Graph mode. - - Support uncertainty estimation to estimate epistemic uncertainty and aleatoric uncertainty. 
- - User interfaces change log - - change base class of parameter([!3473](https://gitee.com/mindspore/mindspore/pulls/3473)) - - change binary to mindir([!4258](https://gitee.com/mindspore/mindspore/pulls/4258)) - - change export from geir to air([!4269](https://gitee.com/mindspore/mindspore/pulls/4269)) - - Init parameter data by default([!3967](https://gitee.com/mindspore/mindspore/pulls/3967)) - - change IndexedSlices to RowTensor([!4031](https://gitee.com/mindspore/mindspore/pulls/4031)) - - Must set or change parallel mode before any Initializer created([!4801](https://gitee.com/mindspore/mindspore/pulls/4801)) -- Executor and performance optimization - - MindSpore graph compilation process performance improved by 20%. - - Decoupling C++ and Python modules to achieve separate compilation of core modules. -- Data processing, augmentation, and save format - - Support automatic data augmentation - - Support GNN distributed cache in single node - - Support ConcatDataset using distributed sampler - -#### Other Hardware Support - -- GPU platform - - New model supported: VGG16, ResNet101, DeepFM. - - Support some distributed operators in ResNet50 and Wide&Deep. - - Support automatic parallel for Wide&Deep. - - Support function funcs[i](*inputs) (such as switch-case). - - Support distributed training with parameter server. - - Support GPU operator profiling. - - Performance optimization of the distributed training with allreduce. - - Performance optimization of the mixed precision training. - - Performance optimization of the pynative mode. - - Performance optimization of the convolution operator, batch normalization operator. -- CPU platform - - Support MobileNetV2 Re-Training: Re-train the network with different class number. - -### MindSpore Lite +## MindSpore Lite 0.7.0-beta Release Notes - Converter - Support third-party models, including TFLite/Caffe/ONNX. 
@@ -7409,350 +1515,3 @@ Thanks goes to these wonderful people: Alexey Shevlyakov, avakh, baihuawei, BowenK, buxue, caifubi, caojian05, Cathy Wong, changzherui, chenfei, chengxianbin, chenhaozhe, chenjianping, chentingting, chenzomi, chujinjin, Danish Farid, dayschan, dengwentao, dinghao, etone-chan, fangzehua, fary86, geekun, Giancarlo Colmenares, gong chen, gukecai, guohongzilong, hangangqiang, heleiwang, hesham, He Wei, hexia, hongxing, huangdongrun, huanghui, islam_amin, Jamie Nisbet, Jesse Lee, jiangjinsheng, jiangzhiwen, jinyaohui, jjfeing, jojobugfree, Jonathan Yan, jonyguo, Junhan Hu, Kang, kingfo, kouzhenzhong, kpy, kswang, laiyongqiang, leopz, liangzelang, lichenever, lihongkang, Li Hongzhang, lilei, limingqi107, lirongzhen1, liubuyu, liuchongming74, liuwenhao4, liuxiao, Lixia Chen, liyanliu, liyong, lizhenyu, lvliang, Mahdi, Margaret_wangrui, meixiaowei, ms_yan, nhussain, ougongchang, panfengfeng, panyifeng, peilinwang, Peilin Wang, pkuliuliu, qianlong, rick_sanchez, shibeiji, Shida He, shijianning, simson, sunsuodong, suteng, Tinazhang, Tron Zhang, unknown, VectorSL, wandongdong, wangcong, wangdongxu, wangdongxu6, wanghua, wangnan39, Wei Luning, wenchunjiang, wenkai, wilfChen, WilliamLian, wukesong, Xian Weizhao, Xiaoda Zhang, xiefangqi, xulei2020, xunxue, xutianchun, Yang, yanghaitao, yanghaitao1, yanghaoran, yangjie, yangjie159, YangLuo, Yanjun Peng, yankai, yanzhenxiang2020, yao_yf, Yi Huaijie, yoonlee666, yuchaojie, yujianfeng, zhangzhongpeng, zhangdengcheng, Zhang Qinghua, zhangyinxia, zhangz0911gm, zhaojichen, zhaoting, zhaozhenlong, zhoufeng, zhouneng, zhousiyi, Zirui Wu, Ziyan, zjun, ZPaC, lihongzhang, wangdongxu Contributions of any kind are welcome! - -# MindSpore 0.5.2-beta Release Notes - -## Major Features and Improvements - -### Ascend Training and Inference Framework - -- New models - - DenseNet121: a convolution based neural network for the task of image classification on ImageNet 2012 dataset. 
- -## Bugfixes - -- Models - - VGG16,Alexnet,GoogleNet,optimize network for better performance. ([!5539](https://gitee.com/mindspore/mindspore/pulls/5539)) - - YOLOV3, fix yolov3_darknet53 dataset bug. ([!5658](https://gitee.com/mindspore/mindspore/pulls/5658)) - -## Contributors - -Thanks goes to these wonderful people: - -Alexey Shevlyakov, avakh, baihuawei, BowenK, buxue, caifubi, caojian05, Cathy Wong, changzherui, chenfei, chengxianbin, chenhaozhe, chenjianping, chentingting, chenzomi, chujinjin, Danish Farid, dayschan, dengwentao, dinghao, etone-chan, fangzehua, fary86, geekun, Giancarlo Colmenares, gong chen, gukecai, guohongzilong, hangangqiang, heleiwang, hesham, He Wei, hexia, hongxing, huangdongrun, huanghui, islam_amin, Jamie Nisbet, Jesse Lee, jiangjinsheng, jiangzhiwen, jinyaohui, jjfeing, jojobugfree, Jonathan Yan, jonyguo, Junhan Hu, Kang, kingfo, kouzhenzhong, kpy, kswang, laiyongqiang, leopz, liangzelang, lichenever, lihongkang, Li Hongzhang, lilei, limingqi107, lirongzhen1, liubuyu, liuchongming74, liuwenhao4, liuxiao, Lixia Chen, liyanliu, liyong, lizhenyu, lvliang, Mahdi, Margaret_wangrui, meixiaowei, ms_yan, nhussain, ougongchang, panfengfeng, panyifeng, peilinwang, Peilin Wang, pkuliuliu, qianlong, rick_sanchez, shibeiji, Shida He, shijianning, simson, sunsuodong, suteng, Tinazhang, Tron Zhang, unknown, VectorSL, wandongdong, wangcong, wangdongxu, wangdongxu6, wanghua, wangnan39, Wei Luning, wenchunjiang, wenkai, wilfChen, WilliamLian, wukesong, Xian Weizhao, Xiaoda Zhang, xiefangqi, xulei2020, xunxue, xutianchun, Yang, yanghaitao, yanghaitao1, yanghaoran, yangjie, yangjie159, YangLuo, Yanjun Peng, yankai, yanzhenxiang2020, yao_yf, Yi Huaijie, yoonlee666, yuchaojie, yujianfeng, zhangzhongpeng, zhangdengcheng, Zhang Qinghua, zhangyinxia, zhangz0911gm, zhaojichen, zhaoting, zhaozhenlong, zhoufeng, zhouneng, zhousiyi, Zirui Wu, Ziyan, zjun, ZPaC, lihongzhang, wangdongxu - -Contributions of any kind are welcome! 
- -# MindSpore 0.5.0-beta Release Notes - -## Major Features and Improvements - -### Ascend Training and Inference Framework - -- New models - - ResNext50: a simple, highly modularized network architecture using aggregated resdiual transformations for image classification on ImageNet 2012 dataset. - - MASS: a pre-training method for sequence to sequence based language generation tasks on Text Summarization and Conversational Response Generation using News Crawls 2007-2017 dataset, Gigaword corpus and Cornell movie dialog corpus. - - Transformer: a neural network architecture for language understanding on WMT 2014 English-German dataset. - - GCN: Graph Convolutional Networks for the task of classification of nodes in a graph on Cora and Citeseer datasets. - - GAT: an attention-based graph neural network for node classification on Cora and CiteSeer dataset. -- Frontend and user interface - - Support tensor value and assignment of mixed tensor index in graph mode. - - Support tensor comparison, len operator, constexpr syntax, value and assignment of tensor index in pynative mode. - - Support converting MindSpore IR to pb format for infer model. - - Support print operator to write data directly on the hard disk. - - Add the double recursive programming solution for very high speed parallel strategy search in automatic parallel. 
- - User interfaces change log - - Allow the learning rate of AdamWeightDecayDynamicLR and Lamb to be 0([!1826](https://gitee.com/mindspore/mindspore/pulls/1826)) - - Restricting the entire network input parameter is Tensor([!1967](https://gitee.com/mindspore/mindspore/pulls/1967)) - - Turn shape and dtype into attributes instead of interfaces([!1919](https://gitee.com/mindspore/mindspore/pulls/1919)) - - Delete multitypefungraph([!2116](https://gitee.com/mindspore/mindspore/pulls/2116)) - - Refactor the callback module in an encapsulated way, use _CallbackManager instead of_build_callbacks([!2236](https://gitee.com/mindspore/mindspore/pulls/2236)) - - Delete EmbeddingLookup([!2163](https://gitee.com/mindspore/mindspore/pulls/2163)) - - Checkpoint add model_type([!2517](https://gitee.com/mindspore/mindspore/pulls/2517)) -- Executor and performance optimization - - Heterogeneous execution on CPU and Ascend devices supported, and is verified in Wide&Deep model. - - Quantitative training of MobileNetV2, Lenet and Resnet50 on Ascend are supported. - - Support new fusion architecture, which can do fusion optimization across graphs and kernels to improve execution speed. -- Data processing, augmentation, and save format - - Support data processing pipeline performance profiling. - - Support public dataset loading, such as CLUE and Coco. - - Support more text processing, such as more tokenizers and vocab data. - - Support MindRecord padded data. - -### Other Hardware Support - -- GPU platform - - New model supported: Bert / Wide&Deep. - - Support setting max device memory. -- CPU platform - - New model supported: LSTM. - -## Bugfixes - -- Models - - Bert, Move Bert from `example` to `model_zoo`, optimize network for better performance. ([!1902](https://gitee.com/mindspore/mindspore/pulls/1902)) - - VGG16, Move VGG16 from `example` to `model_zoo`, optimize network for better accuracy. 
([!2645](https://gitee.com/mindspore/mindspore/pulls/2645)) - - Alexnet, modify parameter setting to improve accuracy ([!1364](https://gitee.com/mindspore/mindspore/pulls/2370)) - - Wide&Deep, Move Wide&Deep from `example` to `model_zoo`, optimize network for better performance. ([!2221](https://gitee.com/mindspore/mindspore/pulls/2221)) -- Python API - - Fix bug in auto cast([!1766](https://gitee.com/mindspore/mindspore/pulls/1766)) - - Fix bug of register_backward_hook([!2148](https://gitee.com/mindspore/mindspore/pulls/2148)) - - Fix bug of tuple args in pynative mode([!1878](https://gitee.com/mindspore/mindspore/pulls/1878)) - - Fix bug of checking numbers of arguments and graph parameters([!1701](https://gitee.com/mindspore/mindspore/pulls/1701)) -- Executor - - Fix bug of loading input data repeatedly in pynative mode([!1966](https://gitee.com/mindspore/mindspore/pulls/1966)) - - Fix bug of list cannot be used as input in pynative mode([!1765](https://gitee.com/mindspore/mindspore/pulls/1765)) - - Fix bug of kernel select ([!2103](https://gitee.com/mindspore/mindspore/pulls/2103)) - - Fix bug of pattern matching for batchnorm fusion in the case of auto mix precision.([!1851](https://gitee.com/mindspore/mindspore/pulls/1851)) - - Fix bug of generate hccl's kernel info.([!2393](https://gitee.com/mindspore/mindspore/pulls/2393)) -- GPU platform - - Fix bug of summary feature invalid([!2173](https://gitee.com/mindspore/mindspore/pulls/2173)) -- Data processing - - Fix bug of Cifar dataset reading([!2096](https://gitee.com/mindspore/mindspore/pulls/2096)) - - Fix bug of C++ behavior in RandomCropAndResize([!2026](https://gitee.com/mindspore/mindspore/pulls/2026)) - - Fix the bug of mindrecord shuffle([!2420](https://gitee.com/mindspore/mindspore/pulls/2420)) -- Third party - - Sqlite : Update sqlite to 3.32.2 to handle [CVE-2020-11656](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-11656), 
[CVE-2020-13871](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-13871), [CVE-2020-11655](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-11655), [CVE-2020-9327](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-9327), [CVE-2020-13630](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-13630), [CVE-2020-15358](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-15358), [CVE-2020-13631](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-13631), [CVE-2020-13632](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-13632), [CVE-2020-13434](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-13434), [CVE-2020-13435](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-13435), and [CVE-2020-15358](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-11655). - -## Contributors - -Thanks goes to these wonderful people: - -Alexey Shevlyakov, avakh, baihuawei, BowenK, buxue, caifubi, caojian05, Cathy Wong, changzherui, chenfei, chengxianbin, chenhaozhe, chenjianping, chentingting, chenzomi, chujinjin, Danish Farid, dayschan, dengwentao, dinghao, etone-chan, fangzehua, fary86, geekun, Giancarlo Colmenares, gong chen, gukecai, guohongzilong, hangangqiang, heleiwang, hesham, He Wei, hexia, hongxing, huangdongrun, huanghui, islam_amin, Jamie Nisbet, Jesse Lee, jiangjinsheng, jiangzhiwen, jinyaohui, jjfeing, jojobugfree, Jonathan Yan, jonyguo, Junhan Hu, Kang, kingfo, kouzhenzhong, kpy, kswang, laiyongqiang, leopz, liangzelang, lichenever, lihongkang, Li Hongzhang, lilei, limingqi107, lirongzhen1, liubuyu, liuchongming74, liuwenhao4, liuxiao, Lixia Chen, liyanliu, liyong, lizhenyu, lvliang, Mahdi, Margaret_wangrui, meixiaowei, ms_yan, nhussain, ougongchang, panfengfeng, panyifeng, peilinwang, Peilin Wang, pkuliuliu, qianlong, rick_sanchez, shibeiji, Shida He, shijianning, simson, sunsuodong, suteng, Tinazhang, Tron Zhang, unknown, VectorSL, wandongdong, wangcong, wangdongxu, wangdongxu6, wanghua, wangnan39, Wei Luning, 
wenchunjiang, wenkai, wilfChen, WilliamLian, wukesong, Xian Weizhao, Xiaoda Zhang, xiefangqi, xulei2020, xunxue, xutianchun, Yang, yanghaitao, yanghaitao1, yanghaoran, yangjie, yangjie159, YangLuo, Yanjun Peng, yankai, yanzhenxiang2020, yao_yf, Yi Huaijie, yoonlee666, yuchaojie, yujianfeng, zhangzhongpeng, zhangdengcheng, Zhang Qinghua, zhangyinxia, zhangz0911gm, zhaojichen, zhaoting, zhaozhenlong, zhoufeng, zhouneng, zhousiyi, Zirui Wu, Ziyan, zjun, ZPaC, lihongzhang, wangdongxu - -Contributions of any kind are welcome! - -# MindSpore 0.3.1-alpha Release Notes - -## Major Features and Improvements - -### Ascend Training and Inference Framework - -- Frontend and User Interface - - Independent model init interface. -- Data processing, augmentation, and save format - - Support sample padding for minddataset. - -## Bugfixes - -- Python API - - Fix bugs in the lars optimizer([!1894](https://gitee.com/mindspore/mindspore/pulls/1894)) -- Data processing - - Fix accuracy problem of RandomCropDecodeResize ([!2340](https://gitee.com/mindspore/mindspore/pulls/2340)) - -# Release 0.3.0-alpha - -## Major Features and Improvements - -### Ascend Training and Inference Framework - -- New models - - DeepFM: a factorization-machine based neural network for CTR prediction on Criteo dataset. - - DeepLabV3: significantly improves over our previous DeepLab versions without DenseCRF post-processing and attains comparable performance with other state-of-art models on the PASCAL VOC 2007 semantic image segmentation benchmark. - - Faster-RCNN: towards real-time object detection with region proposal networks on COCO 2017 dataset. - - SSD: a single stage object detection methods on COCO 2017 dataset. - - GoogLeNet: a deep convolutional neural network architecture codenamed Inception V1 for classification and detection on CIFAR-10 dataset. - - Wide&Deep: jointly trained wide linear models and deep neural networks for recommender systems on Criteo dataset. 
-- Frontend and User Interface - - Complete numpy advanced indexing method. Supports value and assignment through tensor index. - - Some optimizers support separating parameter groups. Different parameter groups can set different `learning_rate` and `weight_decay`. - - Support setting submodule's logging level independently, e.g. you can set logging level of module `A` to warning and set logging level of module `B` to info. - - Support weights to be compiled according to shape to solve the problem of large memory overhead. - - Add some operators implement and grammar support in pynative mode. To be consistent with graph mode. - - User interfaces change log - - Learning rate and weight decay making group params([!637](https://gitee.com/mindspore/mindspore/pulls/637)) - - Support weights to be compiled according to shape([!1015](https://gitee.com/mindspore/mindspore/pulls/1015)) - - delete some context param([!1100](https://gitee.com/mindspore/mindspore/pulls/1100)) - - ImageSummary/ScalarSummary/TensorSummary/HistogramSummary([!1329](https://gitee.com/mindspore/mindspore/pulls/1329))([!1425](https://gitee.com/mindspore/mindspore/pulls/1425)) -- Executor and Performance Optimization - - Support doing evaluation while in training process, so that the accuracy of training can be easily obtained. - - Enable second-order optimization for resnet50, which can achieve 75.9% accuracy in 45 epochs (Resnet50 @ImageNet). - - Optimize pynative implementation and improve it's execution performance. - - Optimize summary record implementation and improve its performance. -- Data processing, augmentation, and save format - - Support simple text processing, such as tokenizer/buildvocab/lookup. - - Support padding batch. - - Support split or concat dataset. - - Support MindDataset reading from file list. - -### Other Hardware Support - -- GPU platform - - New models supported: MobileNetV2, MobileNetV3. - - Support mixed precision training. - - Support device memory swapping. 
- -## Bugfixes - -- Python API - - An exception to the broadcast input data type check([!712](https://gitee.com/mindspore/mindspore/pulls/712)) - - Fix issues assignsub return value 0([!1036](https://gitee.com/mindspore/mindspore/pulls/1036)) - - Fix issue Conv2dBackpropInput bprop should return 3 instead of 2 items([!1001](https://gitee.com/mindspore/mindspore/pulls/1001)) - - Fix sens shape error of TrainOneStepWithLossScaleCell([!1050](https://gitee.com/mindspore/mindspore/pulls/1050)) - - Fix BatchNormGrad operator([!1344](https://gitee.com/mindspore/mindspore/pulls/1344)) -- Executor - - Fix dropout,topK and addn errors in PyNative mode ([!1285](https://gitee.com/mindspore/mindspore/pulls/1285), [!1138](https://gitee.com/mindspore/mindspore/pulls/1138), [!1033](https://gitee.com/mindspore/mindspore/pulls/1033)). - - Fix memory leaks after execution in PyNatvie mode ([!1201](https://gitee.com/mindspore/mindspore/pulls/1201)). - - Fix HCCL failure in some special scenes ([!1204](https://gitee.com/mindspore/mindspore/pulls/1204), [!1252](https://gitee.com/mindspore/mindspore/pulls/1252)). - - Fix SSD network when Select failed, can't find kernel info([!1449](https://gitee.com/mindspore/mindspore/pulls/1449)). - - Fix Topk operator selection strategy bug between aicore and aicpu([!1367](https://gitee.com/mindspore/mindspore/pulls/1367)). - - Fix input memory size of 'assign' op unequal in control sink mode when assigning a data from one child graph to another child graph([!802](https://gitee.com/mindspore/mindspore/pulls/802)). - - Fix allreduce ir inconsistency([!989](https://gitee.com/mindspore/mindspore/pulls/989)). 
-- GPU platform - - Fix summary for gradient collection ([!1364](https://gitee.com/mindspore/mindspore/pulls/1364)) - - Fix the slice operator ([!1489](https://gitee.com/mindspore/mindspore/pulls/1489)) -- Data processing - - Fix memory problems of GeneratorDataset of sub-process ([!907](https://gitee.com/mindspore/mindspore/pulls/907)) - - Fix getting data timeout when training the cifar10 dataset under the lenet([!1391](https://gitee.com/mindspore/mindspore/pulls/1391)) - -## Contributors - -Thanks goes to these wonderful people: - -Alexey Shevlyakov, Amir Lashkari, anthony, baihuawei, biffex, buxue, caifubi, candanzg, caojian05, Cathy Wong, changzherui, chenfei, chengxianbin, chenhaozhe, chenzomi, chujinjin, cristoval, dengwentao, eric, etone-chan, fary86, gaojing, gengdongjie, gongchen, guohongzilong, guozhijian, heleiwang, hesham, He Wei, Hoai Linh Tran, hongxing, huangdongrun, huanghui, Jamie Nisbet, Jesse Lee, jiangjinsheng, jiangzhiwen, jinyaohui, jjfeing, jonwe, jonyguo, Junhan Hu, Kang, kingfo, kswang, laiyongqiang, leopz, lichenever, lihongkang, limingqi107, liubuyu, liuliyan2, liuwenhao4, liuxiao, liuxiao, liyong, lizhenyu, lvliang, Margaret_wangrui, meixiaowei, ms_yan, Nat Sutyanyong, ougongchang, panfengfeng, panyifeng, Peilin Wang, peixu_ren, qianlong, rick_sanchez, seatea, sheng, shijianning, simson, sunsuodong, Tinazhang, VectorSL, wandongdong, wangcong, wanghua, wangnan39, Wei Luning, wenchunjiang, wilfChen, WilliamLian, wsc, wukesong, wuxuejian, Xiaoda Zhang, xiefangqi, xulei2020, Yang, yangjie159, yangruoqi713, yangyongjie, yangzhenzhang, Yanjun Peng, yanzhenxiang2020, yao_yf, Yi Huaijie, yoonlee666, yujianfeng, YuJianfeng, yvetteliu, zhangdengcheng, Zhang Qinghua, zhangz0911gm, zhaojichen, zhaoting, zhaozhenlong, zhoufeng, zhouneng, zhousiyi, zhouyuanshen, Zirui Wu, Ziyan, zjun, ZPaC, lihongzhang - -Contributions of any kind are welcome! 
- -# MindSpore 0.2.0-alpha Release Notes - -## Major Features and Improvements - -### Ascend Training and Inference Framework - -- New models - - MobileNetV2: Inverted Residuals and Linear Bottlenecks. - - ResNet101: Deep Residual Learning for Image Recognition. - -- Frontend and User Interface - - Support for all python comparison operators. - - Support for math operators **,//,%. Support for other python operators like and/or/not/is/is not/ in/ not in. - - Support for the gradients of function with variable arguments. - - Support for tensor indexing assignment for certain indexing type. - - Support for dynamic learning rate. - - User interfaces change log - - DepthwiseConv2dNative, DepthwiseConv2dNativeBackpropFilter, DepthwiseConv2dNativeBackpropInput([!424](https://gitee.com/mindspore/mindspore/pulls/424)) - - ReLU6, ReLU6Grad([!224](https://gitee.com/mindspore/mindspore/pulls/224)) - - GeneratorDataset([!183](https://gitee.com/mindspore/mindspore/pulls/183)) - - VOCDataset([!477](https://gitee.com/mindspore/mindspore/pulls/477)) - - MindDataset, PKSampler([!514](https://gitee.com/mindspore/mindspore/pulls/514)) - - map([!506](https://gitee.com/mindspore/mindspore/pulls/506)) - - Conv([!226](https://gitee.com/mindspore/mindspore/pulls/226)) - - Adam([!253](https://gitee.com/mindspore/mindspore/pulls/253)) - - _set_fusion_strategy_by_idx,_set_fusion_strategy_by_size([!189](https://gitee.com/mindspore/mindspore/pulls/189)) - - CheckpointConfig([!122](https://gitee.com/mindspore/mindspore/pulls/122)) - - Constant([!54](https://gitee.com/mindspore/mindspore/pulls/54)) -- Executor and Performance Optimization - - Support parallel execution of data prefetching and forward/backward computing. - - Support parallel execution of gradient aggregation and forward/backward computing in distributed training scenarios. - - Support operator fusion optimization. - - Optimize compilation process and improve the performance. 
-- Data processing, augmentation, and save format - - Support multi-process of GeneratorDataset/PyFunc for high performance - - Support variable batchsize - - Support new Dataset operators, such as filter,skip,take,TextLineDataset - -### Other Hardware Support - -- GPU platform - - Use dynamic memory pool by default on GPU. - - Support parallel execution of computation and communication. - - Support continuous address allocation by memory pool. -- CPU platform - - Support for windows 10 OS. - -## Bugfixes - -- Models - - Fix mixed precision bug for VGG16 model ([!629](https://gitee.com/mindspore/mindspore/pulls/629)). -- Python API - - Fix ControlDepend operator bugs on CPU and GPU ([!396](https://gitee.com/mindspore/mindspore/pulls/396)). - - Fix ArgMinWithValue operator bugs ([!338](https://gitee.com/mindspore/mindspore/pulls/338)). - - Fix Dense operator bugs on PyNative mode ([!276](https://gitee.com/mindspore/mindspore/pulls/276)). - - Fix MatMul operator bugs on PyNative mode ([!288](https://gitee.com/mindspore/mindspore/pulls/288)). -- Executor - - Fix operator selection bugs and make it general ([!300](https://gitee.com/mindspore/mindspore/pulls/300)). - - Fix memory reuse bug for GetNext op ([!291](https://gitee.com/mindspore/mindspore/pulls/291)). -- GPU platform - - Fix memory allocation in multi-graph scenarios ([!444](https://gitee.com/mindspore/mindspore/pulls/444)). - - Fix bias_add_grad under fp16 precision ([!598](https://gitee.com/mindspore/mindspore/pulls/598)). - - Fix support for fp16 kernels on nvidia 1080Ti([!571](https://gitee.com/mindspore/mindspore/pulls/571)). - - Fix parsing of tuple type parameters ([!316](https://gitee.com/mindspore/mindspore/pulls/316)). -- Data processing - - Fix TypeErrors about can't pickle mindspore._c_dataengine.DEPipeline objects([!434](https://gitee.com/mindspore/mindspore/pulls/434)). - - Add TFRecord file verification([!406](https://gitee.com/mindspore/mindspore/pulls/406)). 
- -## Contributors - -Thanks goes to these wonderful people: - -Alexey_Shevlyakov, Cathy, Chong, Hoai, Jonathan, Junhan, JunhanHu, Peilin, SanjayChan, StrawNoBerry, VectorSL, Wei, WeibiaoYu, Xiaoda, Yanjun, YuJianfeng, ZPaC, Zhang, ZhangQinghua, ZiruiWu, amongo, anthonyaje, anzhengqi, biffex, caifubi, candanzg, caojian05, casgj, cathwong, ch-l, chang, changzherui, chenfei, chengang, chenhaozhe, chenjianping, chentingting, chenzomi, chujinjin, dengwentao, dinghao, fanglei, fary86, flywind, gaojing, geekun, gengdongjie, ghzl, gong, gongchen, gukecai, guohongzilong, guozhijian, gziyan, h.farahat, hesham, huangdongrun, huanghui, jiangzhiwen, jinyaohui, jjfeing, jojobugfree, jonathan_yan, jonyguo, jzw, kingfo, kisnwang, laiyongqiang, leonwanghui, lianliguang, lichen, lichenever, limingqi107, liubuyu, liuxiao, liyong, liyong126, lizhenyu, lupengcheng, lvliang, maoweiyong, ms_yan, mxm, ougongchang, panfengfeng, panyifeng, pengyanjun, penn, qianlong, seatea, simson, suteng, thlinh, vlne-v1, wangchengke, wanghua, wangnan39, wangqiuliang, wenchunjiang, wenkai, wukesong, xiefangqi, xulei, yanghaitao, yanghaoran, yangjie159, yangzhenzhang, yankai10, yanzhenxiang2020, yao_yf, yoonlee666, zhangbuxue, zhangz0911gm, zhangzheng, zhaojichen, zhaoting, zhaozhenlong, zhongligeng, zhoufeng, zhousiyi, zjun, zyli2020, yuhuijun, limingqi107, lizhenyu, chenweifeng. - -Contributions of any kind are welcome! - -# MindSpore 0.1.0-alpha Release Notes - -## Main Features - -### Ascend Training and Inference Framework - -- Recommended OS: Ubuntu 16.04 (or later) or EulerOS 2.5 or EulerOS 2.8 -- Python version: 3.7.5 -- Preset models - - ResNet-50: residual structure-based convolutional neural network (CNN) for image classification, which is widely used. - - AlexNet: classic CNN for image classification, achieving historical results in ImageNet LSVRC-2012. - - LeNet: classic CNN for image classification, which was proposed by Yann LeCun. 
- - VGG16: classic CNN for image classification, which was proposed by Oxford Visual Geometry Group. - - YoloV3: real-time object detection network. - - NEZHA: BERT-based Chinese pre-training network produced by Huawei Noah's Ark Laboratory. -- Execution modes - - Graph mode: provides graph optimization methods such as memory overcommitment, IR fusion, and buffer fusion to achieve optimal execution performance. - - PyNative mode: single-step execution mode, facilitating process debugging. -- Debugging capability and methods - - Save CheckPoints and Summary data during training. - - Support asynchronous printing. - - Dump the computing data. - - Support profiling analysis of the execution process performance. -- Distributed execution - - Support AllReduce, AllGather, and BroadCast collective communication. - - AllReduce data parallel: Each device obtains different training data, which accelerates the overall training process. - - Collective communication-based layerwise parallel: Models are divided and allocated to different devices to solve the problem of insufficient memory for large model processing and improve the training speed. - - Automatic parallel mode: The better data and model parallel mode can be predicted based on the cost model. It is recommended that this mode be used on ResNet series networks. -- Automatic differentiation - - Implement automatic differentiation based on Source to Source. - - Support distributed scenarios and automatic insertion of reverse communication operators. -- Data processing, augmentation, and save format - - Load common datasets such as ImageNet, MNIST, CIFAR-10, and CIFAR-100. - - Support common data loading pipeline operations, such as shuffle, repeat, batch, map, and sampler. - - Provide basic operator libraries to cover common CV scenarios. - - Support users to customize Python data augmentation operators through the Pyfunc mechanism. - - Support the access of user-defined datasets through the GeneratorDataset mechanism. 
- - Provide the MindSpore data format, data aggregation and storage, random access example, data partition, efficient parallel read, user-defined index, and dataset search. - - Convert user datasets to the MindSpore data format. - - After data processing and augmentation, provide training applications in feed and graph modes. -- FP32/16 mixed precision computation, supporting automatic and manual configuration -- Provide common operators such as nn, math, and array, which can be customized. - -### Inference Deployment - -- Deploy models in MindSpore format on the Ascend 310 platform for inference. -- Save models in ONNX format. -- Support saving models in LITE format and running models based on the lightweight inference framework. - - Recommended OS: Android 4.3 or later - - Supported network type: LeNet - - Provide the generalization operators generated by TVM and operators generated after specific networks are tuned. - -### Other Hardware Support - -- GPU platform training - - Recommended OS: Ubuntu 16.04 - - CUDA version: 9.2 or 10.1 - - CuDNN version: 7.6 or later - - Python version: 3.7.5 - - NCCL version: 2.4.8-1 - - OpenMPI version: 3.1.5 - - Supported models: AlexNet, LeNet, and LSTM - - Supported datasets: MNIST and CIFAR-10 - - Support data parallel. -- CPU platform training - - Recommended OS: Ubuntu 16.04 - - Python version: 3.7.5 - - Supported model: LeNet - - Supported dataset: MNIST - - Provide only the stand-alone operation version. 
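 - -## Peripherals and Tools - -- [MindSpore Official Website](https://www.mindspore.cn/) -- [MindInsight Visualization Debugging and Optimization](https://gitee.com/mindspore/mindinsight) -- [MindArmour Model Security Hardening Package](https://gitee.com/mindspore/mindarmour) -- [GraphEngine Computational Graph Engine](https://gitee.com/mindspore/graphengine) diff --git a/RELEASE_CN.md b/RELEASE_CN.md index 27db24fe9e98249f2b645cff5e81d334787c2651..75861599ab1839c2e5447e8fa61736251a5235b9 100644 --- a/RELEASE_CN.md +++ b/RELEASE_CN.md @@ -1,4361 +1,165 @@ -# MindSpore Release Notes +# MindSpore Lite Release Notes [View English](./RELEASE.md) -## MindSpore 2.6.0 Release Notes - -### Major Features and Improvements - -#### Dataset - -- [STABLE] The sharded sampling behavior of the [MindDataset](https://www.mindspore.cn/docs/zh-CN/master/api_python/dataset/mindspore.dataset.MindDataset.html) interface has changed from block-based sampling (strategy 2 of the data sharding strategies in the link) to interval sampling (strategy 1 of the data sharding strategies in the link). Users can set the MS_DEV_MINDRECORD_SHARD_BY_BLOCK environment variable to control whether to switch back to block-based sampling. -- [STABLE] GeneratorDataset supports starting multiple processes with the spawn method, and supports using Ascend-backend data augmentation methods in multi-process mode. Users can call [mindspore.dataset.config.set_multiprocessing_start_method("spawn")](https://www.mindspore.cn/docs/zh-CN/master/api_python/dataset/mindspore.dataset.config.set_multiprocessing_start_method.html) to start multiple processes with spawn. -- [STABLE] The `shuffle` parameter of [MindDataset](https://www.mindspore.cn/docs/zh-CN/master/api_python/dataset/mindspore.dataset.MindDataset.html) adds the `Shuffle.ADAPTIVE` behavior, which adaptively adjusts the number of shuffled samples based on the sample count to reduce training memory overhead and the risk of OOM. To force a global shuffle, specify `Shuffle.GLOBAL`; users must make sure the machine has enough memory. - -#### Ascend - -- [STABLE] In dynamic graph mode, the [ops.Custom](https://www.mindspore.cn/docs/zh-CN/master/api_python/ops/mindspore.ops.Custom.html) primitive supports Ascend C custom operators with multiple output types, and `ops.Custom` supports C++-side infer type. -- [BETA] In dynamic graph mode, CustomOpBuilder is added to support online compilation and loading of custom operators. -- [STABLE] With the O1 compilation option, users can control the scope of graph kernel fusion optimization. The enable_fusion_pattern_only/disable_fusion_pattern options of the MS_DEV_GRAPH_KERNEL_FLAGS environment variable turn the corresponding fusion patterns on or off, and the configuration can also be read from a file via --path=example.json. -- [STABLE] Support setting the aclop operator cache aging configuration and the error reporting mode configuration through the 
[mindspore.device_context.ascend.op_debug.aclinit_config](https://www.mindspore.cn/docs/zh-CN/master/api_python/device_context/mindspore.device_context.ascend.op_debug.aclinit_config.html) interface. -- [STABLE] The GE backend only supports whole-graph sinking and lazy inline subgraph sinking; other scenarios are no longer supported. -- [BETA] In static graph O0/O1 mode, the `mindspore.nn.Cell` base class adds the offload interface and the backward_prefetch interface attribute. Users can call [Cell.offload(backward_prefetch)](https://www.mindspore.cn/docs/zh-CN/master/api_python/nn/mindspore.nn.Cell.html#mindspore.nn.Cell.offload) to offload the activations within a specific `Cell` from the device to the host during the forward phase of training, and to prefetch them from the host back to the device ahead of time during the backward phase. - -#### Parallel - -- [STABLE] Distributed pdb debugging supports both dynamic and static graphs; dynamic graph is recommended. -- [STABLE] Add the [mindspore.communication.get_comm_name](https://www.mindspore.cn/docs/zh-CN/master/api_python/communication/mindspore.communication.get_comm_name.html) interface, through which users can query the name of the underlying communicator of the HCCL collective communication library. -- [STABLE] Add the [AutoParallel](https://www.mindspore.cn/docs/zh-CN/master/api_python/parallel/mindspore.parallel.auto_parallel.AutoParallel.html) interface, which supports parallel configuration for a single network and solves the problem of an overly broad parallel configuration scope. -- [STABLE] seqpipe adds two scheduling methods, seqvpp and seqsmartvpp, which significantly reduce device memory overhead when seqpipe is combined with vpp. -- [STABLE] In static graph mode, support zero2/zero3-level memory optimization to reduce device memory overhead for models that require pure dp training. -- [STABLE] In static graph mode, support 1b1f communication overlap under pipeline parallelism to improve pipeline parallel performance. -- [STABLE] In static graph mode, support backward communication overlap under tensor model parallelism and expert model parallelism to improve model training performance. -- [STABLE] In static graph mode, the automatic parallel strategy propagation mode is updated to prioritize propagating operators' Layout strategies, improving the accuracy of strategy propagation. -- [STABLE] In static graph mode, automatic parallel supports configuring strategies for mint operators through the [mindspore.parallel.shard](https://www.mindspore.cn/docs/zh-CN/master/api_python/parallel/mindspore.parallel.shard.html) interface, and the strategies for multi-input operators are optimized. -- [STABLE] In reinforcement learning scenarios, support online rearrangement of training and inference weights under DP/MP/PP multi-dimensional hybrid parallelism. -- [STABLE] Support querying whether the distributed module is available and whether the communication module is initialized: use the [mint.distributed.is_available](https://www.mindspore.cn/docs/zh-CN/master/api_python/mint/mindspore.mint.distributed.is_available.html) interface to query whether the distributed module is available, and the [mint.distributed.is_initialized](https://www.mindspore.cn/docs/zh-CN/master/api_python/mint/mindspore.mint.distributed.is_initialized.html) interface to query whether the communication module is initialized. -- [STABLE] In static graph mode, support the forward and backward `AlltoAllV` operators; users can use the operator through the 
[ops.AlltoAllV](https://www.mindspore.cn/docs/zh-CN/master/api_python/ops/mindspore.ops.AlltoAllV.html) interface. -- [STABLE] Support the CPU communication interfaces [mindspore.mint.distributed.allreduce](https://www.mindspore.cn/docs/zh-CN/master/api_python/mint/mindspore.mint.distributed.all_reduce.html#mindspore.mint.distributed.all_reduce), [mindspore.mint.distributed.barrier](https://www.mindspore.cn/docs/zh-CN/master/api_python/mint/mindspore.mint.distributed.barrier.html#mindspore.mint.distributed.barrier), [mindspore.mint.distributed.send](https://www.mindspore.cn/docs/zh-CN/master/api_python/mint/mindspore.mint.distributed.send.html#mindspore.mint.distributed.send), and [mindspore.mint.distributed.recv](https://www.mindspore.cn/docs/zh-CN/master/api_python/mint/mindspore.mint.distributed.recv.html#mindspore.mint.distributed.recv); users can use the corresponding collective communication operators through these interfaces. - -#### Inference - -- [STABLE] Support BFloat16 full-precision inference and W8A8 quantized inference for the DeepSeek-V3/R1 large models, and develop or optimize 12 fused operators, including RmsNormQuant, MatMul+Sigmoid+Add, and Transpose+BatchMatMul+Transpose, to improve their inference performance. -- [BETA] Support service deployment of DeepSeek-V3/R1 using MindIE and the MindSpore Transformers large model suite. -- [STABLE] Optimize the safetensors loading process when deploying inference services with MindIE and the MindSpore Transformers large model suite, and implement on-demand GE initialization, reducing memory usage and startup time respectively. -- [BETA] Support service deployment of the DeepSeek-V3/R1 and Qwen2.5 large models using the [vLLM-MindSpore](https://gitee.com/mindspore/vllm-mindspore) plugin and vLLM v0.6.6.post1. - -#### profiler - -- [STABLE] Support obtaining the parallel strategy information of communication domains, with visualized display of that information, improving the efficiency of performance troubleshooting in cluster scenarios. -- [STABLE] Dynamic profiling supports lightweight instrumentation; users can enable it dynamically and view performance data in real time. -- [STABLE] Profiler lightweight instrumentation is enhanced, with support for lightweight instrumentation information in key phases such as dataloader and save checkpoint. -- [STABLE] Profiler supports viewing memory_access-related aicore metric information. -- [STABLE] Profiler supports [mindspore.profiler.profile](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore/mindspore.profiler.profile.html), [_ExperimentalConfig](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore/mindspore.profiler._ExperimentalConfig.html), and [tensorboard_trace_handler](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore/mindspore.profiler.tensorboard_trace_handler.html), improving tool usability. -- 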
[STABLE] 动态profiling支持内存采集,用户可动态开启内存数据采集,提升工具易用性。 - -#### Compiler - -- [BETA] 图模式支持inplace和view算子正向表达能力。 - -### API 变更 - -#### 新增API - -- [DEMO] [mindspore.mint](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore.mint.html) API新增了大量的functional、nn接口。mint接口当前是实验性接口,在图编译模式为O0/O1和PyNative模式下性能比ops更优。当前暂不支持O2编译模式(图下沉)及CPU、GPU后端,后续会逐步完善。 - - | mindspore.mint | - | :------------------------------ | - | mindspore.mint.reshape | - | mindspore.mint.triangular_solve | - | mindspore.mint.index_add | - | mindspore.mint.logaddexp2 | - | mindspore.mint.diag | - - | mindspore.mint.nn | - | :----------------------------- | - | mindspore.mint.nn.Sigmoid | - | mindspore.mint.nn.Conv2d | - | mindspore.mint.nn.PixelShuffle | - - | mindspore.mint.nn.functional | - | :----------------------------------------------- | - | mindspore.mint.nn.functional.adaptive_avg_pool3d | - | mindspore.mint.nn.functional.conv2d | - | mindspore.mint.nn.functional.avg_pool3d | - | mindspore.mint.nn.functional.elu_ | - | mindspore.mint.nn.functional.pixel_shuffle | - - | others | - | ------------------------ | - | mindspore.mint.optim.SGD | - | mindspore.mint.linalg.qr | - -- [STABLE] [mindspore.mint](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore.mint.html) API 也提供了一些新增的stable接口。此外, 一些demo的接口也转为了stable。 - - | mindspore.mint | - | :----------------------- | - | mindspore.mint.full_like | - | mindspore.mint.log2 | - | mindspore.mint.isneginf | - - | mindspore.mint.nn | - | :-------------------------- | - | mindspore.mint.nn.GLU | - | mindspore.mint.nn.KLDivLoss | - - | mindspore.mint.nn.functional | - | :---------------------------------- | - | mindspore.mint.nn.functional.glu | - | mindspore.mint.nn.functional.kl_div | - - | mindspore.Tensor | - | :------------------------ | - | mindspore.Tensor.isneginf | - | mindspore.Tensor.log2 | - -- [DEMO] [mindspore.Tensor](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore/mindspore.Tensor.html#mindspore.Tensor) 
API adds a large number of Tensor method interfaces. These interfaces are still experimental and do not yet support graph sink mode or the CPU and GPU backends; support will be improved over time. For details, see the [official API list](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore/mindspore.Tensor.html#mindspore.Tensor). -- [STABLE] The [mindspore.ops](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore.ops.html) API adds two inference operator interfaces, [mindspore.ops.moe_token_permute](https://www.mindspore.cn/docs/zh-CN/master/api_python/ops/mindspore.ops.moe_token_permute.html#mindspore.ops.moe_token_permute) and [mindspore.ops.moe_token_unpermute](https://www.mindspore.cn/docs/zh-CN/master/api_python/ops/mindspore.ops.moe_token_unpermute.html#mindspore.ops.moe_token_unpermute), which currently support only the Ascend backend. -- [STABLE] [mindspore.mint.nn.functional.gelu](https://www.mindspore.cn/docs/zh-CN/master/api_python/mint/mindspore.mint.nn.functional.gelu.html) and [mindspore.mint.nn.GELU](https://www.mindspore.cn/docs/zh-CN/master/api_python/mint/mindspore.mint.nn.GELU.html) now support the "approximate" argument. -- [STABLE] Added the offline analysis interface [mindspore.profiler.profiler.analyse](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore/mindspore.profiler.profiler.analyse.html). - -#### Incompatible Interface Changes - -- The `input` and `other` parameters of the [mindspore.ops.Xlogy](https://www.mindspore.cn/docs/zh-CN/master/api_python/ops/mindspore.ops.Xlogy.html) interface no longer support non-Tensor inputs. [(!81625)](https://gitee.com/mindspore/mindspore/pulls/81625) - - - - - - - - -
2.5.0 2.6.0
-  ops.Xlogy(input [Tensor, numbers.Number, bool],
-            other [Tensor, numbers.Number, bool])
-  
-
-  ops.Xlogy(input [Tensor],
-            other [Tensor])
-  
-
- -- In PyNative mode on the Ascend backend, the `&` operator no longer supports Tensor inputs of type uint32 or uint64, the `^` operator no longer supports Tensor inputs of type uint16, uint32, or uint64, and the `|` operator no longer supports Tensor inputs of type uint16, uint32, or uint64 in the `tensor | scalar` case. [(!81625)](https://gitee.com/mindspore/mindspore/pulls/81625) -- On the CPU and GPU backends, the `%` operator no longer supports Tensor inputs of type uint16, uint32, or uint64. [(!81625)](https://gitee.com/mindspore/mindspore/pulls/81625) -- The parameters of the [mindspore.jit](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore/mindspore.jit.html) interface have changed. [(!80248)](https://gitee.com/mindspore/mindspore/pulls/80248) - - The parameter `fn` is renamed to `function`. - - The parameters `mode`, `input_signature`, `hash_args`, `jit_config`, and `compile_once` are removed. - - The parameter `capture_mode` is added to set how the function is compiled into a MindSpore graph. - - - - - - - - - -
2.5.0 2.6.0
-  >>> import numpy as np
-  >>> from mindspore import Tensor, jit
-  >>>
-  >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>>
-  >>> @jit(mode="PIJit")
-  ... def tensor_add_with_dec(x, y):
-  ...     z = x + y
-  ...     return z
-  ...
-  >>> out = tensor_add_with_dec(x, y)
-  
-
-  >>> import numpy as np
-  >>> from mindspore import Tensor, jit
-  >>>
-  >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>>
-  >>> @jit(capture_mode="bytecode")
-  ... def tensor_add_with_dec(x, y):
-  ...     z = x + y
-  ...     return z
-  ...
-  >>> out = tensor_add_with_dec(x, y)
-  
-
- - The parameter `jit_level` is added to set the compilation optimization level. - - - - - - - - - -
2.5.0 2.6.0
-  >>> import numpy as np
-  >>> from mindspore import Tensor, jit, JitConfig
-  >>>
-  >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>>
-  >>> @jit(jit_config=JitConfig(jit_level="O0"))
-  ... def tensor_add_with_dec(x, y):
-  ...     z = x + y
-  ...     return z
-  ...
-  >>> out = tensor_add_with_dec(x, y)
-  
-
-  >>> import numpy as np
-  >>> from mindspore import Tensor, jit
-  >>>
-  >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>>
-  >>> @jit(jit_level="O0")
-  ... def tensor_add_with_dec(x, y):
-  ...     z = x + y
-  ...     return z
-  ...
-  >>> out = tensor_add_with_dec(x, y)
-  
-
- - The parameter `dynamic` is added to set whether dynamic shape compilation is required. - - - - - - - - - -
2.5.0 2.6.0
-  >>> import numpy as np
-  >>> from mindspore import Tensor, jit
-  >>>
-  >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>>
-  >>> @jit
-  ... def tensor_add_with_dec(x, y):
-  ...     z = x + y
-  ...     return z
-  ...
-  >>> out = tensor_add_with_dec(x, y)
-  
-
-  >>> import numpy as np
-  >>> from mindspore import Tensor, jit
-  >>>
-  >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>>
-  >>> @jit(dynamic=1)
-  ... def tensor_add_with_dec(x, y):
-  ...     z = x + y
-  ...     return z
-  ...
-  >>> out = tensor_add_with_dec(x, y)
-  
-
- - The parameter `fullgraph` is added to set whether the entire function is captured and compiled into a graph. - - - - - - - - - -
2.5.0 2.6.0
-  >>> import numpy as np
-  >>> from mindspore import Tensor, jit, JitConfig
-  >>>
-  >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>>
-  >>> @jit(jit_config=JitConfig(jit_syntax_level="STRICT"))
-  ... def tensor_add_with_dec(x, y):
-  ...     z = x + y
-  ...     return z
-  ...
-  >>> out = tensor_add_with_dec(x, y)
-  
-
-  >>> import numpy as np
-  >>> from mindspore import Tensor, jit
-  >>>
-  >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>>
-  >>> @jit(fullgraph=True)
-  ... def tensor_add_with_dec(x, y):
-  ...     z = x + y
-  ...     return z
-  ...
-  >>> out = tensor_add_with_dec(x, y)
-  
-
- - The parameter `backend` is added to set the compilation backend to use. - - - - - - - - - -
2.5.0 2.6.0
-  >>> import numpy as np
-  >>> from mindspore import Tensor, jit
-  >>>
-  >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>>
-  >>> @jit
-  ... def tensor_add_with_dec(x, y):
-  ...     z = x + y
-  ...     return z
-  ...
-  >>> out = tensor_add_with_dec(x, y)
-  
-
-  >>> import numpy as np
-  >>> from mindspore import Tensor, jit
-  >>>
-  >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>>
-  >>> @jit(backend="ms_backend")
-  ... def tensor_add_with_dec(x, y):
-  ...     z = x + y
-  ...     return z
-  ...
-  >>> out = tensor_add_with_dec(x, y)
-  
-
- - The parameter `options` is added to set the dictionary of options passed to the compilation backend. - - - - - - - - - -
2.5.0 2.6.0
-  >>> import numpy as np
-  >>> from mindspore import Tensor, jit, JitConfig
-  >>>
-  >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>>
-  >>> @jit(jit_config=JitConfig(infer_boost="on"))
-  ... def tensor_add_with_dec(x, y):
-  ...     z = x + y
-  ...     return z
-  ...
-  >>> out = tensor_add_with_dec(x, y)
-  
-
-  >>> import numpy as np
-  >>> from mindspore import Tensor, jit
-  >>>
-  >>> x = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>> y = Tensor(np.ones([1, 1, 3, 3]).astype(np.float32))
-  >>>
-  >>> @jit(infer_boost="on")
-  ... def tensor_add_with_dec(x, y):
-  ...     z = x + y
-  ...     return z
-  ...
-  >>> out = tensor_add_with_dec(x, y)
-  
-
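The `mindspore.jit` parameter changes listed above amount to a keyword-argument translation. The following is a minimal sketch under stated assumptions, not part of MindSpore: `migrate_jit_kwargs` is a hypothetical helper, it covers only the mappings shown in the comparison tables above, and `JitConfig(...)` is represented as a plain dict so that the sketch runs without MindSpore installed.

```python
# Hypothetical helper (not a MindSpore API): translate 2.5.0-style jit keyword
# arguments into the 2.6.0 style, using only the mappings listed above.
REMOVED_WITHOUT_REPLACEMENT = {"input_signature", "hash_args", "compile_once"}


def migrate_jit_kwargs(old):
    """Return 2.6.0-style kwargs for a dict of 2.5.0-style jit kwargs.

    `jit_config` is expected as a plain dict standing in for JitConfig(...).
    """
    new = {}
    for key, value in old.items():
        if key == "fn":
            new["function"] = value  # parameter renamed
        elif key == "mode":
            if value != "PIJit":
                raise ValueError(f"no mapping is listed for mode={value!r}")
            new["capture_mode"] = "bytecode"
        elif key == "jit_config":
            new.update(_migrate_jit_config(value))
        elif key in REMOVED_WITHOUT_REPLACEMENT:
            raise ValueError(f"{key!r} was removed and no replacement is listed")
        else:
            new[key] = value  # anything else is passed through unchanged
    return new


def _migrate_jit_config(config):
    """Map the JitConfig fields that appear in the comparison tables."""
    new = {}
    for key, value in config.items():
        if key == "jit_syntax_level" and value == "STRICT":
            new["fullgraph"] = True
        elif key in ("jit_level", "infer_boost"):
            new[key] = value  # now passed directly to jit
        else:
            raise ValueError(f"no mapping is listed for {key}={value!r}")
    return new
```

For example, `migrate_jit_kwargs({"jit_config": {"jit_level": "O0"}})` yields `{"jit_level": "O0"}`. The helper raises on anything the tables do not cover rather than guessing a replacement.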
- -- The `mindspore.profiler.tensor_board_trace_handler` interface has changed. - - The `mindspore.profiler.tensor_board_trace_handler` interface is renamed to [mindspore.profiler.tensorboard_trace_handler](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore/mindspore.profiler.tensorboard_trace_handler.html). - - - - - - - - - -
2.5.0 2.6.0
-  >>> from mindspore.profiler import tensor_board_trace_handler
-  
-
-  >>> from mindspore.profiler import tensorboard_trace_handler
-  
-
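Code that must run on both sides of this rename could look the handler up by name at import time. This is a sketch only: `import_first` is a hypothetical helper written for illustration, and the MindSpore call is left commented out so the block runs without MindSpore installed.

```python
import importlib


def import_first(module_name, candidates):
    """Return the first attribute in `candidates` that exists on the module."""
    module = importlib.import_module(module_name)
    for name in candidates:
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"none of {candidates} found in {module_name}")


# Intended use (requires MindSpore; not executed here). The new name is tried
# first, then the 2.5.0 name:
# handler = import_first(
#     "mindspore.profiler",
#     ["tensorboard_trace_handler", "tensor_board_trace_handler"],
# )
```

Trying the new name first keeps the lookup working after the old name disappears.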
- -- The `mindspore.set_context` interface has changed. - - The `exception_dump` field of the `ascend_config` parameter is replaced by the `"dump"` field of [device_context.ascend.op_debug.aclinit_config](https://www.mindspore.cn/docs/zh-CN/master/api_python/device_context/mindspore.device_context.ascend.op_debug.aclinit_config.html). - - - - - - - - - -
2.5.0 2.6.0
-  >>> import mindspore as ms
-  >>> ms.set_context(ascend_config = {"exception_dump": "2"})
-  
-
-  >>> import mindspore as ms
-  >>> ms.device_context.ascend.op_debug.aclinit_config({"dump": {"dump_scene": "lite_exception"}})
-  
-
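- -- The printed content of `mindspore.Tensor` has changed. - - Previously, printing a Tensor showed only its values; the new output also includes key Tensor information such as shape and dtype. - - - - - - - - - -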
2.5.0 2.6.0
-  >>> import mindspore as ms
-  >>> tensor = ms.Tensor([1,1,1], dtype=ms.float32)
-  >>> print(tensor)
-  [1. 1. 1.]
-  
-
-  >>> import mindspore as ms
-  >>> tensor = ms.Tensor([1,1,1], dtype=ms.float32)
-  >>> print(tensor)
-  Tensor(shape=[3], dtype=Float32, value= [ 1.00000000e+00,  1.00000000e+00,  1.00000000e+00])
-  
-
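Scripts that scrape printed tensors from logs may need to handle the new format. The following is a minimal sketch, assuming the output looks exactly like the 2.6.0 example above. `parse_tensor_print` is a hypothetical helper, not a MindSpore API; it handles plain numeric values only and flattens them into one list.

```python
import re

# Matches the 2.6.0-style output: Tensor(shape=[...], dtype=..., value= [...])
TENSOR_REPR = re.compile(
    r"Tensor\(shape=\[(?P<shape>[^\]]*)\],\s*dtype=(?P<dtype>\w+),"
    r"\s*value=\s*(?P<value>.*)\)\s*$",
    re.DOTALL,
)
NUMBER = re.compile(r"[-+]?\d+\.?\d*(?:[eE][-+]?\d+)?")


def parse_tensor_print(text):
    """Return (shape, dtype, flat_values) parsed from a printed Tensor."""
    match = TENSOR_REPR.match(text.strip())
    if match is None:
        raise ValueError("text is not in the 2.6.0 Tensor print format")
    shape = [int(dim) for dim in match.group("shape").split(",") if dim.strip()]
    values = [float(number) for number in NUMBER.findall(match.group("value"))]
    return shape, match.group("dtype"), values
```

Applied to the 2.6.0 output shown above, the helper returns the shape `[3]`, the dtype string `Float32`, and three values equal to `1.0`. Output in the 2.5.0 format does not match the pattern and raises `ValueError`.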
- -- 静态图模式,Ascend后端,jit_level为O2模式下Dump接口变更。 - - 在静态图Ascend 后端 jit_level为O2场景下,环境变量 `MINDSPORE_DUMP_CONFIG`和 `ENABLE_MS_GE_DUMP`已废弃,Dump相关功能已迁移到 msprobe 工具,更多详情请查看[《msprobe 工具 MindSpore 场景精度数据采集指南》](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md)。 - -### 贡献者 - -amyMaYun,Ava,baishanyang,br_fix_save_strategy_ckpt,caifubi,caoxubo,cccc1111,ccsszz,chaijinwei,chaiyouheng,changzherui,chengbin,chopupu,chujinjin,congcong,dairenjie,DavidFFFan,DeshiChen,dingjinshan,fary86,fengqiang,fengyixing,ffmh,fuhouyu,Gallium,gaoshuanglong,gaoyong10,geyuhong,guoyq16,guoyuzhe,GuoZhibin,guozhijian,gupengcheng0401,hangq,Hanshize,haozhang,hedongdong,hhz886,HighCloud,horcham,huangbingjian,huangxiang360729,huangzhichao2023,huangzhuo,huangziling,huda,Huilan Li,hujiahui8,huoxinyou,jiangchao_j,jiangchenglin3,jiangshanfeng,jiaorui,jiaxueyu,jizewei,jjfeing,JoeyLin,jshawjc,kairui_kou,kakyo82,kisnwang,leida,lianghongrui,LiangZhibo,LiaoTao_Wave,lichen,limingqi107,LiNuohang,linux,litingyu,liubuyu,liuchuting,liuluobin,liuyanwei,LLLRT,looop5,luochao60,luojianing,luoxuewei,luoyang,lyk,maoyuanpeng1,Margaret_wangrui,mengxian,MengXiangyu,mylinchi,NaCN,panzhihui,pengqi,PingqiLi,pipecat,qiuyufeng,qiuzhongya,qiwenlun,r1chardf1d0,rachel0858,rainyhorse,Rudy_tan,shaoshengqi,shen_haochen,shenhaojing,shenwei41,shiro-zzz,shuqian0,stavewu,TAJh,tanghuikang,tangmengcheng,tongdabao,TuDouNi,VectorSL,wang_ziqi,wangjie,wangliaohui97,wangpingan,wangyibo,weiyang,wja,wudongkun,wujueying,wuweikang,wwwbby,xfan233,XianglongZeng,xiaopeng,xiaotianci,xiaoyao,xiedejin1,XinDu,xuxinglei,xuzhen,xuzixiang,yang guodong,yangben,yanghaoran,yangruoqi713,yangzhenzhang,yanx,Yanzhi_YI,yao_yf,yide12,yihangchen,YijieChen,yonibaehr,Youmi,yuanqi,yuchaojie,yuezenglin,Yuheng Wang,YuJianfeng,YukioZzz,ZeyuHan,zhaiyukun,Zhang 
QR,zhangbuxue,zhangdanyang,zhangshucheng,zhangyinxia,ZhangZGC,zhangzhen,zhengzuohe,zhouyaqiang0,zichun_ye,zlq2020,zong_shuai,ZPaC,zyli2020,舛扬,范吉斌,冯一航,胡犇,胡彬,宦晓玲,简云超,李栋,李良灿,李林杰,李寅杰3,刘思铭,刘勇琪,刘子涵,梅飞要,任新,十一雷,孙昊辰,王泓皓,王禹程,王振邦,熊攀,俞涵,虞良斌,云骑士,张栩浩,赵文璇,周一航 - -## MindSpore Lite 2.6.0 Release Notes - -### 主要特性及增强 - -- [STABLE] MindSpore Lite支持模型转换时配置算子并行推理加速,只需在模型转换时配置stream_label_file选项,指定需要进行并行推理的算子。 -- [STABLE] MindSpore Lite支持在昇腾后端下转换onnx控制流中的if算子。 - -### API 变更 - -- [STABLE] acl模型转换配置中,ascend_context选项下新增stream_label_file选项,用于启用多流并行。 - -### 贡献者 - -熊攀,ZhangZGC,yanghaoran,李林杰,shenwei41,xiaotianci,panzhihui,guozhijian,胡彬,tangmengcheng,XianglongZeng,cccc1111,stavewu,刘思铭,r1chardf1d0,jiangshanfeng - -## MindSpore 2.5.0 Release Notes - -### 主要特性及增强 - -#### 分布式启动组件msrun - -- [STABLE] msrun支持传入节点的hostname(如localhost)作为 `--master_addr`,提升msrun易用性。 -- [STABLE] msrun支持将训练日志打印到标准输出,用户可通过 `--tail_worker_log` 参数控制要打印哪些rank。 -- [STABLE] 设置 `export VLOG_v=12500` 后, `scheduler` 日志能输出集群信息,帮助用户快速统计集群数据。 -- [STABLE] msrun支持通过 `--worker_log_name` 参数来格式化日志文件名,帮助用户快速定位到问题节点。 - -详情可参考[msrun启动](https://www.mindspore.cn/docs/zh-CN/r2.5.0/model_train/parallel/msrun_launcher.html)。 - -#### Profiler - -- [STABLE] 新增[mindspore.profiler.schedule](https://www.mindspore.cn/docs/zh-CN/r2.5.0/api_python/mindspore/mindspore.profiler.schedule.html)和[mindspore.profiler.tensor_board_trace_handler](https://www.mindspore.cn/docs/zh-CN/r2.5.0/api_python/mindspore/mindspore.profiler.tensor_board_trace_handler.html)接口,支持动态图场景按step采集和呈现,提升动态图场景易用性。 -- [STABLE] 动态Profiling支持自定义for循环,提升动态图场景易用性。 -- [STABLE] Profiler初始化参数和交付件目录结构优化,降低用户迁移难度。 -- [STABLE] 新增轻量化打点接口[mindspore.profiler.mstx](https://www.mindspore.cn/docs/zh-CN/r2.5.0/api_python/mindspore/mindspore.profiler.mstx.html),提供用户低开销性能数据采集方式。 -- [STABLE] Timeline支持展示硬件利用率数据,帮助用户定位降频问题。 - -详情可参考[Ascend性能调优](https://www.mindspore.cn/docs/zh-CN/r2.5.0/model_train/optimize/profiler.html)。 - -#### 动态图 - -- [BETA] 
动态图支持算子的原地操作,引入可以原地操作的算子。以[mindspore.mint.nn.functional.relu](https://www.mindspore.cn/docs/zh-CN/r2.5.0/api_python/mint/mindspore.mint.nn.functional.relu.html)算子为列,如果你想使用原地更新版本的relu算子,则可以调用[mindspore.mint.nn.functional.relu_](https://www.mindspore.cn/docs/zh-CN/r2.5.0/api_python/mint/mindspore.mint.nn.functional.relu_.html)算子。 -- [STABLE] 使能环境变量MS_SIMULATION_LEVEL=1开启动态图dryrun,支持不占卡模拟多卡进程,通过日志查看显存占用情况。详情可参考[环境变量](https://www.mindspore.cn/docs/zh-CN/r2.5.0/api_python/env_var_list.html?highlight=ms_simulation_level#%E5%88%86%E5%B8%83%E5%BC%8F%E5%B9%B6%E8%A1%8C)。 - -#### FrontEnd - -- [STABLE] 新增[mindspore.nn.utils.no_init_parameters](https://www.mindspore.cn/docs/zh-CN/r2.5.0/api_python/nn/mindspore.nn.utils.no_init_parameters.html)接口,支持网络参数延迟初始化,缩短推理场景下模型启动时间。 - -### API 变更 - -#### 新增API - -- [DEMO] [mindspore.mint](https://www.mindspore.cn/docs/zh-CN/r2.5.0/api_python/mindspore.mint.html) API新增了大量的functional、nn接口。mint接口当前是实验性接口,在图编译模式为O0/O1和PyNative模式下性能比ops更优。当前暂不支持O2编译模式(图下沉)及CPU、GPU后端,后续会逐步完善。 - - | mindspore.mint | | | | - | :-------------------------- | :------------------------ | :-------------------------------- | :------------------------- | - | mindspore.mint.bernoulli | mindspore.mint.bincount | mindspore.mint.clone | mindspore.mint.einsum | - | mindspore.mint.empty | mindspore.mint.empty_like | mindspore.mint.full_like | mindspore.mint.randint | - | mindspore.mint.randint_like | mindspore.mint.randn | mindspore.mint.randn_like | mindspore.mint.randperm | - | mindspore.mint.chunk | mindspore.mint.concat | mindspore.mint.count_nonzero | mindspore.mint.scatter | - | mindspore.mint.select | mindspore.mint.squeeze | mindspore.mint.swapaxes | mindspore.mint.transpose | - | mindspore.mint.triu | mindspore.mint.unbind | mindspore.mint.unique_consecutive | mindspore.mint.multinomial | - | mindspore.mint.addmv | mindspore.mint.diff | mindspore.mint.exp2 | mindspore.mint.float_power | - | mindspore.mint.fix | mindspore.mint.fmod | mindspore.mint.frac | 
mindspore.mint.lerp | - | mindspore.mint.log2 | mindspore.mint.log10 | mindspore.mint.logaddexp | mindspore.mint.mv | - | mindspore.mint.nansum | mindspore.mint.nan_to_num | mindspore.mint.polar | mindspore.mint.ravel | - | mindspore.mint.outer | mindspore.mint.softmax | mindspore.mint.t | mindspore.mint.cdist | - | mindspore.mint.amax | mindspore.mint.amin | mindspore.mint.cumprod | mindspore.mint.histc | - | mindspore.mint.logsumexp | mindspore.mint.norm | mindspore.mint.std | mindspore.mint.std_mean | - | mindspore.mint.var | mindspore.mint.var_mean | mindspore.mint.allclose | mindspore.mint.argsort | - | mindspore.mint.equal | mindspore.mint.isinf | mindspore.mint.isneginf | mindspore.mint.not_equal | - | mindspore.mint.addbmm | mindspore.mint.addmm | mindspore.mint.baddbmm | mindspore.mint.dot | - | mindspore.mint.meshgrid | mindspore.mint.mm | | | - - | mindspore.mint.nn | | - | :---------------------------------- | ---------------------------------- | - | mindspore.mint.nn.Conv3d | mindspore.mint.nn.ConstantPad1d | - | mindspore.mint.nn.ConvTranspose2d | mindspore.mint.nn.ConstantPad2d | - | mindspore.mint.nn.BatchNorm1d | mindspore.mint.nn.ConstantPad3d | - | mindspore.mint.nn.BatchNorm2d | mindspore.mint.nn.ReflectionPad1d | - | mindspore.mint.nn.BatchNorm3d | mindspore.mint.nn.ReflectionPad2d | - | mindspore.mint.nn.LayerNorm | mindspore.mint.nn.ReflectionPad3d | - | mindspore.mint.nn.SyncBatchNorm | mindspore.mint.nn.ReplicationPad1d | - | mindspore.mint.nn.ELU | mindspore.mint.nn.ZeroPad1d | - | mindspore.mint.nn.GELU | mindspore.mint.nn.ZeroPad2d | - | mindspore.mint.nn.LogSigmoid | mindspore.mint.nn.ZeroPad3d | - | mindspore.mint.nn.ReLU6 | mindspore.mint.nn.BCELoss | - | mindspore.mint.nn.SiLU | mindspore.mint.nn.CrossEntropyLoss | - | mindspore.mint.nn.Tanh | mindspore.mint.nn.NLLLoss | - | mindspore.mint.nn.Embedding | mindspore.mint.nn.SmoothL1Loss | - | mindspore.mint.nn.Dropout2d | mindspore.mint.nn.Upsample | - | 
mindspore.mint.nn.AdaptiveAvgPool1d | mindspore.mint.nn.MaxUnpool2d | - | mindspore.mint.nn.AdaptiveAvgPool2d | | - - | mindspore.mint.nn.functional | - | :----------------------------------------------- | - | mindspore.mint.nn.functional.adaptive_avg_pool1d | - | mindspore.mint.nn.functional.adaptive_avg_pool2d | - | mindspore.mint.nn.functional.avg_pool1d | - | mindspore.mint.nn.functional.max_unpool2d | - | mindspore.mint.nn.functional.logsigmoid | - | mindspore.mint.nn.functional.relu6 | - | mindspore.mint.nn.functional.relu_ | - | mindspore.mint.nn.functional.normalize | - | mindspore.mint.nn.functional.dropout2d | - | mindspore.mint.nn.functional.nll_loss | - | mindspore.mint.nn.functional.smooth_l1_loss | - | mindspore.mint.nn.functional.interpolate | - | mindspore.mint.nn.functional.conv3d | - - | mindspore.mint.distributed | | - | ------------------------------------------------- | -------------------------------------------------- | - | mindspore.mint.distributed.all_gather | mindspore.mint.distributed.get_global_rank | - | mindspore.mint.distributed.all_gather_into_tensor | mindspore.mint.distributed.get_group_rank | - | mindspore.mint.distributed.all_gather_object | mindspore.mint.distributed.get_process_group_ranks | - | mindspore.mint.distributed.all_reduce | mindspore.mint.distributed.init_process_group | - | mindspore.mint.distributed.all_to_all | mindspore.mint.distributed.irecv | - | mindspore.mint.distributed.all_to_all_single | mindspore.mint.distributed.isend | - | mindspore.mint.distributed.barrier | mindspore.mint.distributed.new_group | - | mindspore.mint.distributed.batch_isend_irecv | mindspore.mint.distributed.P2POp | - | mindspore.mint.distributed.broadcast | mindspore.mint.distributed.recv | - | mindspore.mint.distributed.broadcast_object_list | mindspore.mint.distributed.reduce | - | mindspore.mint.distributed.gather | mindspore.mint.distributed.reduce_scatter | - | mindspore.mint.distributed.gather_object | 
mindspore.mint.distributed.reduce_scatter_tensor | - | mindspore.mint.distributed.get_backend | mindspore.mint.distributed.scatter | - | mindspore.mint.distributed.scatter_object_list | mindspore.mint.distributed.send | - - | others | - | --------------------------------- | - | mindspore.mint.optim.Adam | - | mindspore.mint.linalg.matrix_norm | - | mindspore.mint.linalg.norm | - | mindspore.mint.linalg.vector_norm | - | mindspore.mint.special.exp2 | - -- [STABLE] 新增[mindspore.ops.incre_flash_attention](https://www.mindspore.cn/docs/zh-CN/r2.5.0/api_python/ops/mindspore.ops.incre_flash_attention.html)和[mindspore.ops.prompt_flash_attention](https://www.mindspore.cn/docs/zh-CN/r2.5.0/api_python/ops/mindspore.ops.prompt_flash_attention.html)两个推理算子接口,当前仅支持Ascend后端。 -- [STABLE] [mindspore.runtime](https://www.mindspore.cn/docs/zh-CN/r2.5.0/api_python/mindspore.runtime.html)替代原mindspore.hal接口,提供了流、显存、Event等运行时资源相关的接口。 -- [STABLE] [mindspore.device_context](https://www.mindspore.cn/docs/zh-CN/r2.5.0/api_python/mindspore.device_context.html)替代原set_context接口部分参数,提供了硬件平台相关的设置接口。 -- [DEMO] [mindspore.Tensor](https://www.mindspore.cn/docs/zh-CN/r2.5.0/api_python/mindspore/mindspore.Tensor.html#mindspore.Tensor) API新增了大量的Tensor方法的接口。当前仍属于实验性接口,当前暂不支持图下沉模式及CPU、GPU后端,后续会逐步完善。同时大量的存量Tensor接口,包括+=、-=、*=、/=等运算符,通过重载的方式在Ascend后端接入了Aclnn算子。详细见[官网接口列表](https://www.mindspore.cn/docs/zh-CN/r2.5.0/api_python/mindspore/mindspore.Tensor.html#mindspore.Tensor)。 - -#### 非兼容性接口变更 - -- [mindspore.Tensor.new_ones](https://www.mindspore.cn/docs/zh-CN/r2.5.0/api_python/mindspore/Tensor/mindspore.Tensor.new_zeros.html)接口的size入参取消对Tensor类型的支持。 -- mindspore.Profiler删除参数timeline_limit、rank_id、analyse_only、env_enable。 - -- 接口名称:mindspore.Profiler - - 变更内容:废弃profile_communication参数,通过设置profiler_level=ProfilerLevel.Level1或profiler_level=ProfilerLevel.Level2采集通讯矩阵数据。 - - 说明:profiler_level默认值为ProfilerLevel.Level0。 - - - - - - - - - -
Original interface v2.5.0 interface
-  Profiler(profile_communication=True)
-  
-
-  Profiler(profiler_level=ProfilerLevel.Level1) or Profiler(profiler_level=ProfilerLevel.Level2)
-  
-
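- -- Interface name: mindspore.Profiler - - Change: the op_time parameter is deprecated; set activities=[mindspore.profiler.ProfilerActivity.NPU] to collect NPU-side operator performance data. - - Note: the activities parameter is a list. If it contains mindspore.profiler.ProfilerActivity.NPU, collection of NPU-side operator performance data is enabled. Collection is enabled by default. - - - - - - - - - -
Original interface v2.5.0 interface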
-  Profiler(op_time=True)
-  
-
-  Profiler(activities=[mindspore.profiler.ProfilerActivity.NPU])
-  
-
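- -- Interface name: mindspore.Profiler - - Change: the type of the aicore_metrics parameter changes from int to a mindspore.profiler.AicoreMetrics enumeration value. - - Note: the default value of aicore_metrics is mindspore.profiler.AicoreMetrics.AiCoreNone. - - - - - - - - - -
Original interface v2.5.0 interface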
-  Profiler(aicore_metrics=0)
-  
-
-  Profiler(aicore_metrics=mindspore.profiler.AicoreMetrics.AiCoreNone)
-  
-
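- -- Interface name: mindspore.Profiler - - Change: the profile_framework parameter is deprecated; set activities=[mindspore.profiler.ProfilerActivity.CPU] to collect framework-side data. - - Note: the activities parameter is a list. If it contains mindspore.profiler.ProfilerActivity.CPU, collection of framework performance data is enabled. Collection is enabled by default. - - - - - - - - - -
Original interface v2.5.0 interface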
-  Profiler(profile_framework="all")
-  
-
-  Profiler(activities=[mindspore.profiler.ProfilerActivity.CPU])
-  
-
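The four `mindspore.Profiler` changes above can be read as one translation of old keyword arguments into new ones. The following is a minimal sketch under stated assumptions: `migrate_profiler_kwargs` is a hypothetical helper, enum members are written as strings so the block runs without MindSpore installed, and the list parameter is spelled `activities` as an assumption about the intended parameter name. Only the mappings listed above are covered.

```python
# Enum members are written as strings so the sketch runs without MindSpore.
LEVEL1 = "ProfilerLevel.Level1"
NPU = "ProfilerActivity.NPU"
CPU = "ProfilerActivity.CPU"
AICORE_NONE = "AicoreMetrics.AiCoreNone"

# Parameters deleted in 2.5.0 with no replacement listed.
REMOVED = {"timeline_limit", "rank_id", "analyse_only", "env_enable"}


def migrate_profiler_kwargs(old):
    """Return 2.5.0-style Profiler kwargs for a dict of older-style kwargs."""
    new = {}
    activities = []
    for key, value in old.items():
        if key in REMOVED:
            continue  # dropped silently; nothing replaces these
        if key == "profile_communication":
            if value:
                new["profiler_level"] = LEVEL1  # Level2 also collects this data
        elif key == "op_time":
            if value:
                activities.append(NPU)
        elif key == "profile_framework":
            if value:
                activities.append(CPU)
        elif key == "aicore_metrics" and value == 0:
            new["aicore_metrics"] = AICORE_NONE
        else:
            new[key] = value  # anything else is passed through unchanged
    if activities:
        new["activities"] = activities
    return new
```

The sketch maps `profile_communication=True` to `Level1`, although the note above says `Level2` is equally valid. Other integer values of `aicore_metrics` are passed through unchanged because no mapping for them is given here.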
- -### 贡献者 - -baishanyang ,bantao ,Bellatan ,biangelin ,BigSkySea ,caifubi ,candanzg ,candyhong ,Carey ,cccc1111 ,chaijinwei ,changzherui ,chengbin ,chengfeng27 ,chengxb7532 ,chujinjin ,coder2237 ,czrz ,dairenjie ,DavidFFFan ,DeshiChen ,dingjinshan ,ehaleva ,Erpim ,fary86 ,fengyixing ,ffmh ,fuchao ,fuhouyu ,gaoyong10 ,geyuhong ,guoyuzhe ,GuoZhibin ,guozhijian ,halo ,hangq ,haozhang ,hedongdong ,hehongzhe ,hhz886 ,HighCloud ,huangbingjian ,HuangLe02 ,huangziling ,huda ,Huilan Li ,hujiahui8 ,jiahaochen666 ,jiangchao_j ,jiangchenglin3 ,jiangshanfeng ,jiaorui ,jiaxueyu ,jizewei ,jjfeing ,JoeyLin ,jshawjc ,kakyo82 ,kingxian ,kisnwang ,leida ,liangchenghui ,lianghongrui ,LiangZhibo ,lichen ,limingqi107 ,LINH ,linux ,lionelchang ,lishanni ,liubuyu ,liujunzhu ,liuluobin ,liuxu ,liuyanwei ,liyan2022 ,LLLRT ,looop5 ,luochao60 ,luoxuewei ,luoyang ,lyk ,machenggui ,maoyuanpeng1 ,Margaret_wangrui ,master,mengxian ,MengXiangyu ,mengyuanli ,Mrtutu ,mylinchi ,NaCN ,Nikanuo ,niujunhao ,panzhihui ,pengqi ,PingqiLi ,pipecat ,qiuleilei ,qiuyufeng ,qiuzhongya ,r1chardf1d0 ,shaoshengqi ,shen_haochen ,shenhaojing ,shenwei41 ,shilishan ,shiro-zzz ,shuqian0 ,St.Universe ,stavewu ,superxf ,suteng ,TAJh ,tanghuikang ,tangmengcheng ,tan-wei-cheng ,tianxiaodong ,TuDouNi ,TYWZ22259 ,user_0145 ,VectorSL ,vincen45 ,wang_ziqi ,wangshaocong ,wangwensheng4 ,weiyang ,wtcheng ,wtobill ,wujiangming ,wujueying ,wuweikang ,wwwbby ,XianglongZeng ,xiaopeng ,xiaotianci ,xiaoyao ,xiedejin1 ,XinDu ,xuxinglei ,yang guodong ,yangben ,yanghaoran ,yanglong ,yanx ,Yanzhi_YI ,yao_yf ,yefeng ,Yi_zhang95 ,yide12 ,yihangchen ,YijieChen ,YingtongHu ,ylw ,yonibaehr ,yuanqi ,yuchaojie ,yuezenglin ,YuJianfeng ,yyuse ,Zhang QR ,zhangbuxue ,zhangdanyang ,zhanghaibo ,zhangminli ,zhangyinxia ,ZhangZGC ,zhangzhen ,zhengzuohe ,zhouyaqiang0 ,zhuguodong ,zichun_ye ,zlq2020 ,zong_shuai ,ZPaC ,zyli2020 ,陈一 ,程超 ,冯一航 ,胡彬 ,宦晓玲 ,黄勇 ,简云超 ,康伟 ,李栋 ,李良灿 ,李林杰 ,李寅杰,刘崇鸣 ,刘力力 ,刘思铭 ,刘涛Liu ,刘勇琪 ,刘子涵 ,吕浩宇 ,吕凯盟 ,梅飞要 ,倪轩 ,任新 ,十一雷 ,孙昊辰 ,王禹程 ,王振邦 ,熊攀 
,俞涵 ,虞良斌 ,张栩浩 ,赵文璇 ,周莉莉 ,周一航 ,邹文祥 - -## MindSpore 2.4.1 Release Notes - -### 主要特性及增强 - -#### AutoParallel - -- [STABLE] 支持split/concat分支通信计算并行,用户通过切分输入数据,形成可并行分支,分支间自动进行通信计算并行,降低通信开销。 -- [STABLE] 支持Sequence pipeline,配套MindFormers的dev分支的LLama系列模型,通过引入Sequence维度拆分,降低流水线并行的Bubble以及显存开销。 - -#### PyNative - -- [STABLE] PyNative模式通信算子默认按照通信域分配流,支持通信算子并发执行,协同并行策略优化,提供细粒度的通信掩盖,提升模型性能。 - -### 问题修复 - -- [IB0R4N](https://gitee.com/mindspore/mindspore/issues/IB0R4N):修复在某些切分策略下,加载分布式权重精度不对的问题。 - -### 贡献者 - -bantao;caifubi;candanzg;chaijinwei;changzherui;chengbin;chujinjin;DeshiChen;dingjinshan;fary86;fuhouyu;gaoyong10;GuoZhibin;halo;haozhang;hedongdong;huangbingjian;hujiahui8;huoxinyou;jiangshanfeng;jiaorui;jiaxueyu;jshawjc;kisnwang;lichen;limingqi107;liubuyu;looop5;luochao60;luoyang;machenggui;MengXiangyu;Mrtutu;NaCN;panzhihui;qiuzhongya;shenhaojing;shilishan;tanghuikang;TuDouNi;wang_ziqi;weiyang;wujueying;XianglongZeng;xuxinglei;yang guodong;yanghaoran;yao_yf;yide12;yihangchen;YijieChen;YingtongHu;yuchaojie;YuJianfeng;zhangdanyang;ZhangZGC;zhengzuohe;zong_shuai;ZPaC;冯一航;胡彬;宦晓玲;李林杰;刘崇鸣;刘勇琪;任新;王禹程;王振邦;熊攀;俞涵;张栩浩;周一航; - -## MindSpore 2.4.0 Release Notes - -### 主要特性及增强 - -#### Dataset - -- [STABLE] 修改 [mindspore.dataset.GeneratorDataset](https://www.mindspore.cn/docs/zh-CN/r2.4.0/api_python/dataset/mindspore.dataset.GeneratorDataset.html)、[mindspore.dataset.Dataset.map](https://www.mindspore.cn/docs/zh-CN/r2.4.0/api_python/dataset/dataset_method/operation/mindspore.dataset.Dataset.map.html)及 [mindspore.dataset.Dataset.batch](https://www.mindspore.cn/docs/zh-CN/r2.4.0/api_python/dataset/dataset_method/batch/mindspore.dataset.Dataset.batch.html)接口中 `max_rowsize` 参数的默认值为None,以默认开启共享内存的动态分配,此时共享内存将随输入数据实时申请并加速数据处理,用户无需再事先调整该参数大小。 -- [BETA] 数据处理支持独立进程模式,此模式下将减少训练进程与数据读取进程的GIL锁冲突,以提升动态图模式下的性能。可以通过环境变量 `MS_INDEPENDENT_DATASET`启动或关闭此模式。 - -#### Ascend - -- [STABLE] 自定义算子支持昇腾动态图场景Pyboost执行模式,降低了算子调用开销。 -- [STABLE] 
昇腾Print算子支持输出超大tensor或print调用密集的场景,用户可以通过`MS_DUMP_SLICE_SIZE`和`MS_DUMP_WAIT_TIME`环境变量指定切片大小和超时时间以支持不同场景。 -- [STABLE] 统一确定性计算设置,用户可以通过仅设置 `mindspore.set_context(deterministic="ON")`来使能昇腾确定性计算。 -- [STABLE] 支持集合通信异常监控,监测到通信异常后,快速退出训练,避免超时等待。 -- [STABLE] 支持[亚健康设备优雅退出](https://www.mindspore.cn/docs/zh-CN/r2.4.0/model_train/train_availability/graceful_exit.html)功能。训练框架检测到集群存在亚健康设备配置信息时,保存CKPT并统一结束集群训练进程。 - -#### Runtime - -- [STABLE] O0/O1模式下支持后端编译缓存,前端编译缓存开启时默认开启。 -- [STABLE] O0/O1模式下支持 aclnnAllGatherMatmul、aclnnMatmulReduceScatter 和 aclnnMatmulAllReduce 算子,提升性能。 -- [STABLE] O0/O1模式下支持通过export MS_DISABLE_HEARTBEAT=1关闭集群心跳配置,降低Scheduler负载。 -- [STABLE] O0/O1模式下支持通信算子融合。 -- [STABLE] O2模式下支持虚拟内存,支持碎片整理功能,Ascend后端默认使能。 -- [STABLE] 设备内存占用动态申请,支持单卡多用户使用,Ascend后端默认使能。 -- [STABLE] O1模式下优化图算融合编译性能,默认使能。 -- [STABLE] O1模式下支持kernel packet融合优化,提升动态shape网络执行性能,默认使能。 -- [BETA] O1模式下支持MatMul后向融合(epilogue fuse)Elementwise算子。通过`mindspore.set_context(graph_kernel_flags="--enable_cluster_ops=MatMul")`使能。 -- [BETA] O1模式下支持用户控制图算融合优化范围,用户通过graph_kernel_flags的enable_pass/disable_pass选项控制打开或者关闭对应融合算子。 - -#### PyNative - -- [STABLE] [mindspore.nn.Cell.register_backward_hook](https://www.mindspore.cn/docs/zh-CN/r2.4.0/api_python/nn/mindspore.nn.Cell.html#mindspore.nn.Cell.register_backward_hook)/[mindspore.nn.Cell.register_forward_hook](https://www.mindspore.cn/docs/zh-CN/r2.4.0/api_python/nn/mindspore.nn.Cell.html#mindspore.nn.Cell.register_forward_hook)对应的hook函数入参cell_id变更为cell的python对象。 -- [STABLE] 新增[Cell.register_backward_pre_hook](https://www.mindspore.cn/docs/zh-CN/r2.4.0/api_python/nn/mindspore.nn.Cell.html#mindspore.nn.Cell.register_backward_pre_hook)接口,该API在Cell上注册反向传播的钩子函数,当每次计算完成该Cell的梯度时都会调用该钩子函数。 -- [STABLE] 优化PyNative流程AICPU类算子下发缓存,提升API执行性能。 -- [STABLE] 新增动态图下将一组Tensor占用的设备内存转换为一块连续的内存功能。 - -#### FrontEnd - -- [STABLE] 在故障恢复场景,支持权重去冗余保存和加载。 -- [STABLE] 
混合精度训练,支持[auto模式](https://www.mindspore.cn/docs/zh-CN/r2.4.0/api_python/amp/mindspore.amp.auto_mixed_precision.html#mindspore.amp.auto_mixed_precision)。 -- [STABLE] 支持对safetensors格式的保存、加载,以及并行场景下基于safetensors的离线汇聚和分布式加载。 -- [BETA] 新增循环大算子接口 [mindspore.ops.WhileLoop](https://www.mindspore.cn/docs/zh-CN/r2.4.0/api_python/ops/mindspore.ops.WhileLoop.html)、[mindspore.ops.ForiLoop](https://www.mindspore.cn/docs/zh-CN/r2.4.0/api_python/ops/mindspore.ops.ForiLoop.html)、[mindspore.ops.Scan](https://www.mindspore.cn/docs/zh-CN/r2.4.0/api_python/ops/mindspore.ops.Scan.html),优化循环编译时间。 -- [BETA] 图模式下支持算子传入关键字参数。 - -#### Parallel - -- [STABLE] [mindspore.ops.TensorDump](https://www.mindspore.cn/docs/zh-CN/r2.4.0/api_python/ops/mindspore.ops.TensorDump.html)算子支持分布式并行的场景,用户可通过配置TensorDump算子的 `input_output`属性决定打印输入/输出分片;新增接口[mindspore.ops.tensordump](https://www.mindspore.cn/docs/zh-CN/r2.4.0/api_python/ops/mindspore.ops.tensordump.html)。 -- [STABLE] msrun支持根据传入的rank table file来自定义rank id,支持通过 `--rank_table_file`传入的json文件来重排rank id。 -- [STABLE] 支持昇腾单机内高性能通信库LCCL,用户可通过 `MS_ENABLE_LCCL` 环境变量在昇腾后端训练场景下使能LCCL通信库。 -- [STABLE] 策略传播算法适配LLaMA/Mixtral类网络,减少用户配置LLaMA/Mixtral网络时切分策略的工作量。 -- [STABLE] 支持高维张量并行,用户可通过配置[mindspore.ops.MatMul](https://www.mindspore.cn/docs/zh-CN/r2.4.0/api_python/ops/mindspore.ops.MatMul.html)/[mindspore.ops.BatchMatMul](https://www.mindspore.cn/docs/zh-CN/r2.4.0/api_python/ops/mindspore.ops.BatchMatMul.html)算子的input_layout切换1D/2D/3D张量切分模式。 -- [STABLE] 模拟编译在SIMULATION_LEVEL=0和SIMULATION_LEVEL=1运行方式jit_level为O0/O1时,不占用硬件资源。 -- [STABLE] BatchMatMul模型并行引入的Allreduce在后续跟切分操作时,如果在parallel_speed_up_json中开启enable_allreduce_slice_to_reducescatter,根据匹配规则,自动化转换为ReduceScatter以减少通信量。 -- [STABLE] [mindspore.nn.Cell.shard](https://www.mindspore.cn/docs/zh-CN/master/api_python/nn/mindspore.nn.Cell.html#mindspore.nn.Cell.shard)和[mindspore.shard](https://www.mindspore.cn/docs/zh-CN/r2.4.0/api_python/mindspore/mindspore.shard.html)支持用户配置mindspore.Layout类型的策略及各参数的切分策略parameter_plan。 
-- [BETA] SAPP支持在手工预配置算子并行切分策略后全自动生成剩余算子策略。用户通过打开 `MS_INTERFERED_SAPP` 环境变量来激活 `.shard()` 预配置的并行切分策略。 -- [BETA] [mindspore.ops.Custom](https://www.mindspore.cn/docs/zh-CN/r2.4.0/api_python/ops/mindspore.ops.Custom.html)算子支持配置切分策略。 - -#### Inference - -- [STABLE] 新增Qwen2和LLaMA3.1系列大模型支持训推一体架构,实现脚本、分布式策略和运行时的统一,通过融合大算子降低推理时延,有效提升网络吞吐量。 -- [STABLE] 支持并行解码服务化部署,实现LLaMA系列大模型LookAhead投机推理。 -- [BETA] 支持SLoRA服务化部署,实现大模型多微调权重调度推理。 - -#### Dump - -- [STABLE] 优化[Dump文档](https://www.mindspore.cn/docs/zh-CN/r2.4.0/model_train/debug/dump.html),按照设备类型和优化等级划分使用方式。 -- [STABLE] Ascend O0/O1模式下支持异步Dump,包括异步Tensor、溢出、统计信息(host和device模式)。 -- [STABLE] 溢出Dump支持配置最大溢出个数。 -- [STABLE] Ascend O2模式下支持set dump。 -- [STABLE] 支持qint4x2量化类型Dump。 - -### API 变更 - -#### 新增API - -- [STABLE] [mindspore.mint](https://www.mindspore.cn/docs/zh-CN/r2.4.0/api_python/mindspore.mint.html) API新增了大量的functional、nn接口。mint接口当前是实验性接口,在图编译模式为O0和PyNative模式下性能比ops更优。当前暂不支持图下沉模式及CPU、GPU后端,后续会逐步完善。 - - | mindspore.mint | | | | - | :------------------------- | :------------------------------- | :--------------------------- | :------------------------- | - | mindspore.mint.full | mindspore.mint.repeat_interleave | mindspore.mint.linspace | mindspore.mint.scatter | - | mindspore.mint.tril | mindspore.mint.argmin | mindspore.mint.sign | mindspore.mint.remainder | - | mindspore.mint.flatten | mindspore.mint.asin | mindspore.mint.arcsin | mindspore.mint.sinh | - | mindspore.mint.arcsinh | mindspore.mint.atan | mindspore.mint.arctan | mindspore.mint.atanh | - | mindspore.mint.arctanh | mindspore.mint.acos | mindspore.mint.arccos | mindspore.mint.acosh | - | mindspore.mint.arccosh | mindspore.mint.erfc | mindspore.mint.expm1 | mindspore.mint.log1p | - | mindspore.mint.logical_xor | mindspore.mint.round | mindspore.mint.tan | mindspore.mint.trace | - | mindspore.mint.trunc | mindspore.mint.cross | mindspore.mint.masked_select | mindspore.mint.bitwise_and | - | mindspore.mint.bitwise_or | mindspore.mint.bitwise_xor | 
mindspore.mint.cosh | mindspore.mint.cummax |
-  | mindspore.mint.cummin | mindspore.mint.median | mindspore.mint.roll | mindspore.mint.sinc |
-  | mindspore.mint.sinh | mindspore.mint.xlogy | | |
-
-  | mindspore.mint.nn |
-  | :---------------------------- |
-  | mindspore.mint.nn.ReLU |
-  | mindspore.mint.nn.Hardsigmoid |
-  | mindspore.mint.nn.AvgPool2d |
-  | mindspore.mint.nn.MSELoss |
-  | mindspore.mint.nn.LogSoftmax |
-  | mindspore.mint.nn.Mish |
-  | mindspore.mint.nn.PReLU |
-  | mindspore.mint.nn.SELU |
-  | mindspore.mint.nn.Softshrink |
-  | mindspore.mint.nn.Hardshrink |
-  | mindspore.mint.nn.Hardswish |
-  | mindspore.mint.nn.L1Loss |
-
-  | mindspore.mint.nn.functional |
-  | :--------------------------------------- |
-  | mindspore.mint.nn.functional.hardsigmoid |
-  | mindspore.mint.nn.functional.log_softmax |
-  | mindspore.mint.nn.functional.mish |
-  | mindspore.mint.nn.functional.prelu |
-  | mindspore.mint.nn.functional.selu |
-  | mindspore.mint.nn.functional.softshrink |
-  | mindspore.mint.nn.functional.hardshrink |
-  | mindspore.mint.nn.functional.hardswish |
-  | mindspore.mint.nn.functional.l1_loss |
-
-#### Interface changes
-
-- Interface name: mindspore.dataset.GeneratorDataset
-
-  Change: the default value of the `max_rowsize` parameter changes from `6` to `None`, so dynamic allocation of shared memory is enabled by default.
-
-  Original interface:
-
-  class GeneratorDataset(source,
-                         column_names=None,
-                         column_types=None,
-                         schema=None,
-                         num_samples=None,
-                         num_parallel_workers=1,
-                         shuffle=None,
-                         sampler=None,
-                         num_shards=None,
-                         shard_id=None,
-                         python_multiprocessing=True,
-                         max_rowsize=6)
-
-  v2.4.0 interface:
-
-  class GeneratorDataset(source,
-                         column_names=None,
-                         column_types=None,
-                         schema=None,
-                         num_samples=None,
-                         num_parallel_workers=1,
-                         shuffle=None,
-                         sampler=None,
-                         num_shards=None,
-                         shard_id=None,
-                         python_multiprocessing=True,
-                         max_rowsize=None)
-
-- Interface name: mindspore.dataset.Dataset.batch
-
-  Change: the default value of the `max_rowsize` parameter changes from `16` to `None`, so dynamic allocation of shared memory is enabled by default.
-
-  Original interface:
-
-  def batch(input_dataset,
-            batch_size,
-            drop_remainder=False,
-            num_parallel_workers=None,
-            per_batch_map=None,
-            input_columns=None,
-            output_columns=None,
-            python_multiprocessing=False,
-            max_rowsize=16)
-
-  v2.4.0 interface:
-
-  def batch(input_dataset,
-            batch_size,
-            drop_remainder=False,
-            num_parallel_workers=None,
-            per_batch_map=None,
-            input_columns=None,
-            output_columns=None,
-            python_multiprocessing=False,
-            max_rowsize=None)
-
-- Interface name: mindspore.dataset.Dataset.map
-
-  Change: the default value of the `max_rowsize` parameter changes from `16` to `None`, so dynamic allocation of shared memory is enabled by default.
-
-  Original interface:
-
-  def map(input_dataset,
-          operations=None,
-          input_columns=None,
-          output_columns=None,
-          num_parallel_workers=None,
-          python_multiprocessing=False,
-          cache=None,
-          callbacks=None,
-          max_rowsize=16, offload=None)
-
-  v2.4.0 interface:
-
-  def map(input_dataset,
-          operations=None,
-          input_columns=None,
-          output_columns=None,
-          num_parallel_workers=None,
-          python_multiprocessing=False,
-          cache=None,
-          callbacks=None,
-          max_rowsize=None, offload=None)
-
-- Interface name: mindspore.ops.TensorDump
-
-  Change: adds the `input_output` parameter, which controls the printing behavior.
-
-  Original interface:
-
-  class TensorDump()
-
-  v2.4.0 interface:
-
-  class TensorDump(input_output='out')
-
-- Interface name: file name format of Tensors saved by MindSpore Dump
-
-  Change: the npy files produced by Dump now include the dtype of the original Tensor in the file name.
-
-  Original interface:
-
-  {op_type}.{op_name}.{task_id}.{stream_id}.
-  {timestamp}.{input_output_index}.{slot}.
-  {format}.npy
-
-  v2.4.0 interface:
-
-  {op_type}.{op_name}.{task_id}.{stream_id}.
-  {timestamp}.{input_output_index}.{slot}.
-  {format}.{dtype}.npy
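
The file name change above can be illustrated with a small parser. This is only a sketch under stated assumptions: the helper `parse_dump_file_name` and the `DumpFileName` type are hypothetical (not part of MindSpore), no field is assumed to contain a `.`, and the dtype spelling `float32` in the sample name is a guess at what the `{dtype}` field looks like.

```python
from typing import NamedTuple, Optional


class DumpFileName(NamedTuple):
    """Fields of a Dump npy file name (hypothetical helper type)."""
    op_type: str
    op_name: str
    task_id: str
    stream_id: str
    timestamp: str
    input_output_index: str
    slot: str
    data_format: str
    dtype: Optional[str]  # None for names written in the pre-2.4.0 format


def parse_dump_file_name(file_name: str, has_dtype: bool = True) -> DumpFileName:
    """Split a Dump file name into its dot-separated fields.

    Assumes that no individual field contains a '.' character.
    """
    suffix = ".npy"
    if not file_name.endswith(suffix):
        raise ValueError(f"not an npy dump file: {file_name!r}")
    parts = file_name[: -len(suffix)].split(".")
    expected = 9 if has_dtype else 8
    if len(parts) != expected:
        raise ValueError(f"expected {expected} fields, got {len(parts)}")
    if not has_dtype:
        parts.append(None)  # pad the missing dtype field
    return DumpFileName(*parts)


# The dtype spelling "float32" is an assumption made for illustration.
new_style = parse_dump_file_name(
    "Conv2D.Conv2D-op12.0.0.1623124369613540.output.0.DefaultFormat.float32.npy"
)
old_style = parse_dump_file_name(
    "Conv2D.Conv2D-op12.0.0.1623124369613540.output.0.DefaultFormat.npy",
    has_dtype=False,
)
```

Scripts that post-process Dump output by splitting on `.` are the ones most likely to need a change like this, because the field count grows by one.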
-
-#### Backwards-incompatible interface changes
-
-- Interface name: mindspore.nn.Cell.register_backward_hook(hook_fn)
-
-  Change: the first argument of hook_fn changes from cell_id to the cell object.
-
-  Note: an existing hook can recover the original cell_id inside hook_fn by calling id(cell).
-
-  Original interface:
-
-  def register_backward_hook(hook_fn)
-  Argument hook_fn(cell_id,
-             grad_input, grad_output)
-             -> New grad_output or None
-
-  v2.4.0 interface:
-
-  def register_backward_hook(hook_fn)
-  Argument hook_fn(cell,
-             grad_input, grad_output)
-             -> New grad_input or None
-
-- Interface name: mindspore.nn.Cell.register_forward_hook(hook_fn)
-
-  Change: the first argument of hook_fn changes from cell_id to the cell object.
-
-  Note: an existing hook can recover the original cell_id inside hook_fn by calling id(cell).
-
-  Original interface:
-
-  def register_forward_hook(hook_fn)
-  Argument hook_fn(cell_id, inputs, outputs)-> New outputs or None
-
-  v2.4.0 interface:
-
-  def register_forward_hook(hook_fn)
-  Argument hook_fn(cell, inputs, outputs)-> New outputs or None
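
The note on the two hook changes above says the old `cell_id` can be recovered with `id(cell)`. A minimal sketch of that migration follows. It is plain Python, and `adapt_legacy_hook` and `DummyCell` are hypothetical names used for illustration, not MindSpore APIs.

```python
from typing import Any, Callable


def adapt_legacy_hook(legacy_hook: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a hook written for the old (cell_id, ...) signature.

    The wrapper accepts the v2.4.0 style call (cell, ...) and forwards
    id(cell) in place of the cell object, as the release note suggests.
    """
    def hook_fn(cell, *args):
        return legacy_hook(id(cell), *args)
    return hook_fn


class DummyCell:
    """Stand-in for a Cell; used only to exercise the wrapper."""


seen_ids = []


def legacy_forward_hook(cell_id, inputs, outputs):
    # Old-style hook: records the id and leaves the outputs unchanged.
    seen_ids.append(cell_id)
    return None


cell = DummyCell()
new_style_hook = adapt_legacy_hook(legacy_forward_hook)
result = new_style_hook(cell, (1,), (2,))
```

In real code the wrapped function would be passed to `register_forward_hook` or `register_backward_hook`. Rewriting the hook to take the cell directly is usually cleaner when the hook body is under your control.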
-
-- Interface name: mindspore.communication.comm_func.all_reduce
-
-  Change: all_reduce adds the `async_op` argument, and its return value changes from a Tensor to a tuple of a Tensor and a CommHandle.
-
-  Note: `async_op` specifies whether all_reduce runs with multi-stream parallelism; the default is False.
-
-  Original interface:
-
-  def all_reduce(tensor,
-                 op=ReduceOp.SUM,
-                 group=GlobalComm.WORLD_COMM_GROUP)->Tensor
-
-  v2.4.0 interface:
-
-  def all_reduce(tensor,
-                 op=ReduceOp.SUM,
-                 group=GlobalComm.WORLD_COMM_GROUP,
-                 async_op=False)
-                 ->tuple(Tensor, CommHandle)
-
-- Interface name: mindspore.communication.comm_func.all_gather_into_tensor
-
-  Change: all_gather_into_tensor adds the `async_op` argument, and its return value changes from a Tensor to a tuple of a Tensor and a CommHandle.
-
-  Note: `async_op` specifies whether all_gather_into_tensor runs with multi-stream parallelism; the default is False.
-
-  Original interface:
-
-  def all_gather_into_tensor(tensor,
-                             group=GlobalComm.WORLD_COMM_GROUP)->Tensor
-
-  v2.4.0 interface:
-
-  def all_gather_into_tensor(tensor,
-                             group=GlobalComm.WORLD_COMM_GROUP,
-                             async_op=False)->
-                             tuple(Tensor, CommHandle)
-
-- Interface name: mindspore.communication.comm_func.reduce_scatter_tensor
-
-  Change: reduce_scatter_tensor adds the `async_op` argument, and its return value changes from a Tensor to a tuple of a Tensor and a CommHandle.
-
-  Note: `async_op` specifies whether reduce_scatter_tensor runs with multi-stream parallelism; the default is False.
-
-  Original interface:
-
-  def reduce_scatter_tensor(tensor,
-                            op=ReduceOp.SUM,
-                            group=GlobalComm.WORLD_COMM_GROUP)->Tensor
-
-  v2.4.0 interface:
-
-  def reduce_scatter_tensor(tensor,
-                            op=ReduceOp.SUM,
-                            group=GlobalComm.WORLD_COMM_GROUP,
-                            async_op=False)->
-                            tuple(Tensor, CommHandle)
-
-- Interface name: mindspore.communication.comm_func.isend
-
-  Change: the return value changes from a Tensor to a CommHandle.
-
-  Note: isend enables multi-stream parallelism by default.
-
-  Original interface:
-
-  def isend(tensor,
-            dst=0,
-            group=GlobalComm.WORLD_COMM_GROUP,
-            tag=0)->Tensor
-
-  v2.4.0 interface:
-
-  def isend(tensor,
-            dst=0,
-            group=GlobalComm.WORLD_COMM_GROUP,
-            tag=0)->CommHandle
-
-- Interface name: mindspore.communication.comm_func.irecv
-
-  Change: the return value changes from a Tensor to a CommHandle.
-
-  Note: irecv enables multi-stream parallelism by default.
-
-  Original interface:
-
-  def irecv(tensor,
-            src=0,
-            group=GlobalComm.WORLD_COMM_GROUP,
-            tag=0)->Tensor
-
-  v2.4.0 interface:
-
-  def irecv(tensor,
-            src=0,
-            group=GlobalComm.WORLD_COMM_GROUP,
-            tag=0)->CommHandle
-
-- Interface name: mindspore.communication.comm_func.all_to_all_with_output_shape
-
-  Change: all_to_all_with_output_shape adds the `async_op` argument, and its return value changes from a tuple of Tensors to a tuple made of that tuple of Tensors and a CommHandle.
-
-  Note: `async_op` specifies whether all_to_all_with_output_shape runs with multi-stream parallelism; the default is False.
-
-  Original interface:
-
-  def all_to_all_with_output_shape(output_shape_list,
-                                   input_tensor_list,
-                                   group=None)->tuple(Tensor)
-
-  v2.4.0 interface:
-
-  def all_to_all_with_output_shape(output_shape_list,
-                                   input_tensor_list,
-                                   group=None,
-                                   async_op=False)->
-                                   tuple(tuple(Tensor),
-                                   CommHandle)
-
-- Interface name: mindspore.communication.comm_func.all_to_all_single_with_output_shape
-
-  Change: all_to_all_single_with_output_shape adds the `async_op` argument, and its return value changes from a Tensor to a tuple of a Tensor and a CommHandle.
-
-  Note: `async_op` specifies whether all_to_all_single_with_output_shape runs with multi-stream parallelism; the default is False.
-
-  Original interface:
-
-  def all_to_all_single_with_output_shape(output_shape,
-                                          tensor,
-                                          output_split_sizes=None,
-                                          input_split_sizes=None,
-                                          group=None)->Tensor
-
-  v2.4.0 interface:
-
-  def all_to_all_single_with_output_shape(output_shape,
-                                          tensor,
-                                          output_split_sizes=None,
-                                          input_split_sizes=None,
-                                          group=None,
-                                          async_op=False)->
-                                          tuple(Tensor, CommHandle)
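
Callers of the `comm_func` interfaces listed above that used to receive a bare Tensor now have to unpack a tuple. The sketch below shows one possible way to handle that. It rests on assumptions that are not stated in these notes: that the handle exposes a `wait()` method and that it is `None` when `async_op=False`. `wait_and_get` and `FakeHandle` are hypothetical names, and no MindSpore call is made.

```python
def wait_and_get(result):
    """Unpack a v2.4.0 style (output, handle) return value.

    If a handle is present, wait on it before handing back the output.
    Assumes the handle has a wait() method and may be None when the
    call was made with async_op=False.
    """
    output, handle = result
    if handle is not None:
        handle.wait()
    return output


class FakeHandle:
    """Stand-in for a communication handle; used only for this demo."""

    def __init__(self):
        self.waited = False

    def wait(self):
        self.waited = True


fake = FakeHandle()
async_output = wait_and_get(("reduced-tensor", fake))  # async_op=True style
sync_output = wait_and_get(("reduced-tensor", None))   # async_op=False style
```

Waiting immediately gives up the overlap that `async_op=True` is meant to provide. In practice the `wait()` call would be deferred until the output is actually needed.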
- -### 贡献者 - -anyrenwei,bantao,baochong,Bellatan,BJ-WANG,caifubi,candanzg,candyhong,Carey,cccc1111,ccsszz,changzherui,chengbin,chengfeng27,chengxb7532,chenjianping,chenweifeng,chujinjin,dairenjie,DavidFFFan,DeshiChen,dingjinshan,emmmmtang,fanyi20,fary86,fengyixing,fix-dryrun,fuchao,fuhouyu,gaoyong10,gengdongjie,gent1e,GuoZhibin,guozhijian,halo,hangq,haozhang,hedongdong,Henry Shi,HighCloud,Hongxing,huandong1,huangbingjian,HuangLe02,huangziling,huda,huiliang166,hujiahui8,huoxinyou,jiangchenglin3,jianghui58,jiangshanfeng,jiaorui,jiaxueyu,jijiarong,jjfeing,JoeyLin,jshawjc,jxl,kairui_kou,kisnwang,kk,lanzhineng,LiangZhibo,lichen,limingqi107,lionelchang,liubuyu,liujunzhu,liuluobin,liyejun,LLLRT,looop5,luochao60,luoxuewei,luoyang,machenggui,maning202007,maoyuanpeng1,Margaret_wangrui,MengXiangyu,mengyuanli,moran,Mrtutu,mylinchi,NaCN,nomindcarry,panzhihui,paolopoggi,pengqi,pierreleca,qiuleilei,qiuyufeng,qiuzhongya,r1chardf1d0,shaoshengqi,shen_haochen,shenhaojing,shenwei41,shihlCST,shilishan,shiro-zzz,shiziyang,shop-pin,shunyuanhan,shuqian0,stavewu,superxf,suteng,tanghuikang,tangmengcheng,tan-wei-cheng,tan-wei-cheng-3260,tianxiaodong,TronZhang,TuDouNi,VectorSL,vincen45,wang_ziqi,wanghenchang,wangjie,wangshaocong,weiyang,wtobill,wudawei,wujueying,wwwbby,xfan233,XianglongZeng,xiaotianci,xiaoxin_zhang,xiaoxiongzhu,xiaoxuanKL,xiaoyao,XinDu,xuxinglei,xuzhubin,yanghaoran,yanglong,yangzhenzhang,yanx,Yanzhi_YI,yao_yf,yefeng,yide12,yihangchen,YijieChen,YingLai Lin,ylw,yuanpeng2024,yuanqi,yuchaojie,Yuheng Wang,YuJianfeng,YukioZzz,yyuse,zangqx,ZeyuHan,zhangbuxue,zhanghaibo,zhangminli,zhangqinghua,zhangyanhui,ZhangZGC,zhangzhen,zhanzhan,zhengzuohe,zhouyaqiang0,zhuguodong,zichun_ye,zjun,zong_shuai,ZPaC,zuochuanyong,zyli2020,程超,蛋蛋de忧桑,狄新凯,范吉斌,冯一航,付国华,胡彬,宦晓玲,黄勇,黄卓,康伟,李良灿,李林杰,李寅杰3,刘崇鸣,刘思铭,刘涛Liu,刘勇琪,刘子涵,吕浩宇,吕昱峰(Nate.River),钱丹,十一雷,孙昊辰,王禹程,王振邦,王梓润,吴大维,熊攀,徐安越,许子豪,俞涵,云骑士,张峻源,张王泽,张栩浩,赵文璇,周莉莉,朱家兴,邹文祥 - -## MindSpore 2.3.1 Release Notes - -### 主要特性及增强 - -- [STABLE] 去掉了在使用 
[Layout](https://www.mindspore.cn/docs/zh-CN/r2.3.1/api_python/mindspore/mindspore.Layout.html#mindspore.Layout) 构建切分策略时,interleaved_parallel 对应的 device_matrix 的值必须为 2 的限制。 -- [STABLE] 增加了用户自定义控制边的环境变量 [MS_CUSTOM_DEPEND_CONFIG_PATH](https://www.mindspore.cn/docs/zh-CN/r2.3.1/note/env_var_list.html),用户可以自行配置控制边以实现更好的通信计算隐藏。 - -### API 变更 - -#### 新增API - -- [STABLE] 新增API[mindspore.mint.repeat_interleave](https://www.mindspore.cn/docs/zh-CN/r2.3.1/api_python/mint/mindspore.mint.repeat_interleave.html)。 - -### 贡献者 - -ccsszz;dairenjie;DeshiChen;fuhouyu;gaoshuanglong;gaoyong10;GuoZhibin;halo;huoxinyou;jiangchao_j;jiaorui;jiaxueyu;jijiarong;JuiceZ;lichen;liujunzhu;liuluobin;LLLRT;looop5;luoyang ;Margaret_wangrui;mengyuanli;panzhihui;pengqi;PingqiLi;Renyuan Zhang;tanghuikang;tianxiaodong;TuDouNi;wudawei;XianglongZeng;xiaosh;xiaoxin_zhang;XinDu;yanghaoran;yanglong;yangruoqi713;Yanzhi_YI;yao_yf;YijieChen;yuchaojie;YuJianfeng;zangqx;zhengzuohe;zhouyaqiang0;ZPaC;zyli2020;胡彬;宦晓玲;康伟;李林杰;刘崇鸣;王禹程;俞涵;周莉莉;邹文祥 - -欢迎以任何形式对项目提供贡献! 
- -## MindSpore Lite 2.3.1 Release Notes - -### 主要特性及增强 - -昇腾后端模型转换时,支持使用配置文件中的[input_shape 参数](https://www.mindspore.cn/lite/docs/zh-CN/r2.3.1/use/cloud_infer/converter_tool_ascend.html)来指定输入尺寸。 - -### API 变更 - -- [ModelGroup接口](https://www.mindspore.cn/lite/docs/zh-CN/r2.3.1/use/cloud_infer/runtime_cpp.html) 新增模型权重共享支持,节省显存。 -- [Model.get_model_info接口](https://www.mindspore.cn/lite/docs/zh-CN/r2.3.1/use/converter_tool.html?highlight=get_model_info) 新增支持获取模型的输入尺寸。 - -### 贡献者 - -熊攀;ZhangZGC;jxl;zhangyanhui;emmmmtang;huandong1;yefeng - -## MindSpore 2.3.0 Release Notes - -### 主要特性及增强 - -#### AutoParallel - -- [STABLE] 扩展函数式并行能力,[mindspore.shard](https://www.mindspore.cn/docs/zh-CN/r2.3.0/api_python/mindspore/mindspore.shard.html)新增支持图模式,图模式下以nn.Cell/function为单位设置输入与权重的并行切分策略,未设置的算子将通过"sharding_propagation"自动配置并行策略;增加支持手动重排布的[mindspore.reshard](https://www.mindspore.cn/docs/zh-CN/r2.3.0/api_python/mindspore/mindspore.reshard.html)接口,通过[mindspore.Layout](https://www.mindspore.cn/docs/zh-CN/r2.3.0/api_python/mindspore/mindspore.Layout.html)对张量设置精准切分策略。 -- [STABLE] 新增Callback接口[mindspore.train.FlopsUtilizationCollector](https://www.mindspore.cn/docs/zh-CN/r2.3.0/api_python/train/mindspore.train.FlopsUtilizationCollector.html)统计模型算力利用率信息MFU和硬件算力利用率信息HFU 。 -- [STABLE] 新增函数式通信接口[mindspore.communication.comm_func](https://www.mindspore.cn/docs/zh-CN/r2.3.0/api_python/mindspore.communication.comm_func.html)。 -- [BETA] O0和O1模式下,优化interleaved pipeline的内存占用。 -- [BETA] 自动并行模式下支持多机场景自动流水线策略生成(暂不支持单机场景自动流水线策略生成),需要将 `parallel_mode` 设置成自动并行 ``auto_parallel`` 并将 `search_mode` 设置成双递归算法 ``recursive_programming``。 - -#### PyNative - -- [STABLE] 优化动态图的基础数据结构,提升算子API性能。 -- [STABLE] Tensor支持[register_hook](https://www.mindspore.cn/docs/zh-CN/r2.3.0/api_python/mindspore/Tensor/mindspore.Tensor.register_hook.html)功能,以便用户打印或者修改Tensor对应的梯度。 -- [STABLE] PyNative模式支持重计算功能,用户可以通过重计算接口降低网络的显存峰值。 - -#### FrontEnd - -- [STABLE] 优化Checkpoint保存、加载基础流程,提升性能20%。 -- [STABLE] 
支持在保存、加载过程中对Checkpoint文件进行CRC校验,提升安全性。 - -#### Dataset - -- [STABLE] 为以下数据增强增加昇腾处理后端支持:Equalize、Rotate、AutoContrast、Posterize、AdjustSharpness、Invert、Solarize、ConvertColor、Erase。 -- [STABLE] 增加视频文件读取、解析功能支持,详见API:[mindspore.dataset.vision.DecodeVideo](https://www.mindspore.cn/docs/zh-CN/r2.3.0/api_python/dataset_vision/mindspore.dataset.vision.DecodeVideo.html)、[mindspore.dataset.vision.read_video](https://www.mindspore.cn/docs/zh-CN/r2.3.0/api_python/dataset_vision/mindspore.dataset.vision.read_video.html#mindspore.dataset.vision.read_video)、[mindspore.dataset.vision.read_video_timestamps](https://www.mindspore.cn/docs/zh-CN/r2.3.0/api_python/dataset_vision/mindspore.dataset.vision.read_video_timestamps.html#mindspore.dataset.vision.read_video_timestamps)。 -- [STABLE] 支持在 `mindspore.dataset.GeneratorDataset`、`mindspore.dataset.Dataset.map` 及 `mindspore.dataset.Dataset.batch` 接口中指定 `max_rowsize` 参数为-1,此时数据多进程所使用的共享内存将随数据大小动态分配并高效运行,无需手动调参。 - -#### Inference - -- [STABLE] 新增LLaMa2、LLaMa3、Qwen1.5等14个大模型支持训推一体架构,实现脚本、分布式策略和运行时的统一,典型大模型训练到推理部署周期下降到天级,通过融合大算子降低推理时延,有效提升网络吞吐量。 - -#### PIJIT - -- [BETA] 支持Python 3.8和Python 3.10的字节码解析,扩大Python版本的支持范围。 -- [BETA] 支持Dynamic Shape、Symbolic Shape作为输入,使能动态输入场景。 -- [BETA] 使能单步构图能力,优化编译时间。 -- [BETA] 通过调整字节码支持了带有副作用的字节码被捕获(STORE_ATTR、STORE_GLOBAL、LIST_APPEND、dict.pop),使能自动混合精度,减少裂图,提升性能。 - -#### Profiler - -- [STABLE] 提供分级Profiler功能,通过profiler_level参数可控制按照不同级别进行性能数据采集。 -- [STABLE] Profiler analyse方法新增mode参数,可配置异步解析模式,性能数据解析与训练并行。 -- [STABLE] Profiler接口新增data_simplification参数,用户可控制性能数据解析完成后是否删除多余数据,节省硬盘空间。 -- [STABLE] Profiler接口增强内存分析功能,用户通过profile_memory参数可采集框架、CANN、硬件的内存申请、释放信息,并可通过[MindStudio工具](https://www.hiascend.com/developer/blog/details/0230130822583032044)进行可视化分析。 -- [BETA] PyNative模式下Timeline整合host profiling信息,包括任务耗时、用户侧堆栈信息。 - -#### Dump - -- [STABLE] 
增强同步和异步Dump功能,统计信息Dump新增L2Norm信息、新增statistic_category字段支持用户自定义需要保存的统计信息,提高Dump易用性。同步和异步Dump支持情况可参考[Dump功能说明](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.0/debug/dump.html#dump功能说明)。 -- [STABLE] 完善同步Dump功能,通过配置op_debug_mode字段使能溢出和异常Dump。 -- [STABLE] 增强同步Dump功能,通过配置stat_calc_mode字段可以使能device计算统计信息(默认在host计算),通过配置sample_mode字段可以进行采样Dump,提升Dump性能。 -- [STABLE] 增强异步Dump功能,支持保存complex64和complex128格式。 - -#### Runtime - -- [STABLE] 支持静态图多级编译,配置为[mindspore.set_context(jit_config={"jit_level": "O0/O1/O2"})](https://www.mindspore.cn/docs/zh-CN/r2.3.0/api_python/mindspore/mindspore.set_context.html),默认值为空,框架根据产品类别自动选择优化级别,Atlas训练产品为O2,其余产品均为O0。 -- [STABLE] 静态图O0/O1下支持通信计算多流并发执行。 -- [STABLE] 新增[内存管理接口](https://www.mindspore.cn/docs/zh-CN/r2.3.0/api_python/mindspore.hal.html#内存管理)。 -- [BETA] 内存池支持虚拟内存碎片整理,在静态图O0/O1下默认使能虚拟内存。 - -#### Ascend - -- [STABLE] 提供昇腾平台上算子内存越界访问检测开关,用户可以通过设置 `mindspore.set_context(ascend_config={"op_debug_option": "oom"})`来检测昇腾平台上算子内部内存越界问题。 -- [BETA] 环境变量[MS_SIMULATION_LEVEL](https://www.mindspore.cn/docs/zh-CN/r2.3.0/note/env_var_list.html)在昇腾平台上新增支持图编译O0执行模式,并可支持编译性能和运行时内存分析。 -- [BETA] 昇腾平台支持通过AOT接入使用[AscendC自定义算子](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.0/operation/op_custom_ascendc.html)。 - -### API变更 - -#### 新增API - -- [STABLE] 新增[mindspore.mint](https://www.mindspore.cn/docs/zh-CN/r2.3.0/api_python/mindspore.mint.html)API,提供了大量的functional、nn、优化器接口,API用法及功能等与业界主流用法一致,方便用户参考使用。mint接口当前是实验性接口,在图编译模式为O0和PyNative模式下性能比ops更优。当前暂不支持图下沉模式及CPU、GPU后端,后续会逐步完善。 - - | mindspore.mint | | | | - |:----|:----|:----|:----| - | mindspore.mint.eye |mindspore.mint.rand_like|mindspore.mint.isfinite|mindspore.mint.any| - | mindspore.mint.ones |mindspore.mint.rand|mindspore.mint.log|mindspore.mint.greater_equal| - | mindspore.mint.ones_like |mindspore.mint.gather|mindspore.mint.logical_and|mindspore.mint.all| - | mindspore.mint.zeros |mindspore.mint.permute|mindspore.mint.logical_not|mindspore.mint.mean| - | mindspore.mint.zeros_like 
|mindspore.mint.repeat_interleave|mindspore.mint.logical_or|mindspore.mint.prod| - | mindspore.mint.arange |mindspore.mint.abs|mindspore.mint.mul|mindspore.mint.sum| - | mindspore.mint.broadcast_to |mindspore.mint.add|mindspore.mint.neg|mindspore.mint.eq| - | mindspore.mint.cat |mindspore.mint.clamp|mindspore.mint.negative|mindspore.mint.ne| - | mindspore.mint.index_select |mindspore.mint.cumsum|mindspore.mint.pow|mindspore.mint.greater| - | mindspore.mint.max |mindspore.mint.atan2|mindspore.mint.reciprocal|mindspore.mint.gt| - | mindspore.mint.min |mindspore.mint.arctan2|mindspore.mint.rsqrt|mindspore.mint.isclose| - | mindspore.mint.scatter_add |mindspore.mint.ceil|mindspore.mint.sigmoid|mindspore.mint.le| - | mindspore.mint.narrow |mindspore.mint.unique|mindspore.mint.sin|mindspore.mint.less_equal| - | mindspore.mint.nonzero |mindspore.mint.div|mindspore.mint.sqrt|mindspore.mint.lt| - | mindspore.mint.normal |mindspore.mint.divide|mindspore.mint.square|mindspore.mint.maximum| - | mindspore.mint.tile |mindspore.mint.erf|mindspore.mint.sub|mindspore.mint.minimum| - | mindspore.mint.topk |mindspore.mint.erfinv|mindspore.mint.tanh|mindspore.mint.inverse| - | mindspore.mint.sort |mindspore.mint.exp|mindspore.mint.bmm|mindspore.mint.searchsorted| - | mindspore.mint.stack |mindspore.mint.floor|mindspore.mint.matmul|mindspore.mint.argmax| - | mindspore.mint.where |mindspore.mint.flip|mindspore.mint.split|mindspore.mint.cos| - | mindspore.mint.less ||| - - | mindspore.mint.nn| - |:----| - | mindspore.mint.nn.Dropout | - | mindspore.mint.nn.Unfold | - | mindspore.mint.nn.Fold | - | mindspore.mint.nn.Linear| - | mindspore.mint.nn.BCEWithLogitsLoss | - - | mindspore.mint.nn.functional|| - |:----|:----| - |mindspore.mint.nn.functional.batch_norm |mindspore.mint.nn.functional.group_norm| - |mindspore.mint.nn.functional.fold |mindspore.mint.nn.functional.layer_norm| - |mindspore.mint.nn.functional.max_pool2d |mindspore.mint.nn.functional.linear| - 
|mindspore.mint.nn.functional.binary_cross_entropy |mindspore.mint.nn.functional.unfold| - |mindspore.mint.nn.functional.sigmoid |mindspore.mint.nn.functional.one_hot| - |mindspore.mint.nn.functional.tanh |mindspore.mint.nn.functional.elu| - |mindspore.mint.nn.functional.binary_cross_entropy_with_logits |mindspore.mint.nn.functional.gelu| - |mindspore.mint.nn.functional.dropout|mindspore.mint.nn.functional.leaky_relu| - |mindspore.mint.nn.functional.embedding |mindspore.mint.nn.functional.silu| - |mindspore.mint.nn.functional.grid_sample|mindspore.mint.nn.functional.softplus| - |mindspore.mint.nn.functional.relu|mindspore.mint.nn.functional.softmax| - |mindspore.mint.nn.functional.pad|| - - | mindspore.mint.optim | - |:----| - | mindspore.mint.optim.AdamW | - - | mindspore.mint.linalg | - |:----| - | mindspore.mint.linalg.inv | - -### 非兼容性接口变更 - -- 接口名称:性能数据采集接口 `Profiler` - - 变更内容:解析生成的性能数据文件进行了精简,将在导出性能数据后删除FRAMEWORK目录数据以及其他多余数据,仅保留profiler的交付件以及PROF_XXX目录下的原始性能数据,以节省空间。通过将 `data_simplification`参数配置为 `False`可关闭精简模式,与历史版本生成的性能数据文件保持一致。 -- 接口名称:Dump功能配置文件中的 `saved_data` 字段为 `"tensor"`。 - - 变更内容:Dump落盘的文件名发生变更,`"/"`用 `"_"`代替,算子名称变为算子全局名称。 - - - - - - - - - -
-  Original file name format:
-
-  {op_type}.{op_name}.{task_id}.{stream_id}.
-  {timestamp}.{input_output_index}.{slot}.{format}.npy
-
-  Example:
-
-  Conv2D.Conv2D-op12.0.0.1623124369613540.output.0.DefaultFormat.npy
-
-  2.3 file name format:
-
-  {op_type}.{op_name}.{task_id}.{stream_id}.
-  {timestamp}.{input_output_index}.{slot}.{format}.npy
-
-  Example:
-
-  Conv2D.Default_network-WithLossCell__backbone-AlexNet_conv3-Conv2d_Conv2D-op12.0.0.1623124369613540.output.0.DefaultFormat.npy
-
-- 接口名称:Dump功能配置文件中的 `saved_data`字段为 `"statistic"`。 - - 变更内容:原默认保存 `"max"`、`"min"`、`"avg"`、`"count"`、`"negative zero count"`、`"positive zero count"`、`"nan count"`、`"negative inf count"`、`"positive inf count"`、`"zero count"`、`md5`统计项,2.3变更为默认保存 `"max"`、`"min"`、`"l2norm"`统计项,可以通过配置 `statistic_category`自定义统计项。 - -### 贡献者 - -caifubi;candanzg;ccsszz;chaiyouheng;changzherui;chenfei_mindspore;chengbin;chengfeng27;Chong;dairenjie;DavidFFFan;DeshiChen;dingjinshan;douzhixing;emmmmtang;Erpim;fary86;fengyixing;fuhouyu;gaoyong10;GuoZhibin;guozhijian;halo;haozhang;hejianheng;Henry Shi;horcham;huandong1;huangbingjian;Jackson_Wong;jiangchenglin3;jiangshanfeng;jiangzhenguang;jiaorui;bantao;jiaxueyu;jijiarong;JuiceZ;jxl;kairui_kou;lanzhineng;LiangZhibo;lichen;limingqi107;linqingke;liubuyu;liujunzhu;liuluobin;liyan2022;liyejun;LLLRT;looop5;lujiale;luochao60;luoyang;lvxudong;machenggui;maning202007;Margaret_wangrui;master_2;mengyuanli;moran;Mrtutu;NaCN;nomindcarry;panzhihui;pengqi;qiuyufeng;qiuzhongya;Renyuan Zhang;shaoshengqi;Shawny;shen_haochen;shenhaojing;shenwei41;shij1anhan;shilishan;shiziyang;shunyuanhan;shuqian0;TAJh;tanghuikang;tan-wei-cheng;Thibaut;tianxiaodong;TronZhang;TuDouNi;VectorSL;wang_ziqi;wanghenchang;wangjie;weiyang;wudawei;wujiangming;wujueying;XianglongZeng;xiaotianci;xiaoxin_zhang;xiaoxiongzhu;xiaoyao;XinDu;xuxinglei;yangchen;yanghaoran;yanglong;yangruoqi713;yangzhenzhang;yangzishuo;Yanzhi_YI;yao_yf;yefeng;yide12;YijieChen;YingLai Lin;yuchaojie;YuJianfeng;zangqx;zhaiyukun;zhangminli;zhangqinghua;ZhangZGC;zhengxinQian;zhengzuohe;zhouyaqiang0;zhuguodong;zhupuxu;zichun_ye;zjun;zlq2020;ZPaC;zuochuanyong;zyli2020;阿琛;狄新凯;范吉斌;冯一航;胡彬;宦晓玲;黄勇;康伟;雷仪婧;李良灿;李林杰;刘崇鸣;刘力力;刘勇琪;刘子涵;吕浩宇;王禹程;熊攀;徐安越;徐永飞;俞涵;张王泽;张栩浩;郑裔;周莉莉;周先琪;朱家兴;邹文祥 - -欢迎以任何形式对项目提供贡献! 
- -## MindSpore 2.3.0-rc2 Release Notes - -### 主要特性和增强 - -#### AutoParallel - -- [STABLE] Transpose/Sub/Add/Mul/Div/ReLU/Softmax/Sigmoid算子支持配置Layout。 -- [STABLE] 集合通信精度会影响网络收敛,在接口mindspore.set_auto_parallel_context提供配置项[force_fp32_communication](https://www.mindspore.cn/docs/zh-CN/r2.3.0rc2/api_python/mindspore/mindspore.set_auto_parallel_context.html),设为True时可以强制将reduce类通信算子的通信类型转为float32。 -- [BETA] pipeline并行支持Interleave调度,优化micro batch大小受限场景下的模型性能。 -- [BETA] 优化pipeline并行场景下提高模型转换速度,支持单个stage单独转换。 - -#### PyNative - -- [BETA] 动态图下支持[重计算](https://www.mindspore.cn/docs/zh-CN/r2.3.0rc2/api_python/mindspore/mindspore.recompute.html)功能。 -- [STABLE] 动态图下支持[register_hook](https://www.mindspore.cn/docs/zh-CN/r2.3.0rc2/api_python/mindspore/Tensor/mindspore.Tensor.register_hook.html#mindspore.Tensor.register_hook)功能。 - -### API变更 - -增加[动态组网](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.0rc2/parallel/dynamic_cluster.html)场景下各类超时时间环境变量配置: - -- `MS_TOPO_TIMEOUT`: 集群组网阶段超时时间,单位:秒。 -- `MS_NODE_TIMEOUT`:节点心跳超时时间,单位:秒。 -- `MS_RECEIVE_MSG_TIMEOUT`:节点接收消息超时时间,单位:秒。 - -新增环境变量 `MS_ENABLE_LCCL`,支持昇腾后端单机多卡场景下使用LCCL通信库。 - -### 问题修复 - -- [#I9CR96](https://gitee.com/mindspore/mindspore/issues/I9CR96) 修复在大规模集群下,动态组网启动方式的超时时间不足导致集群启动失败的问题。 -- [#I94AQQ](https://gitee.com/mindspore/mindspore/issues/I94AQQ) 修复ops.Addcdiv算子在图模式下输出shape有误问题。 - -### 贡献者 - -感谢以下人员做出的贡献: - -bantao,caifubi,changzherui,chenfei_mindspore,chenweifeng,dairenjie,dingjinshan,fangzehua,fanyi20,fary86,GuoZhibin,hanhuifeng,haozhang,hedongdong,Henry Shi,huandong1,huangbingjian,huoxinyou,jiangchenglin3,jiangshanfeng,jiaorui,jiaxueyu,jxl,kairui_kou,lichen,limingqi107,liuluobin,LLLRT,looop5,luochao60,luojianing,maning202007,NaCN,niyuxin94520,nomindcarry,shiziyang,tanghuikang,TronZhang,TuDouNi,VectorSL,wang_ziqi,wanghenchang,wudawei,XianglongZeng,xiaoxiongzhu,xiaoyao,yanghaoran,Yanzhi_YI,yao_yf,yide12,YijieChen,YingLai 
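Lin,yuchaojie,YuJianfeng,zangqx,zhanghanLeo,ZhangZGC,zhengzuohe,zhouyaqiang0,zichun_ye,zjun,ZPaC,zyli2020,冯一航,李林杰,刘力力,王禹程,俞涵,张栩浩,朱家兴,邹文祥
-
-Contributions of any kind to the project are welcome!
-
-## MindSpore Lite 2.3.0-rc2 Release Notes
-
-### Major Features and Improvements
-
-- [STABLE] The configuration file used by the cloud-side converter tool supports configuring FlashAttention-related attributes.
-- [STABLE] Supports memory sharing across multiple devices.
-
-### Contributors
-
-Thanks to the following people for their contributions:
-
-emmmmtang,熊攀
-
-Contributions of any kind to the project are welcome!
-
-## MindSpore 2.3.0-rc1 Release Notes
-
-### Major Features and Improvements
-
-#### DataSet
-
-- [STABLE] The MindRecord module adds integrity checking and encryption/decryption to protect the integrity and security of user data.
-- [STABLE] MindRecord interface changes: FileWriter.open_and_set_header is deprecated because its functionality is now built into the FileWriter class; code written for older versions raises an error, and removing the call fixes it. FileWriter now validates the types of written data to ensure that the data types defined in the Schema match the actual data. All class methods of the Mindrecord component no longer return a value; errors are reported to the user as exceptions.
-- [STABLE] Adds Ascend processing backend support for the following data augmentations: ResizedCrop, HorizontalFlip, VerticalFlip, Perspective, Crop, Pad, GaussianBlur, Affine.
-- [STABLE] Improves the data-migration part of the model migration guide, with more examples comparing against third-party frameworks.
-- [STABLE] Improves the parsing efficiency of TFRecordDataset in multi-column scenarios, raising parsing performance by 20%.
-
-#### PIJIT
-
-- [BETA] PIJit analyzes and adjusts Python bytecode and performs graph capture and graph optimization on the execution flow. Supported Python code runs as a static graph; unsupported code is split into subgraphs and runs as a dynamic graph, which unifies dynamic and static execution automatically. Users can enable PIJit by decorating a function with @jit(mode="PIJit", jit_config={options:value}).
-
-#### Inference
-
-- [DEMO] Large-model inference is upgraded to an integrated training-inference architecture that unifies scripts, distributed strategies, and the runtime. The cycle from training to inference deployment for typical large models drops to days, and fused large operators reduce inference latency and raise network throughput.
-
-#### AutoParallel
-
-- [STABLE] Adds the msrun launch method, which starts a distributed job with a single command.
-- [STABLE] Adds a notice that the RankTable launch method is about to be deprecated.
-- [STABLE] Eliminates redundant constants in graph mode, improving compilation performance and memory overhead.
-- [STABLE] In subgraph scenarios, optimizer parallelism inlines the first subgraph, so that some computation and communication can overlap under pipeline parallelism.
-- [STABLE] Communication information export: the model's communication information (communication groups, communication volume) is exported during compilation and passed to the cluster as the basis for communication scheduling.
-- [STABLE] Pipeline-parallel inference optimization: shared weights are no longer forwarded between stages, improving execution performance; pipeline-parallel inference results can be broadcast automatically, making autoregressive inference easier to use.
-- [STABLE] Operator-level parallel sharding supports configuring the mapping between device layout and tensor layout when sharding the MatMul/Add/LayerNorm/GeLU/BiasAdd operators.
-- [STABLE] Supports overlapping gradient communication in the data-parallel dimension with backward computation.
-- [STABLE] Single-device simulated compilation: simulates the compilation flow of one device in multi-device distributed training, to help analyze the front-end and back-end compilation flows and memory usage.
-- [STABLE] The ops.Tril operator supports sharding, reducing the memory and performance requirements on a single device.
-- [BETA] Supports fusing communication operators with computation operators to hide communication overhead and improve network performance.
-- [BETA] During fault recovery, checkpoint loading runs in parallel with compilation to reduce recovery time.
-
-#### Runtime
-
-- [BETA] Supports multi-level compilation (O0/O1/O2), improving static-graph debugging and tuning.
-
-#### FrontEnd
-
-- [STABLE] The framework adds support for the bfloat16 data type; dtype=mindspore.bfloat16 can be specified when creating a Tensor.
-- [STABLE]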
完善rewrite组件的语法支持能力,新增支持对类变量、函数、控制流等语法的解析。 -- [STABLE] 新增context配置项:debug_level,用户可以使用mindspore.set_context(debug_level=mindspore.DEBUG)来获取更多调试信息。 - -#### Profiler - -- [BETA] 动态启停profiling,用户可以根据训练情况实时采集profiling 数据,减少采集数据量。 -- [BETA] Profiling通信算子耗时矩阵,用户通过分析通信算子耗时矩阵,找出集群通信性能瓶颈。 -- [BETA] 提高昇腾环境解析Profiling数据的性能。 -- [BETA] 支持离线解析Profiling生成的数据,用户可以先采集数据,然后根据需要再解析数据。 -- [BETA] 支持采集片上内存、PCIe、l2_cache性能数据,丰富性能分析指标。 - -#### Dump - -- [BETA] Dump保存的统计信息记录MD5值,用户可以通过MD5值确定张量值的微小差异。 -- [BETA] Dump支持bfloat16数据类型,支撑用户定位bfloat16类型的算子精度问题。 - -#### PyNative - -- [STABLE] 重构动态图下单算子调用流程,优化前端算子下发粒度,提升动态图性能。 - -#### Ascend - -- [BETA] 支持用户设置CANN的options配置项,配置项分为global和session二类,用户可以通过mindspore.set_context(ascend_config={"ge_options": {"global": {"global_option": "option_value"}, "session": {"session_option": "option_value"}}})进行配置。 - -#### API变更 - -- 新增 mindspore.hal接口,开放流、事件以及设备管理能力。 -- 新增 mindspore.multiprocessing 接口,提供了创建多进程的能力。 - -#### 算子 - -- [BETA] mindspore.ops.TopK当前支持第二个输入k为Int32类型的张量。 - -### 问题修复 - -- [#I92H93] 修复了昇腾平台下使用Print算子打印字符串对象时,Print算子报错Launch kernel failed的问题。 -- [#I8S6LY] 修复了昇腾平台图模式动态shape流程下,变长输入算子(如 AddN、Concat)报错RuntimeError: Attribute dyn_input_sizes of Default/AddN-op1 is [const vector]{}, of which size is less than 0的问题。 -- [#I9ADZS] 修复了故障恢复训练场景中,由于dataset恢复效率低导致网络训练出现数据超时的问题。 - -### 贡献者 - -感谢以下人员做出的贡献: - -AlanCheng511,AlanCheng712,bantao,Bingliang,BJ-WANG,Bokai Li,Brian-K,caifubi,cao1zhg,CaoWenbin,ccsszz,chaiyouheng,changzherui,chenfei_mindspore,chengbin,chengfeng27,chengxb7532,chenjianping,chenkang,chenweifeng,Chong,chuht,chujinjin,Cynthia叶,dairenjie,DavidFFFan,DeshiChen,douzhixing,emmmmtang,Erpim,fangzhou0329,fary86,fengxun,fengyixing,fuhouyu,gaoshuanglong,gaoyong10,GaoZhenlong,gengdongjie,gent1e,Greatpan,GTT,guoqi,guoxiaokang1,GuoZhibin,guozhijian,hangq,hanhuifeng,haozhang,hedongdong,hejianheng,Henry 
Shi,heyingjiao,HighCloud,Hongxing,huandong1,huangbingjian,HuangLe02,huangxinjing,huangziling,hujiahui8,huoxinyou,jiangchenglin3,jianghui58,jiangshanfeng,jiaorui,jiaxueyu,JichenZhao,jijiarong,jjfeing,JoeyLin,JuiceZ,jxl,kairui_kou,kate,KevinYi,kisnwang,lanzhineng,liangchenghui,LiangZhibo,lianliguang,lichen,ligan,lihao,limingqi107,ling,linqingke,liruyu,liubuyu,liuchao,liuchengji,liujunzhu,liuluobin,liutongtong9,liuzhuoran2333,liyan2022,liyejun,LLLRT,looop5,luochao60,luojianing,luoyang,LV,machenggui,maning202007,Margaret_wangrui,MaZhiming,mengyuanli,MooYeh,moran,Mrtutu,NaCN,nomindcarry,panshaowu,panzhihui,PingqiLi,qinzheng,qiuzhongya,Rice,shaojunsong,Shawny,shenwei41,shenyaxin,shunyuanhan,silver,Songyuanwei,tangdezhi_123,tanghuikang,tan-wei-cheng,TingWang,TronZhang,TuDouNi,VectorSL,WANG Cong,wang_ziqi,wanghenchang,wangpingan,wangshaocong,wangtongyu6,weiyang,WinXPQAQ,wtcheng,wudawei,wujiangming,wujueying,wuweikang,wwwbby,XianglongZeng,xiaosh,xiaotianci,xiaoxin_zhang,xiaoxiongzhu,xiaoyao,XinDu,xingzhongfan,yanghaoran,yangluhang,yangruoqi713,yangzhenzhang,yangzishuo,yanjiaming,Yanzhi_YI,yao_yf,yefeng,yeyunpeng2020,yide12,YijieChen,YingLai Lin,YingtongHu,youshu,yuchaojie,YuJianfeng,zangqx,zby,zhaiyukun,zhangdanyang,zhanghaibo,zhanghanLeo,zhangminli,zhangqinghua,zhangyanhui,zhangyifan,zhangyinxia,zhangyongxian,ZhangZGC,zhanzhan,zhaoting,zhengyafei,zhengzuohe,ZhihaoLi,zhouyaqiang0,zhuguodong,zhumingming,zhupuxu,zichun_ye,zjun,zlq2020,ZPaC,zuochuanyong,zyli2020,陈宇,代宇鑫,狄新凯,范吉斌,冯一航,胡彬,宦晓玲,黄勇,康伟,李良灿,李林杰,刘崇鸣,刘力力,刘勇琪,吕浩宇,没有窗户的小巷,王禹程,吴蕴溥,熊攀,徐安越,徐永飞,许哲纶,俞涵,张峻源,张树仁,张王泽,张栩浩,郑裔,周莉莉,周先琪,朱家兴,邹文祥 - -欢迎以任何形式对项目提供贡献! 
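- -## MindSpore 2.2.13 Release Notes - -### API Changes - -Added environment variables for configuring the various timeouts in dynamic cluster networking scenarios: - -- `MS_TOPO_TIMEOUT`: timeout of the cluster networking phase, in seconds. -- `MS_CLUSTER_RETRY_NUM`: number of times a node retries registration during the cluster networking phase. -- `MS_NODE_TIMEOUT`: node heartbeat timeout, in seconds. -- `MS_RECEIVE_MSG_TIMEOUT`: timeout for a node to receive messages, in seconds. - -### Bug Fixes - -- [#I9CR96] Fixed cluster startup failures in large-scale clusters caused by insufficient timeouts in the dynamic networking startup mode. - -### Contributors - -Thanks to the following people for their contributions: - -ZPaC, limingqi107, lizhenyu, jiangshanfeng - -Contributions of any kind are welcome! - -## MindSpore 2.2.12 Release Notes - -### Major Features and Improvements - -- [STABLE] Reduced the number of Cast operators when network parameters are initialized in fp32 and optimizer parallelism is enabled. -- [STABLE] Added detection and handling of silent faults. Silent faults cause abnormal training; this feature helps users avoid or greatly reduce the losses incurred when a cluster is shut down and inspected to locate such faults. - -### Bug Fixes - -- [#I97D1L] Fixed errors in the examples of the dynamic learning rate interfaces ReduceLROnPlateau, LRScheduler, and CosineAnnealingWarmRestarts. -- [#I970HV] Fixed the issue that allgather/reducescatter across multiple devices did not preserve order. -- [#I99JPI] Fixed a bug when loading parameters of type bfloat16 from a checkpoint in fuzzy-matching scenarios. - -### Contributors - -Thanks to the following people for their contributions: - -yao_yf, YijieChen, 冯一航, yuchaojie, 李良灿, YuJianfeng, huangxinjing, GuoZhibin, looop5 - -Contributions of any kind are welcome! - -## MindSpore 2.2.11 Release Notes - -### Major Features and Improvements - -#### scipy - -- [STABLE] Added the scipy module API mindspore.scipy.optimize.linear_sum_assignment for solving the linear sum assignment problem: given a cost matrix, it finds the assignment with the lowest total cost. - -### Bug Fixes - -- [#I8JVRU] Fixed the issue that the bernoulli random operator occasionally produced identical results when run twice on GPU. -- [#I8OC32] Fixed a segmentation fault caused by the MatrixSetDiagV3 operator not validating abnormal inputs. - -### Contributors - -Thanks to the following people for their contributions: - -fary86, wanghenchang, haozhang, mengyuanli, emmmmtang, luoyang, zhupuxu, zhangyongxian, liuluobin, LLLRT, TuDouNi, hujiahui8, wangtongyu6, ligan, zhuguodong, yanghaoran, YingtongHu, liyejun, zjun, 徐永飞, chuht, 张树仁, 徐安越, DeshiChen, shenyaxin, liujunzhu, shunyuanhan, yuchaojie, yao_yf, 没有窗户的小巷, yeyunpeng2020, weiyang, KevinYi, hedongdong, zhouyaqiang0, Margaret_wangrui, zhanghaibo, moran, huangziling, 朱家兴, GuoZhibin, 李良灿, jiaxueyu, gaoyong10, Greatpan, 宦晓玲, melody, 俞涵, jiangshanfeng, XinDu, ling, caifubi, zhangyinxia, gengdongjie, Erpim, XianglongZeng, zhangminli, fengyixing, 冯一航, 黄勇, panzhihui, 胡彬, linqingke, wangshaocong - -Contributions of any kind are welcome!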
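The linear sum assignment problem mentioned in the 2.2.11 notes can be illustrated with a small brute-force sketch. It does not call MindSpore: the helper name `brute_force_assignment` and the sample cost matrix are assumptions made for illustration, and the signature of `mindspore.scipy.optimize.linear_sum_assignment` is not given in these notes. The sketch only shows what "the assignment with the lowest total cost" means for a square cost matrix.

```python
from itertools import permutations


def brute_force_assignment(cost):
    """Return (row_indices, col_indices, total_cost) for a square cost matrix.

    Hypothetical helper: tries every one-to-one row/column pairing, so it is
    only practical for tiny matrices.
    """
    n = len(cost)
    best_cols, best_total = None, None
    for cols in permutations(range(n)):
        # Row r is assigned to column cols[r]; sum the cost of that pairing.
        total = sum(cost[r][c] for r, c in enumerate(cols))
        if best_total is None or total < best_total:
            best_cols, best_total = cols, total
    return list(range(n)), list(best_cols), best_total


# Sample cost matrix: cost[i][j] is the cost of giving job j to worker i.
cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]
rows, cols, total = brute_force_assignment(cost)
print(rows, cols, total)
```

Enumerating all permutations takes factorial time, which is why a dedicated solver is used for anything beyond toy inputs.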
- -## MindSpore Lite 2.2.11 Release Notes - -### 问题修复 - -- [#I8TPLY] 修复 SSD MobileNetV2 FPN 网络在Atlas 推理系列产品平台上的推理失败问题。 - -### 贡献者 - -感谢以下人员做出的贡献: - -wangtongyu6, zhuguodong, 徐永飞, 徐安越, yeyunpeng2020, moran, XinDu, gengdongjie. - -欢迎以任何形式对项目提供贡献! - -## MindSpore 2.2.10 Release Notes - -### 主要特性及增强 - -#### 算子 - -- [STABLE] FastGelu、BatchMatMul、AllReduce、AllGather、Broadcast、ReduceScatter算子支持bfloat16数据类型 -- [STABLE] AllGather支持uint8数据类型 - -### 问题修复 - -- [#I8ALW3]修复Faster R-CNN、DeepTextMask、RCNN-ResNet50等网络在Ascend上8卡训练RandomChoiceWithMask算子报错问题 -- [#I8LKG7]修复UNet-2D在Ascend 1卡、8卡图编译报错问题 -- [#I8KU3X]修复CRNN-ResNet34在Ascend 1卡、8卡PyNative模式下训练进程卡住问题 -- [#I8KTHH]修复在Ascend 8卡上使能enable_parallel_optimizer=True,不使用allreduce分组融合时,BERT网络训练报错问题 - -### 贡献者 - -感谢以下人员做出的贡献: - -李林杰, TuDouNi, chengxb7532, Henry Shi, rms-infer-type, 朱家兴, zhouyaqiang0, tanghuikang, gaoyong10, gengdongjie, yao_yf, hujiahui8, hanhuifeng, shenyaxin, KevinYi, 冯一航, chengfeng27, JuiceZ, zhangyanhui, jijiarong, xiaoxiongzhu, 没有窗户的小巷, ling, liyan2022, haozhang, zangqx, xiaoyao, liujunzhu, 胡彬, panzhihui, wangshaocong, linqingke, jianghui58, qiuzhongya, yangruoqi713, zhangminli, moran, 王禹程, shaojunsong, wangtongyu6, zhupuxu, luoyang, 徐安越, qinzheng, caifubi, 徐永飞, chenkang, youshu, XinDu, liubuyu, jxl, yeyunpeng2020, huoxinyou, yefeng, jiaorui, wangpingan, cao1zhg, zjun, zyli2020, yanjiaming, Cynthia叶, 胡安东, 李良灿, liruyu, liuluobin, lihao, huangbingjian, YijieChen, jjfeing, looop5, 刘力力, xiaoxin_zhang, yangluhang, chenweifeng, jiangshanfeng, zichun_ye, 陈宇, NaCN, ligan, YingLai Lin, huangziling, chenjianping, DeshiChen, chengbin, kairui_kou, ccsszz, yanghaoran, zhangdanyang, Yanzhi_YI, zhengzuohe, hangq, TronZhang, wanghenchang, HighCloud, 吕浩宇, VectorSL, ZPaC, mengyuanli, maning202007, 刘勇琪, r1chardf1d0, fary86, 刘崇鸣, yuchaojie, douzhixing, fengyixing - -欢迎以任何形式对项目提供贡献! 
- -## MindSpore Lite 2.2.10 Release Notes - -### 问题修复 - -- [#I8K7CC]优化get_model_info接口传入非str字段的报错 - -### 贡献者 - -感谢以下人员做出的贡献: - -gengdongjie, zhangyanhui, xiaoxiongzhu, wangshaocong, jianghui58, moran, wangtongyu6, 徐安越, qinzheng, 徐永飞, youshu, XinDu, yeyunpeng2020, yefeng, wangpingan, zjun, 胡安东, 刘力力, 陈宇, chenjianping, kairui_kou, zhangdanyang, hangq, mengyuanli, 刘崇鸣 - -欢迎以任何形式对项目提供贡献! - -## MindSpore 2.2.1 Release Notes - -### Bug Fixes - -- [#I7R3R5] 修复昇腾平台ResNet-50网络精度劣化问题。 -- [#I8A9RH] 修复昇腾平台DBNet(ResNet-50)网络精度劣化问题。 -- [#I8B8IW] 修复多维Tensor赋值越界导致段错误的问题。 -- [#I8J0F4] 修复多维Tensor扩展维度在动态图执行失败的问题。 -- [#I87P3P] 修复昇腾平台二次训练编译缓存加载失败的问题。 -- [#I86GP9] 修复昇腾平台UNet3D网络推理精度劣化问题。 -- [#I89B4K] 修复Windows平台动态图动态rank执行卡住的问题。 -- [#I8CX0C] 修复昇腾平台上动态图混合精度模式下偶现失败的问题。 -- [#I8BGCF] 修复昇腾平台AIRNet网络动态图模式下执行出现段错误的问题。 -- [#I8L5DS] 修复昇腾平台ResNet-50图像分割网络动态图执行慢的问题。 - -### 贡献者 - -感谢以下人员做出的贡献: - -yufan, dingcheng, lvzhangcheng, zhunaipan, fangwenyi, weiyang, changzherui, chujinjin, zangqingxiang, yuchaojie, wuweikang, tanghuikang, xiaoyao, huangbinjian, zhoupeichen, chenfei_mindspore, hedongdong, wangnan, zhengzuohe, yanghaoran, zouliqin, luoyang, liuchongmin, lujiale, machenggui, wangcong, lixiangyi, wangting, huangyong - -欢迎以任何形式对项目提供贡献! - -## MindSpore Lite 2.2.1 Release Notes - -### Bug Fixes - -- [#I88055] 修复MindSpore Lite推理gridsample算子format设置错误的问题。 -- [#I8D80Y] 修复MindSpore Lite推理单算子调用流程资源释放异常的问题。 - -### 贡献者 - -感谢以下人员做出的贡献: - -zhanghaibo, wangsiyuan, yefeng, wangshaocong, chenjianping - -欢迎以任何形式对项目提供贡献! 
- -## MindSpore 2.2.0 Release Notes - -### 主要特性和增强 - -#### DataSet - -- [STABLE] 数据操作map/batch的`row_size`参数扩展支持传入list,代表[输入共享内存, 输出共享内存],以便在多进程模式时灵活控制共享内存的大小。 -- [STABLE] 为官网API文档页面mindspore.dataset、mindspore.dataset.transforms、mindspore.mindrecord的所有API补充完善样例,方便用户参考。 -- [STABLE] ConcatDataset支持全局采样能力,即使用concat操作组合多来源数据后,可以对数据进行全局随机采样以增强数据多样性。 -- [STABLE] 使用model.train接口训练时,支持通过TimeMonitor(.., data_time=True)实时监控数据处理性能。 -- [STABLE] 引入jemalloc库,解决在极端场景下,因内存碎片回收不及时导致内存缓慢上涨问题。 - -#### FrontEnd - -- [STABLE] 支持添加@lazy_inline装饰器来标注Cell生成的子图延迟inline,从而有效提升编译性能。 -- [STABLE] 新增CellDict数据结构,支持构建Dict类型的Cell对象,完善构建网络能力。 -- [STABLE] 混合精度训练的功能优化,支持通过rewrite自动改写python脚本实现混合精度策略,支持函数、分支语句等多种语法自动解析。 -- [STABLE] 动态学习率功能优化,新增MultiStepLR等API;get_lr方法与global_step解耦,扩展优化器模块功能。 -- [STABLE] 优化API代码样例、API差异表以及高阶函数使用教程。 - -#### 算子 - -- [STABLE] 新增算子原语`mindspore.ops.Dense`。 -- [STABLE] 新增随机数算子状态管理功能,使随机数算子可以保存随机数状态,并在模型并行、重计算等场景稳定复现。当前仅支持CPU/GPU平台,涉及的随机数算子包括:`mindspore.ops.Multinomial`、`mindspore.ops.MultinomialWithReplacement`、`mindspore.ops.ParameterizedTruncatedNormal`、`mindspore.ops.StandardLaplace`、`mindspore.ops.StandardLaplace`、`mindspore.ops.Uniform`、`mindspore.ops.UniformInt`、`mindspore.ops.UniformReal`、`mindspore.ops.UniformInt`、`mindspore.ops.Dropout`、`mindspore.ops.RandomChoiceWithMask`、`mindspore.ops.RandomCategorical`、`mindspore.ops.RandomShuffle`、`mindspore.ops.RandamGamma`、`mindspore.ops.RandomPoisson`、`mindspore.ops.TruncatedNormal`。 -- [STABLE] 当GPU算子遇到非法输入场景,支持在算子的CUDA核函数中异步打印报错日志到Host侧,并中断当前CUDA Stream的执行,提高用户算子问题的定位效率。 - -#### PyNative - -- [STABLE] PyNative模式下支持View机制。 -- [STABLE] PyNative模式下功能增强:sens支持dict类型输入。 - -#### Ascend - -- [STABLE] 支持用户可配置算子高精度/高性能模式,用户可以通过`mindspore.set_context(ascend_config={"op_precision_mode": "/path/to/op_precision_config_file"})`对部分TBE算子配置高精度/高性能模式。 -- [BETA] 支持用户可配置fp16进fp32出的算子,用户可以通过`mindspore.set_context(ascend_config={"precision_mode": "force_fp32"})`对TBE Cube算子配置fp16进fp32出。 -- [BETA] 去除jit level 
"O3"与GE流程强绑定,用户在执行GE流程时无需再设置`jit_level="O3"`。 - -#### Parallel - -- [STABLE] 支持半自动/全自动模式下,非流水线并行场景的梯度累加特性,用户可以通过`net = GradAccumulationCell(net, micro_size)`方式,对网络使能梯度累加。梯度累加特性同样支持LazyInline编译加速。 - -#### 推理 - -自2.2版本起MindSpore主发布包不再提供配套310的推理接口使能,如需使用请切换安装MindSpore Lite发布包或下载MindSpore2.0之前的版本。MindSpore lite的安装部署与用法详见 。昇腾(Ascend)310是面向边缘场景的高能效高集成度AI处理器,支持对MindIR格式模型进行推理。原先MindSpore提供了两种在Ascend 310硬件上的推理使能用法: - -1. 由MindSpore主发布包提供配套Ascend 310的版本,支持C++推理接口。 -2. 由MindSpore Lite发布包提供配套Ascend的版本,支持C++/Java两种语言进行推理。 - -这两种方案提供的C++ API基本一致,后续不再构建和维护两套接口,而是归一使用MindSpore Lite。原有基于MindSpore主发布包构建的310推理业务,可以少量修改切换到MindSpore Lite,详见 。 - -### Bug fixes - -- [I7SDA0] 修复了昇腾平台上CRNN网络精度劣化的问题。 -- [I7T4QK] 修复了昇腾平台上wgan网络推理精度劣化问题。 -- [I7TJ8Z] 修复了昇腾平台上lgtm网络推理精度劣化问题。 -- [I7M58O] 修复了昇腾平台上ASR-dynamic网络训练core-dump的问题 -- [I7L6B6] 修复了dataset多进程模式时,子进程在某些场景不退出的问题。 -- [I7L7AE] 修复了dataset处理中包含repeat操作,且dataset.batch中使用动态batch时,batchinfo.get_epoch_num()计算不正确的问题。 -- [I7UY7G] 修复OBSMindDataset中对于文件权限修改的异常的报错。 - -### 贡献者 - -感谢以下人员做出的贡献: -bantao, Bingliang, BJ-WANG, Brian-K, caifubi, ccsszz, changzherui, chenfei_mindspore, chengfeng27, chenhaozhe, chenjianping, chenkang, chenweifeng, chuht, chujinjin, CShu0507, Cynthia叶, DeshiChen, douzhixing, Erpim, Etienne, fary86, fengxun, fengyixing, gaoshuanglong, Gaoxiong, gaoyong10, GaoZhenlong, Greatpan, GuoZhibin, guozhijian, hangq, hanhuifeng, haozhang, hedongdong, Henry Shi, HighCloud, Hongxing, huangbingjian, huanghui, huangxinjing, huangziling, hujiahui8, huoxinyou, HWalkingMan, jianghui58, jiangshanfeng, jiaorui, jijiarong, jjfeing, JuiceZ, jxl, KevinYi, kisnwang, KXiong, lanzhineng, Li Qingguo, LiangZhibo, lianliguang, ligan, lihao, Lihoon, limingqi107, ling, linqingke, liruyu, liubuyu, liuchao, liujunzhu, liuluobin, liupeng303, liutongtong9, liyan2022, liyejun, looop5, luochao60, luojianing, luoyang, machenggui, maning202007, Margaret_wangrui, MaZhiming, mengyuanli, moran, NaCN, nomindcarry, panshaowu, panzhihui, qinzheng, qiuzhongya, r1chardf1d0, 
shaojunsong, shenwei41, shenyaxin, shenzhangyi, Shira Zaloshinski, shunyuanhan, tangdezhi_123, tanghuikang, tan-wei-cheng, tan-wei-cheng-3260, TronZhang, TuDouNi, VectorSL, wang_ziqi, wanghenchang, wangpingan, wangshaocong, wangtongyu6, wtcheng, wujueying, XianglongZeng, xiaotianci, xiaoxin_zhang, xiaoxiongzhu, xiaoyao, xiaoyuanyuan, XinDu, xujinliang, xupan, yanghaoran, yangluhang, yangruoqi713, yangsijia, yangzhenzhang, yangzishuo, yanjiaming, Yanzhi_YI, yao_yf, yefeng, yeyunpeng2020, yide12, YijieChen, YingLai Lin, YingtongHu, yonibaehr, youshu, yuchaojie, YuJianfeng, zangqx, zhaizhiqiang, zhangbuxue, zhangchunlei, zhangdanyang, zhangdong, zhanghaibo, zhangminli, zhangqi, zhangqinghua, zhangyanhui, zhangyifan, zhangyongxian, zhangzhen, zhangzheng, zhanzhan, zhengzuohe, ZhihaoLi, zhoufeng, zhouyaqiang0, zhuguodong, zhupuxu, zichun_ye, zjun, ZPaC, zuochuanyong, zyli2020, 陈宇, 程超, 范吉斌, 冯浩, 冯一航, 胡彬, 宦晓玲, 黄勇, 雷元哲, 黎冠新, 李良灿, 李林杰, 刘崇鸣, 刘力力, 刘思铭, 刘勇琪, 吕浩宇, 没有窗户的小巷, 沈竞兴, 王禹程, 王振邦, 徐安越, 徐永飞, 俞涵, 张澍坤, 周超, 朱家兴 - -欢迎以任何形式对项目提供贡献! - -## MindSpore Lite 2.2.0 Release Notes - -### 主要特性和增强 - -#### 支持FlashAttention算子融合 - -- [STABLE] 在Ascend系列硬件上,支持LLAMA、stable diffusion系列模型的FlashAttention大算子融合。 - -## MindSpore 2.1.1 Release Notes - -### Bug fixes - -- [I7Q9RX] 昇腾平台支持不同硬件类型自适应识别。 -- [I7SDA0] 修复了昇腾平台上CRNN网络精度劣化的问题。 -- [I7T4QK] 修复了昇腾平台上wgan网络推理精度劣化问题。 -- [I7TJ8Z] 修复了昇腾平台上lgtm网络推理精度劣化问题。 - -### 贡献者 - -感谢以下人员做出的贡献: - -changzherui, chenfei_mindspore, chenjianping, chenkang, chenweifeng, chujinjin, fangwenyi, GuoZhibin, guozhijian, hangq, hanhuifeng, haozhang, hedongdong, 尤澍, zhoufeng, 代宇鑫 - -欢迎以任何形式对项目提供贡献! 
- -## MindSpore Lite 2.1.1 Release Notes - -### Major Features and Improvements - -- [STABLE] MindSpore Lite Cloud Inference adds support for Python 3.8 and Python 3.9 - -## MindSpore 2.1.0 Release Notes - -### 主要特性和增强 - -#### FrontEnd - -- [BETA] JIT Fallback支持变量场景:静态图模式下,支持返回Dict类型和Scalar类型,支持对非Parameter类型对象进行属性设置, 支持List的部分就地修改操作,完善支持NumPy等第三方库,支持用户自定义类的相关操作,支持Python基础运算符、内置函数使用更多数据类型,兼容控制流、副作用、自动微分等功能。具体用法请参考[静态图语法支持](https://www.mindspore.cn/docs/zh-CN/r2.1/note/static_graph_syntax_support.html)。 -- [BETA] 静态图模式下,优化控制流场景中使用未定义变量的报错。使用if、while、for控制流分支内定义的变量,需在控制流之前初始化定义变量。 -- [STABLE] 新增ReWrite功能,支持基于自定义规则修改网络结构,提供对多个网络进行批量修改的能力。 -- [BETA] 新增optim_ex优化器模块,扩展现有功能,支持全量优化器参数分组策略的设置、支持运行中通过赋值的方式修改参数等功能。 -- [STABLE] 优化MindSpore与PyTorch的API映射表,详细介绍API在功能、参数、输入、输出和特定场景等方面的差异。 - -#### PyNative - -- 优化动态图模式下动态shape场景的性能。 - -#### DataSet - -- [STABLE] 优化MindRecord数据文件的内存结构,加载百TB级别数据训练可降低60%内存占用。 -- [STABLE] 支持单线程执行数据处理Pipeline,以便用户在数据Pipeline中添加代码对数据处理功能进行调试。 -- [STABLE] 优化了TFRecordDataset的性能,对数据集加载性能提升60%+;优化了batch的性能,对于batch数较大的使用场景性能提升30%。 -- [STABLE] 优化API文档[mindspore.dataset](https://www.mindspore.cn/docs/zh-CN/r2.1/api_python/mindspore.dataset.html) 和 [mindspore.dataset.transforms](https://www.mindspore.cn/docs/zh-CN/r2.1/api_python/mindspore.dataset.transforms.html)的Example示例,并新增了四篇样例库展示数据增强的效果,分别是:[使用数据Pipeline加载 & 处理数据集](https://www.mindspore.cn/docs/zh-CN/r2.1/api_python/mindspore.dataset.html#%E6%95%B0%E6%8D%AE%E5%A4%84%E7%90%86pipeline%E5%BF%AB%E9%80%9F%E4%B8%8A%E6%89%8B)、[视觉变换样例库](https://www.mindspore.cn/docs/zh-CN/r2.1/api_python/mindspore.dataset.transforms.html#%E6%A0%B7%E4%BE%8B%E5%BA%93)、[文本变换样例库](https://www.mindspore.cn/docs/zh-CN/r2.1/api_python/mindspore.dataset.transforms.html#%E6%A0%B7%E4%BE%8B%E5%BA%93-1)、[音频变换样例库](https://www.mindspore.cn/docs/zh-CN/r2.1/api_python/mindspore.dataset.transforms.html#%E6%A0%B7%E4%BE%8B%E5%BA%93-2) - -#### AutoParallel - -- [STABLE] 支持训练过程将参数或者中间结果offload到CPU或NVMe,用户通过配置context开启自动offload功能,扩大可训练模型规模。 - -- 
[STABLE] 自动并行能力增强: - - 1. 典型网络自动策略性能不低于默认配置的90%; - - 2. 支持3D混合并行训练:自动算子级策略生成结合手动配置pipeline切分。 - -#### Runtime - -- [STABLE] 升级OpenMPI版本至4.1.4。 -- [STABLE] 升级NCCL版本至2.16.5。 -- [STABLE] 动态组网场景下单节点内多卡rank连续分配。 -- [STABLE] 动态组网场景下用户无需在脚本中对Scheduler角色进行适配,Scheduler与Worker脚本可保持完全一致。 - -#### Ascend - -- [STABLE] 算子执行发生AIC Error时日志支持输出辅助AIC Error定位的维测信息,信息包括算子task名字、stream id、输入输出及workspace地址等。 -- [STABLE] 针对算子输出为空Tensor的场景为CANN算子提供默认的处理机制,即跳过其算子执行。 -- [STABLE] 在图模式网络模型执行失败时补充相关定位信息,即在rank_${id}/exec_order/目录下产生csv文件,记录每个task的task id和stream id。 - -#### Profiler - -- [STABLE] Profiler支持收集Host侧各个阶段耗时数据。 -- [BETA] Profiler支持收集Host侧各个阶段内存数据。 -- [BETA] Profiler支持收集数据处理算子耗时。 - -### API变更 - -- `mindspore.dataset.GraphData`、`mindspore.dataset.Graph`、`mindspore.dataset.InMemoryGraphDataset`、`mindspore.dataset.ArgoverseDataset`不再进行功能演进并废弃。使用[MindSpore Graph Learning](https://gitee.com/mindspore/graphlearning)进行相关功能替换。对于Model仓库使用到此API的相关网络进行替换时,GCN请参考[Graph Learning GCN](https://gitee.com/mindspore/graphlearning/tree/master/model_zoo/gcn),GAT请参考[Graph Learning GAT](https://gitee.com/mindspore/graphlearning/tree/master/model_zoo/gat)。 -- `mindspore.set_context`新增`jit_syntax_level`选项,用于设置JIT语法支持级别,请参考[set_context](https://www.mindspore.cn/docs/zh-CN/r2.1/api_python/mindspore/mindspore.set_context.html)。 -- 变更了`model.infer_predict_layout`接口,接口新增参数skip_backend_compile,默认值为False。当用户希望跳过后端编译流程获取参数切分策略时可选择设置为True。 - -#### 算子 - -- 新增算子原语`mindspore.ops.ApplyAdamWithAmsgradV2`,推荐通过接口`mindspore.nn.Adam`调用。 -- 新增算子原语`mindspore.ops.UpsampleTrilinear3D`,推荐通过接口`mindspore.ops.interpolate`调用。 -- 新增算子原语`mindspore.ops.UpsampleNearest3D`,推荐通过接口`mindspore.ops.interpolate`调用。 - -#### 接口弃用 - -- 弃用算子原语`mindspore.ops.ScatterNonAliasingAdd`,推荐使用算子原语`mindspore.ops.TensorScatterAdd`替换。 - -#### 非兼容性接口变更 - -- 接口名称:`mindspore.nn.Dense`、`mindspore.nn.Conv1d`、`mindspore.nn.Conv1dTranspose`、`mindspore.nn.Conv2d`、`mindspore.nn.Conv2dTranspose`、`mindspore.nn.Conv3d`、`mindspore.nn.Conv3dTranspose` - - 
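Changes: The parameter initialization strategy is changed. The default value of weight_init changes from "normal" to None, and the default value of bias_init changes from "zeros" to None. - - Description: The default weight initialization method changes from "normal" to HeUniform, applied internally. The default bias initialization method changes from "zeros" to Uniform, applied internally. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Original interface v2.1 interface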
-  mindspore.nn.Dense(in_channels,
-                     out_channels,
-                     weight_init='normal',
-                     bias_init='zeros',
-                     has_bias=True,
-                     activation=None)
-  
-
-  mindspore.nn.Dense(in_channels,
-                     out_channels,
-                     weight_init=None,
-                     bias_init=None,
-                     has_bias=True,
-                     activation=None)
-  
-
-  mindspore.nn.Conv1d(in_channels,
-                      out_channels,
-                      kernel_size,
-                      stride=1,
-                      pad_mode='same',
-                      padding=0,
-                      dilation=1,
-                      group=1,
-                      has_bias=False,
-                      weight_init='normal',
-                      bias_init='zeros')
-  
-
-  mindspore.nn.Conv1d(in_channels,
-                      out_channels,
-                      kernel_size,
-                      stride=1,
-                      pad_mode='same',
-                      padding=0,
-                      dilation=1,
-                      group=1,
-                      has_bias=False,
-                      weight_init=None,
-                      bias_init=None)
-  
-
-  mindspore.nn.Conv1dTranspose(in_channels,
-                               out_channels,
-                               kernel_size,
-                               stride=1,
-                               pad_mode='same',
-                               padding=0,
-                               dilation=1,
-                               group=1,
-                               has_bias=False,
-                               weight_init='normal',
-                               bias_init='zeros')
-  
-
-  mindspore.nn.Conv1dTranspose(in_channels,
-                               out_channels,
-                               kernel_size,
-                               stride=1,
-                               pad_mode='same',
-                               padding=0,
-                               dilation=1,
-                               group=1,
-                               has_bias=False,
-                               weight_init=None,
-                               bias_init=None)
-  
-
-  mindspore.nn.Conv2d(in_channels,
-                      out_channels,
-                      kernel_size,
-                      stride=1,
-                      pad_mode='same',
-                      padding=0,
-                      dilation=1,
-                      group=1,
-                      has_bias=False,
-                      weight_init='normal',
-                      bias_init='zeros',
-                      data_format='NCHW')
-  
-
-  mindspore.nn.Conv2d(in_channels,
-                      out_channels,
-                      kernel_size,
-                      stride=1,
-                      pad_mode='same',
-                      padding=0,
-                      dilation=1,
-                      group=1,
-                      has_bias=False,
-                      weight_init=None,
-                      bias_init=None,
-                      data_format='NCHW')
-  
-
-  mindspore.nn.Conv2dTranspose(in_channels,
-                               out_channels,
-                               kernel_size,
-                               stride=1,
-                               pad_mode='same',
-                               padding=0,
-                               output_padding=0,
-                               dilation=1,
-                               group=1,
-                               has_bias=False,
-                               weight_init='normal',
-                               bias_init='zeros')
-  
-
-  mindspore.nn.Conv2dTranspose(in_channels,
-                               out_channels,
-                               kernel_size,
-                               stride=1,
-                               pad_mode='same',
-                               padding=0,
-                               output_padding=0,
-                               dilation=1,
-                               group=1,
-                               has_bias=False,
-                               weight_init=None,
-                               bias_init=None)
-  
-
-  mindspore.nn.Conv3d(in_channels,
-                      out_channels,
-                      kernel_size,
-                      stride=1,
-                      pad_mode='same',
-                      padding=0,
-                      dilation=1,
-                      group=1,
-                      has_bias=False,
-                      weight_init='normal',
-                      bias_init='zeros',
-                      data_format='NCDHW')
-  
-
-  mindspore.nn.Conv3d(in_channels,
-                      out_channels,
-                      kernel_size,
-                      stride=1,
-                      pad_mode='same',
-                      padding=0,
-                      dilation=1,
-                      group=1,
-                      has_bias=False,
-                      weight_init=None,
-                      bias_init=None,
-                      data_format='NCDHW')
-  
-
-  mindspore.nn.Conv3dTranspose(in_channels,
-                               out_channels,
-                               kernel_size,
-                               stride=1,
-                               pad_mode='same',
-                               padding=0,
-                               dilation=1,
-                               group=1,
-                               output_padding=0,
-                               has_bias=False,
-                               weight_init='normal',
-                               bias_init='zeros',
-                               data_format='NCDHW')
-  
-
-  mindspore.nn.Conv3dTranspose(in_channels,
-                               out_channels,
-                               kernel_size,
-                               stride=1,
-                               pad_mode='same',
-                               padding=0,
-                               dilation=1,
-                               group=1,
-                               output_padding=0,
-                               has_bias=False,
-                               weight_init=None,
-                               bias_init=None,
-                               data_format='NCDHW')
-  
-
- -### Bug fixes - -- [I6TKLW] 修复了昇腾平台上MobileNetV2网络性能劣化的问题。 -- [I7CP5H] 修复了昇腾平台上ASR网络训练失败的问题。 -- [I7I3EZ] 修复了由于Pillow 10.0.0版本变更枚举接口导致run_check()失败的问题。若在低版本MindSpore遇到,则安装10.0.0以下版本Pillow避免此问题。 -- [I7IZ8K] 修复了assignsub接口在PyNative下的精度问题。 -- [I7HGY0] 修复了函数式编程,在PyNative模式数据下沉场景,loss不收敛的问题。 -- [I7J4N3] 修复了Profiler动态Shape模式下生成Step Trace失败的问题。 -- [I7J4N3] 修复了MindInsight并行策略视图展示暂无数据的问题。 -- [I79YY4] 修复了PyNative模式下高阶微分时的SiLU算子错误。 -- [I6NQJQ] 修复了PyNative模式下ScatterUpdate算子动态shape场景下执行概率性失败的问题。 -- [I6Y4G5] 修复了Graph模式下Conv3D算子动态Shape场景下执行失败的问题。 - -### 贡献者 - -感谢以下人员做出的贡献: - -alashkari,anzhengqi,archer2049,B.L.LAN,baihuawei,bichaoyang,BJ-WANG,Bokai Li,Brian-K,caifubi,caiyimeng,cathwong,changzherui,ChenDonYY,chenfei_mindspore,chengang,chengbin,chenhaozhe,chenjianping,chenkang,chenweifeng,chuht,chujinjin,davidanugraha,DavidFFFan,DeshiChen,douzhixing,emmmmtang,Erpim,Ethan,fangwenyi,fangzehua,fangzhou0329,fary86,fengyixing,gaoshuanglong,Gaoxiong,gaoyong10,gengdongjie,gongdaguo1,Greatpan,GuoZhibin,guozhijian,hangq,hanhuifeng,haozhang,hedongdong,Henry Shi,heterogeneous_to_backoff_2_0,huangbingjian,huanghui,huangxinjing,hujiahui8,hujingsong,huoxinyou,jachua,jiahongQian,jianghui58,jiangzhenguang,jiaorui,jiaoy1224,jijiarong,jjfeing,JoeyLin,json,JuiceZ,jxl,kairui_kou,KevinYi,kisnwang,KXiong,laiyongqiang,lanzhineng,liangchenghui,liangzelang,LiangZhibo,lianliguang,lichen,ligan,lijunbin,limingqi107,ling,linqingke,liubuyu,liuchao,liuchuting,liujunzhu,liuluobin,liutongtong9,liuyang811,lixiao,liyan2022,liyejun,liyuxia,looop5,luochao60,luojianing,luoyang,luoyuan,lyqlola,maning202007,maoyaomin,Margaret_wangrui,mayadong,MaZhiming,melody,mengyuanli,michaelzhu_70ab,Mohammad Motallebi,moran,NaCN,nomindcarry,OwenSec,panfengfeng,panshaowu,panzhihui,pkuliuliu,qinzheng,qiuzhongya,qujianwei,r1chardf1d0,Renyuan 
Zhang,RobinGrosman,shaojunsong,shenwei41,Soaringfish,tangdezhi_123,tanghuikang,tan-wei-cheng,TinaMengtingZhang,TronZhang,TuDouNi,VectorSL,wang_ziqi,wanghenchang,wangnan39,wangpingan,wangshaocong,wangshengnan123,wangtongyu6,weichaoran,wind-zyx,wqx,wtcheng,wujueying,wYann,XianglongZeng,xiaohanzhang,xiaotianci,xiaoyao,XinDu,xulei,xumengjuan1,xupan,xwkgch,yanghaoran,yangluhang,yangruoqi713,yangshuo,yangsijia,yangzhenzhang,yanzhenxiang2020,Yanzhi_YI,yao_yf,yefeng,yeyunpeng2020,Yi_zhang95,yide12,YijieChen,YingLai Lin,YingtongHu,youshu,yuchaojie,yuedongli,YuJianfeng,zangqx,ZengZitao,zhangbuxue,zhangdanyang,zhangdong,zhangfanghe,zhangqi,zhangqinghua,zhangyanhui,zhangyinxia,zhangyongxian,zhangzhaoju,zhanzhan,zhengzuohe,ZhidanLiu,zhixinaa,zhoufeng,zhouyaqiang0,zhuguodong,zhupuxu,zhuyuxiao,zichun_ye,zjun,zlq2020,zong_shuai,ZPaC,zuochuanyong,zyli2020,陈宇,范吉斌,冯一航,胡彬,宦晓玲,黄勇,雷元哲,李良灿,李林杰,刘崇鸣,刘力力,刘勇琪,吕浩宇,吕昱峰(Nate.River),没有窗户的小巷,沈竞兴,十六夜,王程浩,王禹程,王振邦,徐安越,徐永飞,杨旭华,于振华,俞涵,张清华,张澍坤,张栩浩,张学同,赵英灼,周超,周洪叶,朱家兴 - -欢迎以任何形式对项目提供贡献! 
- -## MindSpore Lite 2.1.0 Release Notes - -### 主要特性和增强 - -#### MindSpore Lite云侧推理 - -- [STABLE] 支持Ascend硬件后端单卡大模型以及单机多卡分布式大模型高性能推理。 -- [STABLE] Python API Ascend后端支持多模型共享工作空间(Workspace)内存。 -- [STABLE] [通过ModelGroup新增支持多模型共享权重](https://mindspore.cn/lite/docs/zh-CN/r2.1/use/cloud_infer/runtime_cpp.html#%E5%A4%9A%E6%A8%A1%E5%9E%8B%E5%85%B1%E4%BA%AB%E6%9D%83%E9%87%8D),比如大模型场景下全量模型和增量模型共享权重。 - -#### API - -新增ModelGroup [Python](https://www.mindspore.cn/lite/api/zh-CN/r2.1/mindspore_lite/mindspore_lite.ModelGroup.html#mindspore_lite.ModelGroup)和[C++](https://mindspore.cn/lite/api/zh-CN/r2.1/api_cpp/mindspore.html#modelgroup)接口,接口定义如下: - -```python -class ModelGroup - def __init__(self, flags=ModelGroupFlag.SHARE_WORKSPACE) - def add_model(self, models) - def cal_max_size_of_workspace(self, model_type, context) -``` - -```C++ -// class ModelGroup -ModelGroup(ModelGroupFlag flags = ModelGroupFlag::kShareWorkspace); -Status AddModel(const std::vector &model_path_list); -Status AddModel(const std::vector> &model_buff_list); -Status AddModel(const std::vector &model_list); -Status AddModel(const std::vector &model_list); -``` - -## MindSpore 2.0.0 Release Notes - -### 主要特性和增强 - -#### PyNative - -- [Stable] 全面支持动态shape,算子支持度详见[nn接口动态shape支持情况](https://www.mindspore.cn/docs/zh-CN/r2.0/note/dynamic_shape_nn.html)、[ops接口动态shape支持情况](https://www.mindspore.cn/docs/zh-CN/r2.0/note/dynamic_shape_func.html)和[算子动态shape支持情况](https://www.mindspore.cn/docs/zh-CN/r2.0/note/dynamic_shape_primitive.html)。 - -#### AutoParallel - -- [STABLE] 新建MindFormers独立仓,提供分布式并行套件功能,替代mindspore.nn.transformer模块。 -- [DEMO] 分布式Gather算子支持BatchDim属性。 -- [DEMO] 流水线并行支持指定输入数据任意维度作为Batch维。 - -### API变更 - -#### 算子 - -- `mindspore.ops.AdaptiveAvgPool2D` 新增算子原语。 -- `mindspore.ops.BatchToSpaceNDV2` 新增算子原语。 -- `mindspore.ops.CeLU` 新增算子原语。 -- `mindspore.ops.ExtractVolumePatches` 新增算子原语。 -- `mindspore.ops.FFTWithSize` 新增算子原语。 -- `mindspore.ops.FillDiagonal` 新增算子原语。 -- `mindspore.ops.FractionalMaxPool3DWithFixedKsize` 
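is a newly added operator primitive. -- `mindspore.ops.Im2Col` is a newly added operator primitive. -- `mindspore.ops.MaskedScatter` is a newly added operator primitive. -- `mindspore.ops.MatrixBandPart` is a newly added operator primitive. -- `mindspore.ops.MatrixInverse` is a newly added operator primitive. -- `mindspore.ops.MaxPoolWithArgmaxV2` is a newly added operator primitive. -- `mindspore.ops.Ormqr` is a newly added operator primitive. -- `mindspore.ops.RandpermV2` is a newly added operator primitive. -- `mindspore.ops.ResizeBicubic` is a newly added operator primitive. -- `mindspore.ops.Triu` is a newly added operator primitive. -- `mindspore.ops.Zeta` is a newly added operator primitive. - -#### Backwards Incompatible Changes - -- Interface name: mindspore.ops.MultitypeFuncGraph - - Changes: The doc_url parameter of this interface was a test feature in MindSpore 2.0.0.rc1. After optimization in MindSpore 2.0.0, users no longer need to configure it, so the parameter is removed in MindSpore 2.0.0. - - - - - - - - - -
Original interface v2.0.0 interface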
-  mindspore.ops.MultitypeFuncGraph(name, read_value=False, doc_url="")
-  
-
-  mindspore.ops.MultitypeFuncGraph(name, read_value=False)
-  
-
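- -- Interface name: mindspore.set_context(auto_tune_mode="GA,RL") - - Changes: The operator AutoTune tuning tool is taken offline and the auto_tune_mode option is removed; a new tuning tool will be planned in the future. - -- Interface name: mindspore.set_context(mode=PYNATIVE_MODE) - - Changes: The default mode changes from GRAPH_MODE to PYNATIVE_MODE. - - Description: If the running mode was not set explicitly in existing code, this change affects performance. To keep using graph mode, set it explicitly as follows: - mindspore.set_context(mode=GRAPH_MODE). - - - - - - - - - -
Original interface v2.0.0-rc1 interface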
-  mindspore.set_context(mode=GRAPH_MODE)
-  
-
-  mindspore.set_context(mode=PYNATIVE_MODE)
-  
-
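- -- Interface name: mindspore.train.Model.train - - Changes: The default value of dataset_sink_mode changes from True to False. - - Description: If dataset_sink_mode was not set explicitly in existing code, this change affects performance. To keep using data sink mode, set it explicitly as follows: - Model.train(dataset_sink_mode=True). - - - - - - - - - -
Original interface v2.0.0-rc1 interface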
-  Model.train(dataset_sink_mode=True)
-  
-
-  Model.train(dataset_sink_mode=False)
-  
-
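- -- Interface name: mindspore.export - - Changes: The file_format parameter no longer has a default value (it was previously "AIR"). - - Description: If file_format was not set in existing code, it must now be set explicitly, as follows: - mindspore.export(net, *inputs, file_name, file_format="AIR", **kwargs). - - - - - - - - - -
Original interface v2.0.0-rc1 interface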
-  mindspore.export(net, *inputs, file_name,
-                   file_format="AIR", **kwargs)
-  
-
-  mindspore.export(net, *inputs, file_name,
-                   file_format, **kwargs)
-  
-
- -- Interface name: mindspore.ops.norm - - Changes: The functionality of the ord parameter is extended to support multiple forms. - - - - - - - - - -
Original interface v2.0.0-rc1 interface
-  ops.norm(input_x, axis, p=2, keep_dims=False, epsilon=1e-12)
-  >>> # Example:
-  >>> input = Tensor(np.array([[[1.0, 2.0], [3.0, 4.0]],
-  ...                          [[5.0, 6.0], [7.0, 8.0]]]).astype(np.float32))
-  >>> output = ops.norm(input, [0, 1], p=2)
-  
-  ops.norm(A, ord=None, dim=None, keepdim=False, *, dtype=None)
-  >>> # Example:
-  >>> input = Tensor(np.array([[[1.0, 2.0], [3.0, 4.0]],
-  ...                          [[5.0, 6.0], [7.0, 8.0]]]).astype(np.float32))
-  >>> output = ops.norm(input, ord=2, dim=(0, 1))
-  
-
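- -- Interface name: mindspore.Tensor.norm - - Changes: The functionality of the ord parameter is extended to support multiple forms. - - Description: See the ops.norm example. - - - - - - - - - -
Original interface v2.0.0-rc1 interface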
-  Tensor.norm(axis, p=2, keep_dims=False, epsilon=1e-12)
-  
-
-  Tensor.norm(ord=None, dim=None, keepdim=False, *, dtype=None)
-  
-
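- -- Interface name: mindspore.ops.dropout - - Changes: The seed0 and seed1 parameters are removed and a new parameter seed=None is added. The return value changes from a Tensor and a mask to a Tensor only, and a new input parameter training=True is added. - - - - - - - - - -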
Original interface v2.0.0-rc1 interface
-  ops.dropout(x, p=0.5, seed0=0, seed1=0)
-  >>> # Example:
-  >>> input = Tensor(((20, 16), (50, 50)),
-  ...                mindspore.float32)
-  >>> output, mask = ops.dropout(input, p=0.5)
-  
-
-  ops.dropout(input, p=0.5, training=True, seed=None)
-  >>> # Example:
-  >>> input = Tensor(((20, 16), (50, 50)),
-  ...                mindspore.float32)
-  >>> output = ops.dropout(input, p=0.5, training=True)
-  
-
- -- Interface name: mindspore.ops.dropout2d - - Changes: The return value changes from a Tensor and a mask to a Tensor only, and a new input parameter training=True is added. - - - - - - - - - -
Original interface v2.0.0-rc1 interface
-
-  ops.dropout2d(x, p=0.5)
-  >>> # Example:
-  >>> input = Tensor(np.ones([2, 1, 2, 3]),
-  ...                mindspore.float32)
-  >>> output, mask = ops.dropout2d(input, 0.5)
-  
-
-
-  ops.dropout2d(input, p=0.5, training=True)
-  >>> # Example:
-  >>> input = Tensor(np.ones([2, 1, 2, 3]),
-  ...                mindspore.float32)
-  >>> output = ops.dropout2d(input, 0.5, training=True)
-  
-
- -- Interface name: mindspore.ops.dropout3d - - Changes: The return value changes from a Tensor and a mask to a Tensor only, and a new input parameter training=True is added. - - - - - - - - - -
Original interface v2.0.0-rc1 interface
-  ops.dropout3d(x, p=0.5)
-  >>> # Example:
-  >>> input = Tensor(np.ones([2, 1, 2, 3]),
-  ...                mindspore.float32)
-  >>> output, mask = ops.dropout3d(input, 0.5)
-  
-
-  ops.dropout3d(input, p=0.5, training=True)
-  >>> # 举例:
-  >>> input = Tensor(np.ones([2, 1, 2, 3]),
-  ...                mindspore.float32)
-  >>> output = ops.dropout3d(input, 0.5, training=True)
-  
-
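-
-- Interface name: mindspore.ops.std
-
-  Change: the interface has been refactored so that its usage better matches user habits.
-
-  Note: if `unbiased` was previously set explicitly, use `ddof=0` in place of `unbiased=False` and `ddof=1` in place of `unbiased=True`.
-
-  Original interface (first block below) and v2.0.0-rc1 interface (second block below):
-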
-  ops.std(input_x, axis=(), unbiased=True, keep_dims=False)
-  
-
-  ops.std(input, axis=None, ddof=0, keepdims=False)
-  
-
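The `unbiased` to `ddof` mapping can be checked without MindSpore. The sketch below is plain Python; `std_with_ddof` is a hypothetical helper written for illustration, not the MindSpore implementation. It only shows that `ddof=0` divides by N (the former `unbiased=False`) and `ddof=1` divides by N - 1 (the former `unbiased=True`).

```python
import math
import statistics


def std_with_ddof(values, ddof=0):
    """Standard deviation using the divisor N - ddof (hypothetical helper)."""
    n = len(values)
    if n - ddof <= 0:
        raise ValueError("need more than `ddof` values")
    mean = sum(values) / n
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_deviations / (n - ddof))


data = [1.0, 2.0, 3.0, 4.0]
biased = std_with_ddof(data, ddof=0)    # corresponds to the old unbiased=False
unbiased = std_with_ddof(data, ddof=1)  # corresponds to the old unbiased=True
print(biased, unbiased)
```

The signatures listed for this entry show `unbiased=True` as the old default and `ddof=0` as the new one, so calls that relied on the default may need an explicit `ddof=1` to keep their previous results.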
- -- 接口名称:mindspore.load_param_into_net - - 变更内容:新增ckpt中未加载的参数作为返回值。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  net_param = load_param_into_net()
-  
-
-  net_param, ckpt_param = load_param_into_net()
-  
-
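-
-- Interface name: mindspore.nn.BCELoss
-
-  Change: the default value of `reduction` changes from 'none' to 'mean'.
-
-  Original interface (first block below) and v2.0.0-rc1 interface (second block below):
-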
-  BCELoss(weight=None, reduction='none')
-  >>> # 举例:
-  >>> weight = Tensor(np.array([[1.0, 2.0, 3.0],
-  ...                           [4.0, 3.3, 2.2]]),
-  ...                 mindspore.float32)
-  >>> loss = nn.BCELoss(weight=weight, reduction='mean')
-  >>> logits = Tensor(np.array([[0.1, 0.2, 0.3],
-  ...                           [0.5, 0.7, 0.9]]),
-  ...                 mindspore.float32)
-  >>> labels = Tensor(np.array([[0, 1, 0], [0, 0, 1]]),
-  ...                 mindspore.float32)
-  >>> output = loss(logits, labels)
-  >>> print(output)
-  1.8952923
-  
-
-  BCELoss(weight=None, reduction='mean')
-  >>> # 举例:
-  >>> weight = Tensor(np.array([[1.0, 2.0, 3.0],
-  ...                           [4.0, 3.3, 2.2]]),
-  ...                 mindspore.float32)
-  >>> loss = nn.BCELoss(weight=weight)
-  >>> logits = Tensor(np.array([[0.1, 0.2, 0.3],
-  ...                           [0.5, 0.7, 0.9]]),
-  ...                 mindspore.float32)
-  >>> labels = Tensor(np.array([[0, 1, 0], [0, 0, 1]]),
-  ...                 mindspore.float32)
-  >>> output = loss(logits, labels)
-  >>> print(output)
-  1.8952923
-  
-
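To make the effect of the `reduction` default concrete, the sketch below recomputes the BCELoss example in plain Python. `weighted_bce` is a hypothetical helper that assumes the usual weighted binary cross-entropy formula, -w * (y * log(x) + (1 - y) * log(1 - x)); it is not MindSpore code.

```python
import math


def weighted_bce(logits, labels, weight, reduction="mean"):
    """Element-wise weighted binary cross-entropy with an optional reduction."""
    losses = [
        [-w * (y * math.log(x) + (1 - y) * math.log(1 - x))
         for x, y, w in zip(row_x, row_y, row_w)]
        for row_x, row_y, row_w in zip(logits, labels, weight)
    ]
    if reduction == "none":
        return losses                      # old default: one loss per element
    flat = [v for row in losses for v in row]
    if reduction == "sum":
        return sum(flat)
    if reduction == "mean":
        return sum(flat) / len(flat)       # new default: a single scalar
    raise ValueError(f"unknown reduction: {reduction}")


weight = [[1.0, 2.0, 3.0], [4.0, 3.3, 2.2]]
logits = [[0.1, 0.2, 0.3], [0.5, 0.7, 0.9]]
labels = [[0, 1, 0], [0, 0, 1]]

per_element = weighted_bce(logits, labels, weight, reduction="none")
mean_loss = weighted_bce(logits, labels, weight)  # reduction defaults to "mean"
print(mean_loss)
```

The mean agrees with the `1.8952923` printed in the BCELoss examples up to float32 rounding. Code that relied on the old default and expects a per-element loss would need to pass `reduction='none'` explicitly.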
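-
-- Interface name: mindspore.ops.split
-
-  Change: the interface has been refactored so that its usage better matches user habits; the order of the second and third parameters is swapped, and the `split_size_or_sections` functionality is modified and extended.
-
-  Original interface (first block below) and v2.0.0-rc1 interface (second block below):
-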
-  ops.split(input_x, axis=0, output_num=1)
-  >>> # 举例:
-  >>> input = Tensor(np.array([[1, 1, 1, 1], [2, 2, 2, 2]]),
-  ...                mindspore.int32)
-  >>> output = ops.split(input, axis=1, output_num=4)
-  
-
-  ops.split(tensor, split_size_or_sections, axis=0)
-  >>> # 举例:
-  >>> input = Tensor(np.array([[1, 1, 1, 1], [2, 2, 2, 2]]),
-  ...                mindspore.int32)
-  >>> output = ops.split(input, split_size_or_sections=1, axis=1)
-  
-
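In the split example, `output_num=4` on an axis of length 4 becomes `split_size_or_sections=1`. The sketch below shows that conversion for the even-split case in plain Python. Both helpers are hypothetical and operate on lists, not on MindSpore tensors.

```python
def split_size_from_output_num(axis_length, output_num):
    """Chunk size that yields `output_num` equal pieces along an axis."""
    if output_num <= 0 or axis_length % output_num != 0:
        raise ValueError("axis length must be evenly divisible by output_num")
    return axis_length // output_num


def split_row(row, split_size):
    """Cut a list into consecutive chunks of `split_size` elements."""
    return [row[i:i + split_size] for i in range(0, len(row), split_size)]


row = [1, 1, 1, 1]  # one row of the example tensor, split along axis=1
size = split_size_from_output_num(len(row), output_num=4)
chunks = split_row(row, size)
print(size, chunks)
```

The old argument counted the number of pieces, while the integer form of the new argument gives the size of each piece, so migrating a call means dividing the axis length by the old `output_num`.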
- -- 接口名称:mindspore.Tensor.split - - 变更内容:接口重构,接口使用方式更符合用户使用习惯,调整两个参数的位置,修改并扩展split_size_or_sections功能。 - - 说明:参考ops.split例子。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  Tensor.split(axis=0, output_num=1)
-  
-
-  Tensor.split(split_size_or_sections, axis=0)
-  
-
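-
-- Interface name: mindspore.ops.pad
-
-  Change: the parameter `paddings` is renamed to `padding`, and the `mode` and `value` options are added.
-
-  Original interface (first block below) and v2.0.0-rc1 interface (second block below):
-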
-  ops.pad(input_x, paddings)
-  >>> # 举例:
-  >>> input_x = Tensor(np.array([[-0.1, 0.3, 3.6],
-  ...                            [0.4, 0.5, -3.2]]),
-  ...                  mindspore.float32)
-  >>> paddings = ((1, 2), (2, 1))
-  >>> output = ops.pad(input_x, paddings)
-  
-
-  ops.pad(input_x, padding, mode='constant', value=None)
-  >>> # 举例:
-  >>> input_x = Tensor(np.array([[-0.1, 0.3, 3.6],
-  ...                            [0.4, 0.5, -3.2]]),
-  ...                  mindspore.float32)
-  >>> paddings = (2, 1, 1, 2)
-  >>> output = ops.pad(input_x, paddings)
-  
-
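In the pad example, the nested `((1, 2), (2, 1))` becomes the flat `(2, 1, 1, 2)`. That is consistent with reading the old format as one (before, after) pair per dimension starting from the first dimension, and the new format as a flat sequence starting from the last dimension. This reading is an assumption inferred from the example, and `to_flat_padding` is a hypothetical helper, so check the API documentation before relying on it.

```python
def to_flat_padding(paddings):
    """Flatten per-dimension (before, after) pairs, last dimension first."""
    flat = []
    for before, after in reversed(paddings):
        flat.extend((before, after))
    return tuple(flat)


old_paddings = ((1, 2), (2, 1))
new_padding = to_flat_padding(old_paddings)
print(new_padding)
```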
- -- 接口名称:mindspore.ops.meshgrid - - 变更内容:入参由inputs改为*input。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  ops.meshgrid(inputs, indexing='xy')
-  >>> # 举例:
-  >>> x = Tensor(np.array([1, 2, 3, 4]).astype(np.int32))
-  >>> y = Tensor(np.array([5, 6, 7]).astype(np.int32))
-  >>> z = Tensor(np.array([8, 9, 0, 1, 2]).astype(np.int32))
-  >>> output = ops.meshgrid((x, y, z), indexing='xy')
-  
-
-  ops.meshgrid(*inputs, indexing='xy')
-  >>> # 举例:
-  >>> x = Tensor(np.array([1, 2, 3, 4]).astype(np.int32))
-  >>> y = Tensor(np.array([5, 6, 7]).astype(np.int32))
-  >>> z = Tensor(np.array([8, 9, 0, 1, 2]).astype(np.int32))
-  >>> output = ops.meshgrid(x, y, z, indexing='xy')
-  
-
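-
-- Interface name: mindspore.ops.max
-
-  Change: the order of the return values is swapped, from (index, maximum) to (maximum, index).
-
-  Original interface (first block below) and v2.0.0-rc1 interface (second block below):
-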
-  ops.max(x, axis=0, keep_dims=False)
-  >>> # 举例:
-  >>> input = Tensor(np.array([0.0, 0.4, 0.6, 0.7, 0.1]),
-  ...                mindspore.float32)
-  >>> index, output = ops.max(input)
-  >>> print(index, output)
-  3 0.7
-  
-
-  ops.max(input, axis=None, keepdims=False, *, initial=None, where=True, return_indices=False)
-  >>> # 举例:
-  >>> input = Tensor(np.array([0.0, 0.4, 0.6, 0.7, 0.1]),
-  ...                mindspore.float32)
-  >>> output, index = ops.max(input, axis=0)
-  >>> print(output, index)
-  
-
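Call sites that unpack the result of `ops.max` positionally are the ones affected by the swapped return order. The sketch below mimics both conventions on a plain Python list; `max_index_first` and `max_value_first` are hypothetical names used only for illustration.

```python
def max_index_first(values):
    """Old convention: return (index, maximum)."""
    index = max(range(len(values)), key=values.__getitem__)
    return index, values[index]


def max_value_first(values):
    """New convention: return (maximum, index)."""
    index, maximum = max_index_first(values)
    return maximum, index


data = [0.0, 0.4, 0.6, 0.7, 0.1]
old_index, old_output = max_index_first(data)
new_output, new_index = max_value_first(data)
print(old_index, old_output)
print(new_output, new_index)
```

Leaving an old unpacking such as `index, output = ...` in place would still run, but the two names would silently hold each other's values.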
- -- 接口名称:mindspore.ops.min - - 变更内容:返回值调换顺序,由:“下标,最小值”改为“最小值,下标”。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  ops.min(x, axis=0, keep_dims=False)
-  >>> # 举例:
-  >>> input = Tensor(np.array([0.0, 0.4, 0.6, 0.7, 0.1]),
-  ...                mindspore.float32)
-  >>> index, output = ops.min(input)
-  >>> print(index, output)
-  0 0.0
-  
-
-  ops.min(input, axis=None, keepdims=False, *, initial=None, where=True, return_indices=False)
-  >>> # 举例:
-  >>> input = Tensor(np.array([0.0, 0.4, 0.6, 0.7, 0.1]),
-  ...                mindspore.float32)
-  >>> output, index = ops.min(input, keepdims=True)
-  >>> print(output, index)
-  0.0 0
-  
-
- -- 接口名称:mindspore.ops.random_gamma - - 变更内容:删除seed2参数,seed=0改为None。框架行为统一且符合用户实际使用场景及习惯。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  ops.random_gamma(shape, alpha, seed=0, seed2=0)
-  
-
-  ops.random_gamma(shape, alpha, seed=None)
-  
-
- -- 接口名称:mindspore.ops.standard_laplace - - 变更内容:删除seed2参数,seed=0改为None。框架行为统一且符合用户实际使用场景及习惯。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  ops.standard_laplace(shape, seed=0, seed2=0)
-  
-
-  ops.standard_laplace(shape, seed=None)
-  
-
- -- 接口名称:mindspore.ops.standard_normal - - 变更内容:删除seed2参数,seed=0改为None。框架行为统一且符合用户实际使用场景及习惯。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  ops.standard_normal(shape, seed=0, seed2=0)
-  
-
-  ops.standard_normal(shape, seed=None)
-  
-
- -- 接口名称:mindspore.ops.bernoulli - - 变更内容:seed的默认值由-1改为None。符合用户实际使用场景。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  ops.bernoulli(x, p=0.5, seed=-1)
-  
-
-  ops.bernoulli(input, p=0.5, seed=None)
-  
-
- -- 接口名称:mindspore.data_sink - - 变更内容:删除steps参数,jit参数名称修改为jit_config,新增input_signature参数。增加易用性,符合用户实际使用场景。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  mindspore.data_sink(fn, dataset, steps,
-                      sink_size=1, jit=False)
-  
-
-  mindspore.data_sink(fn, dataset, sink_size=1,
-                      jit_config=None, input_signature=None)
-  
-
- -- 接口名称:mindspore.ops.conv2d - - 变更内容:扩展接口功能,添加bias参数,修改参数名及参数顺序。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  conv2d(inputs, weight, pad_mode="valid",
-         padding=0, stride=1, dilation=1, group=1)
-  
-
-  conv2d(input, weight, bias=None, stride=1,
-         pad_mode="valid", padding=0, dilation=1, groups=1)
-  
-
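-
-- Interface name: mindspore.dataset.vision.Pad
-
-  Change: the `padding` argument of Pad, RandomCrop and RandomCropWithBbox is adjusted. When `padding` is a sequence of length 2, the behavior changes from "pad the left/top borders with the first value and the right/bottom borders with the second value" to "pad the left/right borders with the first value and the top/bottom borders with the second value".
-
-  Note: a `padding` of size 2 alone cannot reproduce the behavior of the old version; the values must be given explicitly (left, right, top, bottom).
-
-  Original interface (first block below) and v2.0.0-rc1 interface (second block below):
-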
-  mindspore.dataset.vision.Pad(padding=(1,2))
-  Pads the left/top of the image by 1 pixel and the right/bottom by 2 pixels.
-  
-
-  mindspore.dataset.vision.Pad(padding=(1,2,1,2))
-  Pads the left/top of the image by 1 pixel and the right/bottom by 2 pixels.
-  
-
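The two-value rule change for `Pad` can be written out as a pair of small functions. This is a plain-Python sketch of the rule as described in this entry; the helper names are hypothetical, and the result uses named edges to avoid depending on any particular tuple order.

```python
def expand_two_value_padding_old(padding):
    """Old rule: first value pads left/top, second pads right/bottom."""
    first, second = padding
    return {"left": first, "top": first, "right": second, "bottom": second}


def expand_two_value_padding_new(padding):
    """New rule: first value pads left/right, second pads top/bottom."""
    first, second = padding
    return {"left": first, "right": first, "top": second, "bottom": second}


old_edges = expand_two_value_padding_old((1, 2))
new_edges = expand_two_value_padding_new((1, 2))
print(old_edges)
print(new_edges)
```

The same `(1, 2)` gives different top and right padding under the two rules, which is why a four-value `padding` is needed to reproduce the old result.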
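-
-- Interface name: mindspore.dataset.Dataset.map
-
-  Change: the `column_order` parameter is removed. In the vast majority of cases `output_columns` and `column_order` carry the same value, so `column_order` no longer needs to be passed. To change the order of the data columns, use mindspore.dataset.Dataset.project.
-
-  Note:
-
-  1) If the column order does not need to change, simply drop the `column_order` parameter.
-  2) If a specific column order is required, drop the `column_order` parameter and append a `project` call to reorder the columns (as in the example below).
-
-  Original interface (first block below) and v2.0.0-rc1 interface (second block below):
-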
-  >>> dataset = dataset.map(operations=[transforms],
-  ...                       input_columns=["column_a"],
-  ...                       output_columns=["column_b", "column_c"],
-  ...                       column_order=["column_c", "column_b"])
-  
-
-  >>> dataset = dataset.map(operations=[transforms],
-  ...                       input_columns=["column_a"],
-  ...                       output_columns=["column_b", "column_c"])
-  >>> dataset = dataset.project(["column_c", "column_b"])
-  
-
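What the trailing `project` call does to the column order can be sketched on plain dictionaries. `project_columns` is a hypothetical stand-in for `Dataset.project` that works on a list of row dicts, not on a MindSpore dataset.

```python
def project_columns(rows, columns):
    """Keep only `columns`, in the given order, for every row."""
    return [{name: row[name] for name in columns} for row in rows]


# Rows as they might look after map() produced column_b and column_c.
rows = [{"column_b": 1, "column_c": 2}, {"column_b": 3, "column_c": 4}]
reordered = project_columns(rows, ["column_c", "column_b"])
print([list(row) for row in reordered])
```

Separating the reordering from `map` keeps `output_columns` as the single place where the produced columns are named, with the ordering expressed as its own step.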
- -- 接口名称:mindspore.dataset.Dataset.batch - - 变更内容:删除column_order参数。因为在绝大部分的情况下,output_columns参数与column_order参数都是同一个值,不需要再传入column_order。若需要调整数据列顺序,使用mindspore.dataset.Dataset.project实现。 - - 说明: - - 1) 在不需要改变列顺序时,直接去掉column_order参数即可。 - 2) 需要指定数据列顺序时,删除column_order参数,并在后面加上一个project方法进行列变换(如下面的例子)。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  >>> dataset = dataset.batch(batch_size=4,
-  ...                         input_columns=["column_a"],
-  ...                         output_columns=["column_b", "column_c"],
-  ...                         column_order=["column_c", "column_b"])
-  
-
-  >>> dataset = dataset.batch(batch_size=4, input_columns=["column_a"],
-  ...                         output_columns=["column_b", "column_c"])
-  >>> dataset = dataset.project(["column_c", "column_b"])
-  
-
- -- 接口名称:mindspore.dataset.Dataset.batch - - 变更内容:将batch方法拆分为:batch和padded_batch两个方法。pad_info参数从batch方法移动到padded_batch方法。 - - 说明:如需使用pad_info参数,改用padded_batch方法。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  >>> dataset = dataset.batch(batch_size=4,
-  ...                         drop_remainder=True, pad_info=...)
-  
-
-  >>> dataset = dataset.padded_batch(batch_size=4,
-  ...                                drop_remainder=True, pad_info=...)
-  
-
- -### Bug fixes - -- [I62I3J] 修复bgcf网络在昇腾310上推理失败的问题 -- [I7C2W3] 修复Pipeline并行场景下多loss打印编译失败问题 - -### 贡献者 - -感谢以下人员做出的贡献: - -alashkari,anzhengqi,archer2049,B.L.LAN,baihuawei,bichaoyang,BJ-WANG,Bokai Li,Brian-K,caifubi,caiyimeng,cathwong,changzherui,ChenDonYY,chenfei_mindspore,chengang,chengbin,chenhaozhe,chenjianping,chenkang,chenweifeng,chuht,chujinjin,davidanugraha,DavidFFFan,DeshiChen,douzhixing,emmmmtang,Erpim,Ethan,fangwenyi,fangzehua,fangzhou0329,fary86,fengyixing,gaoshuanglong,Gaoxiong,gaoyong10,gengdongjie,gongdaguo1,Greatpan,GuoZhibin,guozhijian,hangq,hanhuifeng,haozhang,hedongdong,Henry Shi,heterogeneous_to_backoff_2_0,huangbingjian,huanghui,huangxinjing,hujiahui8,hujingsong,huoxinyou,jachua,jiahongQian,jianghui58,jiangzhenguang,jiaorui,jiaoy1224,jijiarong,jjfeing,JoeyLin,json,JuiceZ,jxl,kairui_kou,KevinYi,kisnwang,KXiong,laiyongqiang,lanzhineng,liangchenghui,liangzelang,LiangZhibo,lianliguang,lichen,ligan,lijunbin,limingqi107,ling,linqingke,liubuyu,liuchao,liuchuting,liujunzhu,liuluobin,liutongtong9,liuyang811,lixiao,liyan2022,liyejun,liyuxia,looop5,luochao60,luojianing,luoyang,luoyuan,lyqlola,maning202007,maoyaomin,Margaret_wangrui,mayadong,MaZhiming,melody,mengyuanli,michaelzhu_70ab,Mohammad Motallebi,moran,NaCN,nomindcarry,OwenSec,panfengfeng,panshaowu,panzhihui,pkuliuliu,qinzheng,qiuzhongya,qujianwei,r1chardf1d0,Renyuan Zhang,RobinGrosman,shaojunsong,shenwei41,Soaringfish,tangdezhi_123,tanghuikang,tan-wei-cheng,TinaMengtingZhang,TronZhang,TuDouNi,VectorSL,wang_ziqi,wanghenchang,wangnan39,wangpingan,wangshaocong,wangshengnan123,wangtongyu6,weichaoran,wind-zyx,wqx,wtcheng,wujueying,wYann,XianglongZeng,xiaohanzhang,xiaotianci,xiaoyao,XinDu,xulei,xumengjuan1,xupan,xwkgch,yanghaoran,yangluhang,yangruoqi713,yangshuo,yangsijia,yangzhenzhang,yanzhenxiang2020,Yanzhi_YI,yao_yf,yefeng,yeyunpeng2020,Yi_zhang95,yide12,YijieChen,YingLai 
Lin,YingtongHu,youshu,yuchaojie,yuedongli,YuJianfeng,zangqx,ZengZitao,zhangbuxue,zhangdanyang,zhangdong,zhangfanghe,zhangqi,zhangqinghua,zhangyanhui,zhangyinxia,zhangyongxian,zhangzhaoju,zhanzhan,zhengzuohe,ZhidanLiu,zhixinaa,zhoufeng,zhouyaqiang0,zhuguodong,zhupuxu,zhuyuxiao,zichun_ye,zjun,zlq2020,zong_shuai,ZPaC,zuochuanyong,zyli2020,陈宇,范吉斌,冯一航,胡彬,宦晓玲,黄勇,雷元哲,李良灿,李林杰,刘崇鸣,刘力力,刘勇琪,吕浩宇,吕昱峰(Nate.River),没有窗户的小巷,沈竞兴,十六夜,王程浩,王禹程,王振邦,徐安越,徐永飞,杨旭华,于振华,俞涵,张清华,张澍坤,张栩浩,张学同,赵英灼,周超,周洪叶,朱家兴 - -欢迎以任何形式对项目提供贡献! - -## MindSpore 2.0.0-rc1 Release Notes - -### 主要特性和增强 - -#### FrontEnd - -- [BETA] 静态图模式下,函数及类方法支持"return None"、"return"、无"return"语法。 -- [BETA] 静态图模式下,支持返回list类型对象。 -- [BETA] 静态图模式下,变量条件时,支持"raise"语法。 -- [STABLE] 函数式调用支持数据下沉模式。 -- [BETA] nn下新增Transformer层,提供更加易用的Transformer API,无需定义batch_size,支持动态seq_length。 - -#### DataSet - -- [STABLE] Ascend环境下,数据下沉模式超时等待时间调整,默认调整到1900s,以解决数据下沉模式时因环境资源竞争、计算量大等因素容易导致GetNext算子等待超时的问题。 -- [STABLE] MindRecord提供Schema、样本数查询接口,并提供多进程并行写入功能,允许用户更快生成MindRecord数据文件。 -- [STABLE] Dataset流水线支持处理任意Python对象,用法参考[数据pipeline支持Python对象](https://www.mindspore.cn/tutorials/zh-CN/r2.0/advanced/dataset/python_objects.html)。 - -#### AutoParallel - -- [STABLE] 策略保存时支持保存完整策略。 -- [STABLE] 支持Conv3D/MaxPool3D/AvgPool3D分布式算子。 -- [STABLE] 支持PyNative+shard算子级并行+优化器并行:并行表达和Model进行解耦,提供基础的并行表达能力。 -- [STABLE] 支持图模式算子级并行+优化器并行:并行表达和Model进行解耦,提供基础的并行表达能力。 -- [BETA] 支持自定义分布式图切分,提升分布式训练的灵活性。 - -#### Runtime - -- [STABLE] 控制流支持子图下沉。 -- [STABLE] 支持CUDA 11.6。 -- [STABLE] 支持List/Tuple/Scalar类型算子的算子选择和执行,配套Python原生表达。 -- [STABLE] 硬件不支持的算子自动选择CPU算子。 -- [STABLE] 支持子图内部异构执行。 - -#### Ascend - -- [STABLE] 支持CANN溢出检测新方案和HCCL运行态溢出检测。 -- [STABLE] 支持集合通信算子dump功能。 - -#### Profiler - -- [STABLE] 丰富Profiler采集项配置,用户可以更细度地采集性能数据。 - -#### Dump - -- [BETA] 单卡PyNatvie模式支持算子溢出检测。 -- [BETA] Graph模式支持hccl算子dump。 - -### API变更 - -- [STABLE] 新增计算类API,如:MaxUnpool、ReplicationPad、GaussianNLLLoss等。 - 详情请参考:。 -- [STABLE] 扩展存量API功能,如:AvgPool、pad、norm、interplate等。 - -#### 算子 - -- [BETA] 
`mindspore.ops.AdaptiveAvgPool3D` 新增算子原语。 -- [BETA] `mindspore.ops.AffineGrid` 新增算子原语。 -- [BETA] `mindspore.ops.Angle` 新增算子原语。 -- [BETA] `mindspore.ops.BartlettWindow` 新增算子原语。 -- [BETA] `mindspore.ops.Bernoulli` 新增算子原语。 -- [BETA] `mindspore.ops.BesselI0` 新增算子原语。 -- [BETA] `mindspore.ops.BesselI1` 新增算子原语。 -- [BETA] `mindspore.ops.BesselJ0` 新增算子原语。 -- [BETA] `mindspore.ops.BesselJ1` 新增算子原语。 -- [BETA] `mindspore.ops.BesselK0` 新增算子原语。 -- [BETA] `mindspore.ops.BesselK0e` 新增算子原语。 -- [BETA] `mindspore.ops.BesselK1` 新增算子原语。 -- [BETA] `mindspore.ops.BesselK1e` 新增算子原语。 -- [BETA] `mindspore.ops.BesselY0` 新增算子原语。 -- [BETA] `mindspore.ops.BesselY1` 新增算子原语。 -- [BETA] `mindspore.ops.Bincount` 新增算子原语。 -- [BETA] `mindspore.ops.BlackmanWindow` 新增算子原语。 -- [BETA] `mindspore.ops.ChannelShuffle` 新增算子原语。 -- [BETA] `mindspore.ops.Cholesky` 新增算子原语。 -- [BETA] `mindspore.ops.Col2Im` 新增算子原语。 -- [BETA] `mindspore.ops.Complex` 新增算子原语。 -- [BETA] `mindspore.ops.ComplexAbs` 新增算子原语。 -- [BETA] `mindspore.ops.Cross` 新增算子原语。 -- [BETA] `mindspore.ops.CTCLossV2` 新增算子原语。 -- [BETA] `mindspore.ops.Cummin` 新增算子原语。 -- [BETA] `mindspore.ops.Diag` 新增算子原语。 -- [BETA] `mindspore.ops.Digamma` 新增算子原语。 -- [BETA] `mindspore.ops.Expand` 新增算子原语。 -- [BETA] `mindspore.ops.Fmax` 新增算子原语。 -- [BETA] `mindspore.ops.Gcd` 新增算子原语。 -- [BETA] `mindspore.ops.Geqrf` 新增算子原语。 -- [BETA] `mindspore.ops.GLU` 新增算子原语。 -- [BETA] `mindspore.ops.GridSampler2D` 新增算子原语。 -- [BETA] `mindspore.ops.GridSampler3D` 新增算子原语。 -- [BETA] `mindspore.ops.HammingWindow` 新增算子原语。 -- [BETA] `mindspore.ops.Heaviside` 新增算子原语。 -- [BETA] `mindspore.ops.Hypot` 新增算子原语。 -- [BETA] `mindspore.ops.Igamma` 新增算子原语。 -- [BETA] `mindspore.ops.IndexFill` 新增算子原语。 -- [BETA] `mindspore.ops.InplaceIndexAdd` 新增算子原语。 -- [BETA] `mindspore.ops.InplaceUpdateV2` 新增算子原语。 -- [BETA] `mindspore.ops.Lcm` 新增算子原语。 -- [BETA] `mindspore.ops.LeftShift` 新增算子原语。 -- [BETA] `mindspore.ops.LogicalXor` 新增算子原语。 -- [BETA] `mindspore.ops.Logit` 新增算子原语。 -- [BETA] `mindspore.ops.LogSpace` 新增算子原语。 -- [BETA] 
`mindspore.ops.LuUnpack` 新增算子原语。 -- [BETA] `mindspore.ops.MatrixDiagPartV3` 新增算子原语。 -- [BETA] `mindspore.ops.MatrixDiagV3` 新增算子原语。 -- [BETA] `mindspore.ops.MatrixSetDiagV3` 新增算子原语。 -- [BETA] `mindspore.ops.MaxPool3DWithArgmax` 新增算子原语。 -- [BETA] `mindspore.ops.MaxUnpool2D` 新增算子原语。 -- [BETA] `mindspore.ops.MaxUnpool3D` 新增算子原语。 -- [BETA] `mindspore.ops.MultiMarginLoss` 新增算子原语。 -- [BETA] `mindspore.ops.MultinomialWithReplacement` 新增算子原语。 -- [BETA] `mindspore.ops.Mvlgamma` 新增算子原语。 -- [BETA] `mindspore.ops.NanToNum` 新增算子原语。 -- [BETA] `mindspore.ops.NextAfter` 新增算子原语。 -- [BETA] `mindspore.ops.Orgqr` 新增算子原语。 -- [BETA] `mindspore.ops.Polygamma` 新增算子原语。 -- [BETA] `mindspore.ops.ResizeBilinearV2` 新增算子原语。 -- [BETA] `mindspore.ops.RightShift` 新增算子原语。 -- [BETA] `mindspore.ops.ScatterNdDiv` 新增算子原语。 -- [BETA] `mindspore.ops.ScatterNdMul` 新增算子原语。 -- [BETA] `mindspore.ops.SearchSorted` 新增算子原语。 -- [BETA] `mindspore.ops.Sinc` 新增算子原语。 -- [BETA] `mindspore.ops.Trace` 新增算子原语。 -- [BETA] `mindspore.ops.Tril` 新增算子原语。 -- [BETA] `mindspore.ops.TrilIndices` 新增算子原语。 -- [BETA] `mindspore.ops.TriuIndices` 新增算子原语。 -- [BETA] `mindspore.ops.UniqueConsecutive` 新增算子原语。 -- [STABLE] `mindspore.ops.Cummax` 新增算子原语。 -- [STABLE] `mindspore.ops.FillV2` 新增算子原语。 -- [STABLE] `mindspore.ops.IsClose` 新增算子原语。 -- [STABLE] `mindspore.ops.MatrixSolve` 新增算子原语。 -- [STABLE] `mindspore.ops.Median` 新增算子原语。 -- [STABLE] `mindspore.ops.MultilabelMarginLoss` 新增算子原语。 -- [STABLE] `mindspore.ops.NonZero` 新增算子原语。 -- [STABLE] `mindspore.ops.Pdist` 新增算子原语。 -- [STABLE] `mindspore.ops.Polar` 新增算子原语。 -- [STABLE] `mindspore.ops.RandomGamma` 新增算子原语。 -- [STABLE] `mindspore.ops.RandomPoisson` 新增算子原语。 -- [STABLE] `mindspore.ops.RandomShuffle` 新增算子原语。 -- [STABLE] `mindspore.ops.Renorm` 新增算子原语。 -- [STABLE] `mindspore.ops.ScatterNdMax` 新增算子原语。 -- [STABLE] `mindspore.ops.ScatterNdMin` 新增算子原语。 -- [STABLE] `mindspore.ops.Svd` 新增算子原语。 -- [STABLE] `mindspore.ops.TripletMarginLoss` 新增算子原语。 - -#### 删除接口 - -- `mindspore.compression`特性在MindSpore 
1.8版本已经废弃,在当前版本被删除。用户可以使用[昇思金箍棒](https://gitee.com/mindspore/golden-stick)作为`mindspore.compression`的替代品来实现MindSpore中的量化感知训练算法。 -- `mindspore.dataset.close_pool`、`mindspore.dataset.to_device`、`mindspore.dataset.set_dynamic_columns` 接口在之前版本已废弃,当前版本正式删除。 - -#### 非兼容性接口变更 - -- 接口名称:mindspore.set_context(mode=PYNATIVE_MODE) - - 变更内容:默认由GRAPH_MODE改为PYNATIVE_MODE。 - - 说明:原有使用方式若未设置运行模式,该变更会影响性能,需要额外设置图模式,则使用以下方式: - mindspore.set_context(mode=GRAPH_MODE)。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  mindspore.set_context(mode=GRAPH_MODE)
-  
-
-  mindspore.set_context(mode=PYNATIVE_MODE)
-  
-
- -- 接口名称:mindspore.train.Model.train - - 变更内容:dataset_sink_mode 默认值由True改为False。 - - 说明:原有使用方式若未设置dataset_sink_mode,该变更会影响性能,需要额外设置数据下沉运行模式,则使用以下方式: - Model.train(dataset_sink_mode=True)。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  Model.train(dataset_sink_mode=True)
-  
-
-  Model.train(dataset_sink_mode=False)
-  
-
- -- 接口名称:mindspore.export - - 变更内容:参数file_format由"AIR"改为不指定默认值。 - - 说明:原有使用方式若未设置file_format,需要额外设置file_format,则使用以下方式: - mindspore.export(net, *inputs, file_name, file_format="AIR", **kwargs)。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  mindspore.export(net, *inputs, file_name,
-                   file_format="AIR", **kwargs)
-  
-
-  mindspore.export(net, *inputs, file_name,
-                   file_format, **kwargs)
-  
-
- -- 接口名称:mindspore.ops.norm - - 变更内容:扩展ord参数功能,支持多种形式。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  ops.norm(input_x, axis, p=2, keep_dims=False, epsilon=1e-12)
-  >>> # 举例:
-  >>> input = Tensor(np.array([[[1.0, 2.0], [3.0, 4.0]],
-  ...                          [[5.0, 6.0], [7.0, 8.0]]]).astype(np.float32))
-  >>> output = ops.norm(input, [0, 1], p=2)
-  
-
-  ops.norm(A, ord=None, dim=None, keepdim=False, *, dtype=None)
-  >>> # 举例:
-  >>> input = Tensor(np.array([[[1.0, 2.0], [3.0, 4.0]],
-  ...                          [[5.0, 6.0], [7.0, 8.0]]]).astype(np.float32))
-  >>> output = ops.norm(input, ord=2, dim=(0, 1))
-  
-
- -- 接口名称:mindspore.Tensor.norm - - 变更内容:扩展ord参数功能,支持多种形式。 - - 说明:参考ops.norm例子。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  Tensor.norm(axis, p=2, keep_dims=False, epsilon=1e-12)
-  
-
-  Tensor.norm(ord=None, dim=None, keepdim=False, *, dtype=None)
-  
-
- -- 接口名称:mindspore.ops.dropout - - 变更内容:删除seed0、seed1参数,新增参数seed=None。由返回Tensor和掩码改为只返回Tensor,新增入参training=True。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-
-  ops.dropout(x, p=0.5, seed0=0, seed1=0)
-  >>> # 举例:
-  >>> input = Tensor(((20, 16), (50, 50)),
-  ...                mindspore.float32)
-  >>> output, mask = ops.dropout(input, p=0.5)
-  
-
-
-  ops.dropout(input, p=0.5, training=True, seed=None)
-  >>> # 举例:
-  >>> input = Tensor(((20, 16), (50, 50)),
-  ...                mindspore.float32)
-  >>> output = ops.dropout(input, p=0.5,training=True)
-  
-
- -- 接口名称:mindspore.ops.dropout2d - - 变更内容:返回值从Tensor和掩码改为只返回Tensor,新增入参training=True。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-
-  ops.dropout2d(x, p=0.5)
-  >>> # 举例:
-  >>> input = Tensor(np.ones([2, 1, 2, 3]),
-  ...                mindspore.float32)
-  >>> output, mask = dropout2d(input, 0.5)
-  
-
-
-  ops.dropout2d(input, p=0.5, training=True)
-  >>> # 举例:
-  >>> input = Tensor(np.ones([2, 1, 2, 3]),
-  ...                mindspore.float32)
-  >>> output = ops.dropout2d(input, 0.5, training=True)
-  
-
- -- 接口名称:mindspore.ops.dropout3d - - 变更内容:返回值从Tensor和掩码改为只返回Tensor,新增入参training=True。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  ops.dropout3d(x, p=0.5)
-  >>> # 举例:
-  >>> input = Tensor(np.ones([2, 1, 2, 3]),
-  ...                mindspore.float32)
-  >>> output, mask = dropout3d(input, 0.5)
-  
-
-  ops.dropout3d(input, p=0.5, training=True)
-  >>> # 举例:
-  >>> input = Tensor(np.ones([2, 1, 2, 3]),
-  ...                mindspore.float32)
-  >>> output = ops.dropout3d(input, 0.5, training=True)
-  
-
- -- 接口名称:mindspore.ops.std - - 变更内容:接口重构,接口使用方式更符合用户使用习惯。 - - 说明:原有unbiased如果已显示设置,采用以下替代方案: - ddof=0替代unbiased=False,ddof=1替代unbiased=True。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  ops.std(input_x, axis=(), unbiased=True, keep_dims=False)
-  
-
-  ops.std(input, axis=None, ddof=0, keepdims=False)
-  
-
- -- 接口名称:mindspore.load_param_into_net - - 变更内容:新增ckpt中未加载的参数作为返回值。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  net_param = load_param_into_net()
-  
-
-  net_param, ckpt_param = load_param_into_net()
-  
-
- -- 接口名称:mindspore.nn.BCELoss - - 变更内容:`reduction` 默认值由'none'变为'mean'。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  BCELoss(weight=None, reduction='none')
-  >>> # 举例:
-  >>> weight = Tensor(np.array([[1.0, 2.0, 3.0],
-  ...                           [4.0, 3.3, 2.2]]),
-  ...                 mindspore.float32)
-  >>> loss = nn.BCELoss(weight=weight, reduction='mean')
-  >>> logits = Tensor(np.array([[0.1, 0.2, 0.3],
-  ...                           [0.5, 0.7, 0.9]]),
-  ...                 mindspore.float32)
-  >>> labels = Tensor(np.array([[0, 1, 0], [0, 0, 1]]),
-  ...                 mindspore.float32)
-  >>> output = loss(logits, labels)
-  >>> print(output)
-  1.8952923
-  
-
-  BCELoss(weight=None, reduction='mean')
-  >>> # 举例:
-  >>> weight = Tensor(np.array([[1.0, 2.0, 3.0],
-  ...                           [4.0, 3.3, 2.2]]),
-  ...                 mindspore.float32)
-  >>> loss = nn.BCELoss(weight=weight)
-  >>> logits = Tensor(np.array([[0.1, 0.2, 0.3],
-  ...                           [0.5, 0.7, 0.9]]),
-  ...                 mindspore.float32)
-  >>> labels = Tensor(np.array([[0, 1, 0], [0, 0, 1]]),
-  ...                 mindspore.float32)
-  >>> output = loss(logits, labels)
-  >>> print(output)
-  1.8952923
-  
-
- -- 接口名称:mindspore.ops.split - - 变更内容:接口重构,接口使用方式更符合用户使用习惯,调整第2个和第3个参数的顺序,修改并扩展split_size_or_sections功能。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  ops.split(input_x, axis=0, output_num=1)
-  >>> # 举例:
-  >>> input = Tensor(np.array([[1, 1, 1, 1], [2, 2, 2, 2]]),
-  ...                mindspore.int32)
-  >>> output = ops.split(input, axis=1, output_num=4)
-  
-
-  ops.split(tensor, split_size_or_sections, axis=0)
-  >>> # 举例:
-  >>> input = Tensor(np.array([[1, 1, 1, 1], [2, 2, 2, 2]]),
-  ...                mindspore.int32)
-  >>> output = ops.split(input, split_size_or_sections=1, axis=1)
-  
-
- -- 接口名称:mindspore.Tensor.split - - 变更内容:接口重构,接口使用方式更符合用户使用习惯,调整两个参数的位置,修改并扩展split_size_or_sections功能。 - - 说明:参考ops.split例子。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  Tensor.split(axis=0, output_num=1)
-  
-
-  Tensor.split(split_size_or_sections, axis=0)
-  
-
- -- 接口名称:mindspore.ops.pad - - 变更内容:修改参数名paddings为padding,添加mode和value功能。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  ops.pad(input_x, paddings)
-  >>> # 举例:
-  >>> input_x = Tensor(np.array([[-0.1, 0.3, 3.6],
-  ...                            [0.4, 0.5, -3.2]]),
-  ...                  mindspore.float32)
-  >>> paddings = ((1, 2), (2, 1))
-  >>> output = ops.pad(input_x, paddings)
-  
-
-  ops.pad(input_x, padding, mode='constant', value=None)
-  >>> # 举例:
-  >>> input_x = Tensor(np.array([[-0.1, 0.3, 3.6],
-  ...                            [0.4, 0.5, -3.2]]),
-  ...                  mindspore.float32)
-  >>> paddings = (2, 1, 1, 2)
-  >>> output = ops.pad(input_x, paddings)
-  
-
- -- 接口名称:mindspore.ops.meshgrid - - 变更内容:入参由inputs改为*input。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  ops.meshgrid(inputs, indexing='xy')
-  >>> # 举例:
-  >>> x = Tensor(np.array([1, 2, 3, 4]).astype(np.int32))
-  >>> y = Tensor(np.array([5, 6, 7]).astype(np.int32))
-  >>> z = Tensor(np.array([8, 9, 0, 1, 2]).astype(np.int32))
-  >>> output = ops.meshgrid((x, y, z), indexing='xy')
-  
-
-  ops.meshgrid(*inputs, indexing='xy')
-  >>> # 举例:
-  >>> x = Tensor(np.array([1, 2, 3, 4]).astype(np.int32))
-  >>> y = Tensor(np.array([5, 6, 7]).astype(np.int32))
-  >>> z = Tensor(np.array([8, 9, 0, 1, 2]).astype(np.int32))
-  >>> output = ops.meshgrid(x, y, z, indexing='xy')
-  
-
- -- 接口名称:mindspore.ops.max - - 变更内容:返回值调换顺序,由:“下标,最大值”改为“最大值,下标”。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  ops.max(x, axis=0, keep_dims=False)
-  >>> # 举例:
-  >>> input = Tensor(np.array([0.0, 0.4, 0.6, 0.7, 0.1]),
-  ...                mindspore.float32)
-  >>> index, output = ops.max(input)
-  >>> print(index, output)
-  3 0.7
-  
-
-  ops.max(input, axis=None, keepdims=False, *, initial=None, where=True, return_indices=False)
-  >>> # 举例:
-  >>> input = Tensor(np.array([0.0, 0.4, 0.6, 0.7, 0.1]),
-  ...                mindspore.float32)
-  >>> output, index = ops.max(input, axis=0)
-  >>> print(output, index)
-  
-
- -- 接口名称:mindspore.ops.min - - 变更内容:返回值调换顺序,由:“下标,最小值”改为“最小值,下标”。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  ops.min(x, axis=0, keep_dims=False)
-  >>> # 举例:
-  >>> input = Tensor(np.array([0.0, 0.4, 0.6, 0.7, 0.1]),
-  ...                mindspore.float32)
-  >>> index, output = ops.min(input)
-  >>> print(index, output)
-  0 0.0
-  
-
-  ops.min(input, axis=None, keepdims=False, *, initial=None, where=True, return_indices=False)
-  >>> # 举例:
-  >>> input = Tensor(np.array([0.0, 0.4, 0.6, 0.7, 0.1]),
-  ...                mindspore.float32)
-  >>> output, index = ops.min(input, keepdims=True)
-  >>> print(output, index)
-  0.0 0
-  
-
- -- 接口名称:mindspore.ops.random_gamma - - 变更内容:删除seed2参数,seed=0改为None。框架行为统一且符合用户实际使用场景及习惯。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  ops.random_gamma(shape, alpha, seed=0, seed2=0)
-  
-
-  ops.random_gamma(shape, alpha, seed=None)
-  
-
- -- 接口名称:mindspore.ops.standard_laplace - - 变更内容:删除seed2参数,seed=0改为None。框架行为统一且符合用户实际使用场景及习惯。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  ops.standard_laplace(shape, seed=0, seed2=0)
-  
-
-  ops.standard_laplace(shape, seed=None)
-  
-
- -- 接口名称:mindspore.ops.standard_normal - - 变更内容:删除seed2参数,seed=0改为None。框架行为统一且符合用户实际使用场景及习惯。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  ops.standard_normal(shape, seed=0, seed2=0)
-  
-
-  ops.standard_normal(shape, seed=None)
-  
-
- -- 接口名称:mindspore.ops.bernoulli - - 变更内容:seed的默认值由-1改为None。符合用户实际使用场景。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  ops.bernoulli(x, p=0.5, seed=-1)
-  
-
-  ops.bernoulli(input, p=0.5, seed=None)
-  
-
- -- 接口名称:mindspore.data_sink - - 变更内容:删除steps参数,jit参数名称修改为jit_config,新增input_signature参数。增加易用性,符合用户实际使用场景。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  mindspore.data_sink(fn, dataset, steps,
-                      sink_size=1, jit=False)
-  
-
-  mindspore.data_sink(fn, dataset, sink_size=1,
-                      jit_config=None, input_signature=None)
-  
-
- -- 接口名称:mindspore.ops.conv2d - - 变更内容:扩展接口功能,添加bias参数,修改参数名及参数顺序。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  conv2d(inputs, weight, pad_mode="valid",
-         padding=0, stride=1, dilation=1, group=1)
-  
-
-  conv2d(input, weight, bias=None, stride=1,
-         pad_mode="valid", padding=0, dilation=1, groups=1)
-  
-
- -- 接口名称:mindspore.dataset.vision.Pad - - 变更内容:调整Pad、RandomCrop、RandomCropWithBbox入参padding,当Padding输入长度为2的序列时,行为将从使用第一个值填充左/上边界,使用第二个值填充右/下边界,变为使用第一个值填充左/右边界,使用第二个值填充上/下边界。 - - 说明:仅使用size为2的padding参数无法兼容旧版本的效果,需显式表示(左、右、上、下)。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  mindspore.dataset.vision.Pad(padding=(1,2))
-  代表图片的左/上填充 1像素,右/下填充 2像素
-  
-
-  mindspore.dataset.vision.Pad(padding=(1,2,1,2))
-  代表图片的左/上填充 1像素,右/下填充 2像素
-  
-
- -- 接口名称:mindspore.dataset.Dataset.map - - 变更内容:删除column_order参数。因为在绝大部分的情况下,output_columns参数与column_order参数都是同一个值,不需要再传入column_order。若需要调整数据列顺序,使用mindspore.dataset.Dataset.project实现。 - - 说明: - - 1) 在不需要改变列顺序时,直接去掉column_order参数即可。 - 2) 需要指定数据列顺序时,删除column_order参数,并在后面加上一个project方法进行列变换(如下面的例子)。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  >>> dataset = dataset.map(operations=[transforms],
-  ...                       input_columns=["column_a"],
-  ...                       output_columns=["column_b", "column_c"],
-  ...                       column_order=["column_c", "column_b"])
-  
-
-  >>> dataset = dataset.map(operations=[transforms],
-  ...                       input_columns=["column_a"],
-  ...                       output_columns=["column_b", "column_c"])
-  >>> dataset = dataset.project(["column_c", "column_b"])
-  
-
- -- 接口名称:mindspore.dataset.Dataset.batch - - 变更内容:删除column_order参数。因为在绝大部分的情况下,output_columns参数与column_order参数都是同一个值,不需要再传入column_order。若需要调整数据列顺序,使用mindspore.dataset.Dataset.project实现。 - - 说明: - - 1) 在不需要改变列顺序时,直接去掉column_order参数即可。 - 2) 需要指定数据列顺序时,删除column_order参数,并在后面加上一个project方法进行列变换(如下面的例子)。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  >>> dataset = dataset.batch(batch_size=4,
-  ...                         input_columns=["column_a"],
-  ...                         output_columns=["column_b", "column_c"],
-  ...                         column_order=["column_c", "column_b"])
-  
-
-  >>> dataset = dataset.batch(batch_size=4, input_columns=["column_a"],
-  ...                         output_columns=["column_b", "column_c"])
-  >>> dataset = dataset.project(["column_c", "column_b"])
-  
-
- -- 接口名称:mindspore.dataset.Dataset.batch - - 变更内容:将batch方法拆分为:batch和padded_batch两个方法。pad_info参数从batch方法移动到padded_batch方法。 - - 说明:如需使用pad_info参数,改用padded_batch方法。 - - - - - - - - - -
原接口 v2.0.0-rc1接口
-  >>> dataset = dataset.batch(batch_size=4,
-  ...                         drop_remainder=True, pad_info=...)
-  
-
-  >>> dataset = dataset.padded_batch(batch_size=4,
-  ...                                drop_remainder=True, pad_info=...)
-  
-
-
-### Bug fixes
+## MindSpore Lite 2.6.0 Release Notes
-- [I66PE6] Fixed a core dump in the AssignSub operator caused by abnormal input arguments.
+### Major Features and Improvements
-- [I6F5E6] Fixed a timeout when the data_sink method runs on Ascend.
+- [STABLE] MindSpore Lite supports configuring operator-parallel inference acceleration at model conversion time: set the stream_label_file option during conversion to specify the operators that should run in parallel.
+- [STABLE] MindSpore Lite supports converting the if operator in ONNX control flow on the Ascend backend.
-### Others
+### API Changes
-- Windows support is still being optimized, so the rc version does not support Windows; a download will be provided with the official 2.0 release.
+- [STABLE] In the ACL model conversion configuration, a stream_label_file option is added under the ascend_context option to enable multi-stream parallelism.
 ### Contributors
-Thanks goes to these wonderful people:
-
-alashkari, anzhengqi, archer2049, B.L.LAN, baihuawei, bichaoyang, BJ-WANG, Bokai Li, Brian-K, caifubi, caiyimeng, cathwong, changzherui, ChenDonYY, chenfei_mindspore, chengang, chengbin, chenhaozhe, chenjianping, chenkang, chenweifeng, chuht, chujinjin, davidanugraha, DavidFFFan, DeshiChen, douzhixing, emmmmtang, Erpim, Ethan, fangwenyi, fangzehua, fangzhou0329, fary86, fengyixing, gaoshuanglong, Gaoxiong, gaoyong10, gengdongjie, gongdaguo1, Greatpan, GuoZhibin, guozhijian, hangq, hanhuifeng, haozhang, hedongdong, Henry Shi, heterogeneous_to_backoff_2_0, huangbingjian, huanghui, huangxinjing, hujiahui8, hujingsong, huoxinyou, jachua, jiahongQian, jianghui58, jiangzhenguang, jiaorui, jiaoy1224, jijiarong, jjfeing, JoeyLin, json, JuiceZ, jxl, kairui_kou, KevinYi, kisnwang, KXiong, laiyongqiang, lanzhineng, liangchenghui, liangzelang, LiangZhibo, lianliguang, lichen, ligan, lijunbin, limingqi107, ling, linqingke, liubuyu, liuchao, liuchuting, liujunzhu, liuluobin, liutongtong9, liuyang811, lixiao, liyan2022, liyejun, liyuxia, looop5, luochao60, luojianing, luoyang, luoyuan, lyqlola, maning202007, maoyaomin, Margaret_wangrui, mayadong, MaZhiming, melody, mengyuanli, michaelzhu_70ab, Mohammad Motallebi, moran, NaCN, nomindcarry, OwenSec, panfengfeng, panshaowu, panzhihui, pkuliuliu, qinzheng, qiuzhongya, qujianwei, r1chardf1d0, Renyuan Zhang, RobinGrosman, shaojunsong, shenwei41, Soaringfish, tangdezhi_123, tanghuikang, tan-wei-cheng, TinaMengtingZhang, TronZhang, TuDouNi, VectorSL, wang_ziqi, wanghenchang, wangnan39, wangpingan, wangshaocong, wangshengnan123, wangtongyu6, weichaoran, wind-zyx, wqx, wtcheng, wujueying, wYann, XianglongZeng, xiaohanzhang, xiaotianci, xiaoyao, XinDu, xulei, xumengjuan1, xupan, xwkgch, yanghaoran, yangluhang, yangruoqi713, yangshuo, yangsijia, yangzhenzhang, yanzhenxiang2020, Yanzhi_YI, yao_yf, yefeng, yeyunpeng2020, Yi_zhang95, yide12, YijieChen, YingLai Lin, YingtongHu, youshu, yuchaojie, yuedongli, YuJianfeng, zangqx, ZengZitao, zhangbuxue, zhangdanyang, zhangdong, zhangfanghe, zhangqi, zhangqinghua, zhangyanhui, zhangyinxia, zhangyongxian, zhangzhaoju, zhanzhan, zhengzuohe, ZhidanLiu, zhixinaa, zhoufeng, zhouyaqiang0, zhuguodong, zhupuxu, zhuyuxiao, zichun_ye, zjun, zlq2020, zong_shuai, ZPaC, zuochuanyong, zyli2020, 陈宇, 范吉斌, 冯一航, 胡彬, 宦晓玲, 黄勇, 雷元哲, 李良灿, 李林杰, 刘崇鸣, 刘力力, 刘勇琪, 吕浩宇, 吕昱峰(Nate.River), 没有窗户的小巷, 沈竞兴, 十六夜, 王程浩, 王禹程, 王振邦, 徐安越, 徐永飞, 杨旭华, 于振华, 俞涵, 张清华, 张澍坤, 张栩浩, 张学同, 赵英灼, 周超, 周洪叶, 朱家兴
-
-Contributions of any kind are welcome!
-
-## MindSpore Lite 2.0.0-rc1 Release Notes
+熊攀, ZhangZGC, yanghaoran, 李林杰, shenwei41, xiaotianci, panzhihui, guozhijian, 胡彬, tangmengcheng, XianglongZeng, cccc1111, stavewu, 刘思铭, r1chardf1d0, jiangshanfeng
-### Major Features and Improvements
+## MindSpore Lite 2.3.1 Release Notes
-#### MindSpore Lite Cloud Inference
+### Major Features and Improvements
-Earlier MindSpore Lite versions mainly targeted edge devices such as mobile phones and in-vehicle head units. The new cloud inference version supports scenarios with multiple backend hardware resources on the cloud side, supports Ascend and Nvidia GPU dedicated inference cards, and makes efficient use of cloud-side multi-core resources.
+When converting models for the Ascend backend, the [input_shape parameter](https://www.mindspore.cn/lite/docs/zh-CN/r2.3.1/use/cloud_infer/converter_tool_ascend.html) in the configuration file can be used to specify the input size.
-Inference previously integrated through the MindSpore training package can be migrated to an integration based on MindSpore Lite; see [Quick Start to Cloud-side Inference](https://mindspore.cn/lite/docs/zh-CN/r2.0/quick_start/one_hour_introduction_cloud.html). To keep the original integration method, see the [MindSpore Inference FAQ](https://mindspore.cn/docs/zh-CN/r2.0/faq/inference.html).
+### API Changes
-- [STABLE] Supports MindIR model files.
-- [STABLE] Supports converting third-party Onnx, Tensorflow and Caffe models to MindIR model files with the MindSpore Lite conversion tool.
-- [STABLE] One release package supports multiple hardware backends: Ascend, Nvidia GPU, CPU.
-- [STABLE] Supports the `Model` interface and the `ModelParallelRunner` parallel inference interface.
-- [STABLE] Supports C++, Python and Java inference interfaces.
+- The [ModelGroup interface](https://www.mindspore.cn/lite/docs/zh-CN/r2.3.1/use/cloud_infer/runtime_cpp.html) adds support for sharing model weights, which saves device memory.
+- The [Model.get_model_info interface](https://www.mindspore.cn/lite/docs/zh-CN/r2.3.1/use/converter_tool.html?highlight=get_model_info) adds support for obtaining the model's input size.
-#### API
+### Contributors
-- Because the original Python API had many configuration parameters and was complex to use, version 2.0 improves its usability, including adjustments to class constructors and class attributes. In addition, the Python API in 2.0 and later is integrated into the cloud inference scenario and is incompatible with earlier versions. For details, see the [Python API documentation](https://www.mindspore.cn/lite/api/zh-CN/r2.0/mindspore_lite.html).
+熊攀, ZhangZGC, jxl, zhangyanhui, emmmmtang, huandong1, yefeng
-## MindSpore 2.0.0-alpha Release Notes
+## MindSpore Lite 2.3.0-rc2 Release Notes
 ### Major Features and Improvements
-#### PyNative
-
-- The default mode of MindSpore is switched to PyNative. To set the mode manually, see [Computational Graph](https://www.mindspore.cn/tutorials/zh-CN/r2.0.0-alpha/advanced/compute_graph.html).
-- Completed the refactoring of the dynamic shape execution scheme, improving backward graph construction performance and supporting dynamic shape network programming without padding. The networks verified so far are mainly Transformer-GPU, YOLOV5-GPU and ASR-Ascend. Transformer-GPU and YOLOV5-GPU are available from the [models repository](https://gitee.com/mindspore/models/tree/dynamic_shape). The Ascend backend is limited by operator adaptation and supports only the following operators: Add, Assign, BatchMatMul, BiasAdd, BiasAddGrad, Cast, Conv2D, Conv2DBackpropFilter, Conv2DBackpropInput, CTCLoss, Div, Dropout, DropoutDoMask, Equal, ExpandDims, Gather, GetNext, LayerNorm, LayerNormGrad, LessEqual, Load, Log, LogicalAnd, LogicalNot, LogicalOr, LogSoftmax, LogSoftmaxGrad, MatMul, Maximum, Mul, Neg, NotEqual, NPUAllocFloatStatus, NPUClearFloatStatus, OneHot, RealDiv, Reciprocal, ReduceMean, ReduceSum, ReLU, ReluGrad, Reshape, Select, Softmax, StridedSlice, Sub, Tile, Transpose, UnsortedSegmentSum, ZerosLike. Other operators have not been fully verified; use them at your discretion.
-
-#### DataSet
-
-- The TFRecordDataset API can directly read TFRecord files compressed with GZIP or ZLIB.
-- The NumpySlicesDataset API can process data of different dimensions at the same time.
-- Optimized the structure of error log messages to show clearer call stack information for debugging and locating problems.
-- Fixed `mindspore.dataset.config.set_seed` not taking effect for random seed setting in distributed training scenarios.
-
-#### AutoParallel
-
-- Supports distributed capability for more operators.
-
-  Element Wise operators: AddN, BitwiseAnd, BitwiseOr, BitwiseXor, CumProd, HShrink, HSigmoid, IsFinite, Mish, MulNoNan, Rint, SeLU, SoftShrink, TruncateDiv, TruncateMod, Xdivy, Xlogy, InplaceAdd, InplaceSub, InplaceUpdate, Cdist, L2Loss, Lerp.
-
-  Math operators: SquaredDifference, Erfinv, MaskedFill, SplitV, Gamma, KLDivLoss, LinSpace. Scatter operators: ScatterAdd, ScatterDiv, ScatterMax, ScatterMul, ScatterNdAdd, ScatterNdSub, ScatterNdUpdate, ScatterSub, TensorScatterAdd, TensorScatterDiv, TensorScatterMax, TensorScatterMax, TensorScatterMul, TensorScatterAdd, TensorScatterUpdate.
-
-- Added the `transform_checkpoints` and `transform_checkpoint_by_rank` interfaces. Given the strategy files from before and after the conversion, distributed checkpoints can be converted. For details, see [Distributed Resilience Training and Inference](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.0.0-alpha/parallel/resilience_train_and_predict.html).
-
-### API Change
-
-#### operator
-
-- [STABLE] Add operator primitive for `mindspore.ops.AdaptiveMaxPool3D`.
-- [STABLE] Add operator primitive for `mindspore.ops.AdjustHue`.
-- [STABLE] Add operator primitive for `mindspore.ops.BartlettWindow`.
-- [STABLE] Add operator primitive for `mindspore.ops.BesselJ0`.
-- [STABLE] Add operator primitive for `mindspore.ops.BesselJ1`.
-- [STABLE] Add operator primitive for `mindspore.ops.BesselK0`.
-- [STABLE] Add operator primitive for `mindspore.ops.BesselK0e`.
-- [STABLE] Add operator primitive for `mindspore.ops.BesselK1`.
-- [STABLE] Add operator primitive for `mindspore.ops.BesselK1e`.
-- [STABLE] Add operator primitive for `mindspore.ops.BesselY0`.
-- [STABLE] Add operator primitive for `mindspore.ops.BesselY1`.
-- [STABLE] Add operator primitive for `mindspore.ops.Betainc`.
-- [STABLE] Add operator primitive for `mindspore.ops.Bincount`.
-- [STABLE] Add operator primitive for `mindspore.ops.BlackmanWindow`.
-- [STABLE] Add operator primitive for `mindspore.ops.Bucketize`.
-- [STABLE] Add operator primitive for `mindspore.ops.CombinedNonMaxSuppression`.
-- [STABLE] Add operator primitive for `mindspore.ops.CompareAndBitpack`.
-- [STABLE] Add operator primitive for `mindspore.ops.Complex`.
-- [STABLE] Add operator primitive for `mindspore.ops.DataFormatVecPermute`.
-- [STABLE] Add operator primitive for `mindspore.ops.EuclideanNorm`.
-- [STABLE] Add operator primitive for `mindspore.ops.Expand`.
-- [STABLE] Add operator primitive for `mindspore.ops.ExtractGlimpse`.
-- [STABLE] Add operator primitive for `mindspore.ops.FillDiagonal`.
-- [STABLE] Add operator primitive for `mindspore.ops.FractionalAvgPool`.
-- [STABLE] Add operator primitive for `mindspore.ops.FractionalMaxPool`.
-- [STABLE] Add operator primitive for `mindspore.ops.Gcd`.
-- [STABLE] Add operator primitive for `mindspore.ops.HammingWindow`.
-- [STABLE] Add operator primitive for `mindspore.ops.Histogram`.
-- [STABLE] Add operator primitive for `mindspore.ops.HSVToRGB`.
-- [STABLE] Add operator primitive for `mindspore.ops.Lcm`.
-- [STABLE] Add operator primitive for `mindspore.ops.LeftShift`.
-- [STABLE] Add operator primitive for `mindspore.ops.ListDiff`.
-- [STABLE] Add operator primitive for `mindspore.ops.LogSpace`.
-- [STABLE] Add operator primitive for `mindspore.ops.Lstsq`.
-- [STABLE] Add operator primitive for `mindspore.ops.MatrixDiagPartV3`.
-- [STABLE] Add operator primitive for `mindspore.ops.MatrixDiagV3`.
-- [STABLE] Add operator primitive for `mindspore.ops.MatrixExp`.
-- [STABLE] Add operator primitive for `mindspore.ops.MatrixPower`.
-- [STABLE] Add operator primitive for `mindspore.ops.MaxPool3DWithArgmax`.
-- [STABLE] Add operator primitive for `mindspore.ops.MaxUnpool2D`.
-- [STABLE] Add operator primitive for `mindspore.ops.MultilabelMarginLoss`.
-- [STABLE] Add operator primitive for `mindspore.ops.NextAfter`.
-- [STABLE] Add operator primitive for `mindspore.ops.Orgqr`.
-- [STABLE] Add operator primitive for `mindspore.ops.ReduceStd`.
-- [STABLE] Add operator primitive for `mindspore.ops.RGBToHSV`.
-- [STABLE] Add operator primitive for `mindspore.ops.RightShift`.
-- [STABLE] Add operator primitive for `mindspore.ops.SampleDistortedBoundingBoxV2`.
-- [STABLE] Add operator primitive for `mindspore.ops.ScaleAndTranslate`.
-- [STABLE] Add operator primitive for `mindspore.ops.ScatterAddWithAxis`.
-- [STABLE] Add operator primitive for `mindspore.ops.ScatterNdDiv`.
-- [STABLE] Add operator primitive for `mindspore.ops.ScatterNdMax`.
-- [STABLE] Add operator primitive for `mindspore.ops.ScatterNdMul`.
-- [STABLE] Add operator primitive for `mindspore.ops.STFT`.
-- [STABLE] Add operator primitive for `mindspore.ops.Trace`.
-- [STABLE] Add operator primitive for `mindspore.ops.UpsampleNearest3D`.
-- [STABLE] Add operator primitive for `mindspore.ops.UpsampleTrilinear3D`.
-- [STABLE] Add distributed checkpoint transformation interface `mindspore.parallel.transform_checkpoints`.
-- [STABLE] Add distributed checkpoint transformation interface `mindspore.parallel.transform_checkpoint_by_rank`.
-
-#### Backwards Incompatible Change
-
-##### Python API
-
-- `mindspore.ms_function` is renamed to `mindspore.jit`; `mindspore.ms_function` will be deprecated and removed in a future version.
-- `mindspore.ms_class` is renamed to `mindspore.jit_class`; `mindspore.ms_class` will be deprecated and removed in a future version.
-- `mindspore.ops.ms_kernel` is renamed to `mindspore.ops.kernel`; `mindspore.ops.ms_kernel` will be deprecated and removed in a future version.
-- The `column_order` parameter of `mindspore.dataset.map` no longer takes effect; use `mindspore.dataset.project` instead.
-- `mindspore.dataset.close_pool`, `mindspore.dataset.to_device` and `mindspore.dataset.set_dynamic_columns` were deprecated in earlier versions and are removed in this version.
-
-### Bug fixes
-
-- Fixed the mixed-precision functional interface being unable to modify the backend driver in graph mode.
-- Fixed the following networks so that device_id can be passed in automatically in single-device scenarios (mobilenetv1/fasterrcnn/yolov3/yolov4/yolov5/unet/openpose/simplepose/crnn/gnmtv2/faceattribute/facequality/facedetection).
+- [STABLE] Supports configuring FlashAttention-related attributes in the configuration file used by the cloud-side conversion tool.
+- [STABLE] Supports memory sharing across multiple cards.
 ### Contributors
 Thanks goes to these wonderful people:
-AGroupofProbiotocs, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, dong-li001, fary86, fuzhiye, Gaoxiong, GAO_HYP_XYJ, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, heleiwang, hesham, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Jiabin Liu, jianghui58, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, liuyongqi, laiyongqiang, leonwanghui, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, Lin Xh, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, Lixia, lixian, liyanliu, liyong, lizhenyu, luopengting, lvchangquan, lvliang, lz, maning202007, Margaret_wangrui, mengyuanli, Ming_blue, ms_yan, ougongchang, panfengfeng, panyifeng, Payne, Peilin, peixu_ren, Pengyongrong, qianlong, qianjiahong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, , Wan, wandongdong, wangdongxu, wangmin, wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wudenggang, wukesong, wuweikang, wuxuejian, Xiao Tianci, Xiaoda, xiefangqi, xinyunfan, xuanyue, xuyongfei, yanghaitao, yanghaitao1, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, yuximiao, zengzitao, Zhang, zhanghuiyao, zhanghui_china, zhangxinfeng3, zhangyihui, zhangz0911gm, zhanke, zhanyuan, zhaodezan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, zhiqwang, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, Zichun, Ziyan, zjun, ZPaC, wangfengwfwf, zymaa, gerayking, shu-kun-zhang.
+emmmmtang, 熊攀
 Contributions of any kind are welcome!
-## MindSpore 1.10.1 Release Notes
+## MindSpore Lite 2.2.11 Release Notes
 ### Bug Fixes
-- Fixed logsumexp overflow protection not taking the specified axis into account.
-- Fixed the compilation dependency issue of proto files.
-- Fixed abnormal output from the print operator.
-- Fixed an out-of-bounds issue in the equal operator.
-- Fixed incorrect cell_id parsing after a function is decorated with @jit.
-- Fixed a data type validation error in GNN scenarios.
-- Fixed Dataset map multiprocessing degrading to threads.
+- [#I8TPLY] Fixed the inference failure of the SSD MobileNetV2 FPN network on Atlas inference series products.
 ### Contributors
 Thanks goes to these wonderful people:
-archer2049, caifubi, chenfei_mindspore, gaoshuanglong, Greatpan, guozhijian, huoxinyou, Kxiong, lanzhineng, lijunbin, liubuyu, liuchuting, luochao60, lyqlola, nomindcarry, TuDouNi, xiaotianci, xupan, yangshuo, yefeng, YingtongHu, yuchaojie, zhoufeng, ZPaC, 刘勇琪, 吕昱峰, 王禹程, 于振华.
+wangtongyu6, zhuguodong, 徐永飞, 徐安越, yeyunpeng2020, moran, XinDu, gengdongjie.
 Contributions of any kind are welcome!
-## MindSpore 1.10.0 Release Notes
-
-### Major Features and Improvements
-
-#### DataSet
-
-- [STABLE] Adjusted the wait timeout of sink mode; the default is now 600s. This addresses GetNext operator wait timeouts that easily occur in data sink mode because of environment resource contention, heavy computation and similar factors.
+## MindSpore Lite 2.2.10 Release Notes
-### Bug fixes
+### Bug Fixes
-- Fixed some Primitive operators in AMP that could not be instantiated in graph mode, which made the interface unavailable.
-- Fixed the DynamicRNN operator execution failure in LSTM networks under the computing power splitting scenario on the Ascend platform.
-- Fixed DEVICE_ID being hard-coded in the launch scripts of single-device training scripts for networks such as mobilenet, fasterrcnn and yolo.
+- [#I8K7CC] Improved the error message raised when a non-str field is passed to the get_model_info interface.
 ### Contributors
 Thanks goes to these wonderful people:
-AGroupofProbiotocs, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, dong-li001, fary86, fuzhiye, Gaoxiong, GAO_HYP_XYJ, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, heleiwang, hesham, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Jiabin Liu, jianghui58, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, liuyongqi, laiyongqiang, leonwanghui, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, Lin Xh, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, Lixia, lixian, liyanliu, liyong, lizhenyu, luopengting, lvchangquan, lvliang, lz, maning202007, Margaret_wangrui, mengyuanli, Ming_blue, ms_yan, ougongchang, panfengfeng, panyifeng, Payne, Peilin, peixu_ren, Pengyongrong, qianlong, qianjiahong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, , Wan, wandongdong, wangdongxu, wangmin, wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wudenggang, wukesong, wuweikang, wuxuejian, Xiao Tianci, Xiaoda, xiefangqi, xinyunfan, xuanyue, xuyongfei, yanghaitao, yanghaitao1, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, yuximiao, zengzitao, Zhang, zhanghuiyao, zhanghui_china, zhangxinfeng3, zhangyihui, zhangz0911gm, zhanke, zhanyuan, zhaodezan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, zhiqwang, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, Zichun, Ziyan, zjun, ZPaC, wangfengwfwf, zymaa, gerayking, shu-kun-zhang.
+gengdongjie, zhangyanhui, xiaoxiongzhu, wangshaocong, jianghui58, moran, wangtongyu6, 徐安越, qinzheng, 徐永飞, youshu, XinDu, yeyunpeng2020, yefeng, wangpingan, zjun, 胡安东, 刘力力, 陈宇, chenjianping, kairui_kou, zhangdanyang, hangq, mengyuanli, 刘崇鸣
 Contributions of any kind are welcome!
-## MindSpore Lite 1.10.0 Release Notes - -### Bug fixes - -- 修复Arithmetic类CPU算子动态shape场景下可能的计算精度问题。 -- 修复Deconv int8量化算子重量化写入地址错误问题。 - -## MindSpore 1.9.0 Release Notes - -### 主要特性和增强 - -#### FrontEnd - -- [STABLE] 新增面向对象+函数式融合编程范式,提供 `mindspore.amp.LossScaler` 、 `mindspore.amp.DynamicLossScaler` 、 `mindspore.amp.StaticLossScaler` 、 `mindspore.amp.auto_mixed_precision` 、 `mindspore.amp.all_finite` 等融合编程范式下的混合精度接口。 - -### API变更 - -#### 算子 - -- [STABLE] `nn.AdaptiveAvgPool3d` 新增nn接口。 -- [STABLE] `ops.adaptive_avg_pool3d` 新增functional接口。 -- [STABLE] `ops.addcdiv` 新增functional接口。 -- [STABLE] `ops.addcmul` 新增functional接口。 -- [STABLE] `ops.approximate_equal` 新增GPU、CPU支持。 -- [STABLE] `ops.atanh` 新增GPU支持。 -- [STABLE] `ops.bessel_i0` 新增GPU支持。 -- [STABLE] `ops.bessel_i0e` 新增Ascend支持。 -- [STABLE] `ops.bessel_i1` 新增GPU支持。 -- [STABLE] `ops.bessel_i1e` 新增Ascend、GPU支持。 -- [STABLE] `ops.bessel_j0` 新增GPU支持。 -- [STABLE] `ops.bessel_j1` 新增GPU支持。 -- [STABLE] `ops.bessel_k0` 新增GPU支持。 -- [STABLE] `ops.bessel_k0e` 新增GPU支持。 -- [STABLE] `ops.bessel_k1` 新增GPU支持。 -- [STABLE] `ops.bessel_k1e` 新增GPU支持。 -- [STABLE] `ops.bessel_y0` 新增GPU支持。 -- [STABLE] `ops.bessel_y1` 新增GPU支持。 -- [STABLE] `ops.bias_add` 新增functional接口。 -- [STABLE] `ops.bitwise_and` 新增GPU支持。 -- [STABLE] `ops.bitwise_or` 新增GPU支持。 -- [STABLE] `ops.bitwise_xor` 新增GPU支持。 -- [STABLE] `ops.grid_sample` 新增Ascend支持。 -- [STABLE] `ops.inplace_update` 新增CPU支持。 -- [STABLE] `ops.isclose` 新增Ascend、GPU支持。 -- [STABLE] `ops.isnan` 新增Ascend支持。 -- [STABLE] `ops.lerp` 新增GPU支持。 -- [STABLE] `ops.random_poisson` 新增functional接口。 -- [STABLE] `ops.reverse_sequence` 新增functional接口。 -- [STABLE] `ops.scatter_mul` 新增GPU支持。 -- [STABLE] `ops.scatter_nd_max` 新增functional接口。 -- [STABLE] `ops.scatter_nd_min` 新增functional接口。 -- [STABLE] `ops.SparseToDense` 新增GPU支持。 -- [STABLE] `ops.square` 新增functional接口。 -- [STABLE] `ops.standard_laplace` 新增GPU支持。 -- [STABLE] `ops.std` 新增functional接口。 -- [STABLE] `ops.trunc` 新增Ascend、GPU支持。 -- [STABLE] `ops.unsorted_segment_sum` 
新增functional接口。 -- [STABLE] `ops.xdivy` 新增functional接口。 -- [STABLE] `ops.xlogy` 新增GPU支持。 -- `ops.poisson` 接口废弃使用,对应新接口为 `ops.random_poisson` 。 -- `ops.SparseApplyAdagrad` 接口废弃使用,可使用 `ops.SparseApplyAdagradV2` 接口替代。 - -### Bug fixes +## MindSpore Lite 2.2.1 Release Notes -- [BUGFIX] 修改混合精度O2 level的判断逻辑,在原来屏蔽 `BatchNorm1d` 、 `BatchNorm2d` 算子的基础上,添加另外两个屏蔽算子`BatchNorm3d`和`LayerNorm`,这4个算子依然用float32数据类型计算。 +### Bug Fixes -- [BUGFIX] Dataset处理字符串类型数据时,若调用`create_dict_iterator`或`create_tuple_iterator`接口时指定了`output_numpy=True`,获取到的数据会是`numpy.bytes_`类型。修复此问题后接口会直接返回`numpy.str_`类型数据,用户无需再对其进行字符串解码操作。同样,在使用自定义数据处理函数时,接收到的数据也将直接是`numpy.str_`类型,与原始数据类型相匹配。 +- [#I88055] 修复MindSpore Lite推理gridsample算子format设置错误的问题。 +- [#I8D80Y] 修复MindSpore Lite推理单算子调用流程资源释放异常的问题。 ### 贡献者 感谢以下人员做出的贡献: -AGroupofProbiotocs, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, dong-li001, fary86, fuzhiye, Gaoxiong, GAO_HYP_XYJ, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, hesham, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Jiabin Liu, jianghui58, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, liuyongqi, laiyongqiang, leonwanghui, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, Lin Xh, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, liyanliu, lizhenyu, lvchangquan, lvliang, lz, maning202007, Margaret_wangrui, mengyuanli, Ming_blue, ms_yan, panfengfeng, panyifeng, Payne, peixu_ren, Pengyongrong, qianjiahong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, 
tinazhang, VectorSL, Wan, wandongdong, wangdongxu, wangmin, wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wudenggang, wukesong, wuweikang, Xiao Tianci, Xiaoda, xiefangqi, xinyunfan, xuanyue, xuyongfei, yanghaitao, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, yuximiao, zengzitao, Zhang, zhanghuiyao, zhanghui_china, zhangxinfeng3, zhangyihui, zhangz0911gm, zhanyuan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, zhiqwang, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, Zichun, Ziyan, zjun, ZPaC, wangfengwfwf, zymaa, gerayking, shu-kun-zhang. +zhanghaibo, wangsiyuan, yefeng, wangshaocong, chenjianping 欢迎以任何形式对项目提供贡献! -## MindSpore 1.8.1 Release Notes - -### API变更 - -#### 算子 - -- [STABLE] ops.ApplyAdagradDA 新增GPU、CPU支持。 -- [STABLE] ops.ApplyAdagradV2 新增CPU支持。 -- [STABLE] ops.ApplyCenteredRmsProp 新增Ascend动态shape支持。 -- [STABLE] ops.ApplyFtrl 新增CPU支持。 -- [STABLE] ops.ApplyGradientDescent 新增CPU支持。 -- [STABLE] ops.ApplyPowerSign 新增CPU支持。 -- [STABLE] ops.ApplyProximalAdagrad 新增GPU、CPU支持。 -- [STABLE] ops.ApplyRmsProp 新增Ascend动态shape支持。 -- [STABLE] ops.max 新增functional接口。 -- [STABLE] ops.atan2 新增functional接口。 -- [STABLE] ops.cummax 新增GPU支持。 -- [STABLE] ops.cummin 新增GPU、CPU支持。 -- [STABLE] ops.diag 新增GPU支持。 -- [STABLE] ops.expand_dims 新增functional接口。 -- [STABLE] ops.gather_elements 新增functional接口。 -- [STABLE] ops.grid_sample 新增GPU支持。 -- [STABLE] ops.hardswish 新增Ascend支持。 -- [BETA] ops.index_fill 新增GPU支持。 -- [BETA] ops.inplace_update 新增CPU支持。 -- [BETA] nn.InstanceNorm1d 新增GPU支持。 -- [BETA] nn.InstanceNorm2d 新增GPU支持。 -- [BETA] nn.InstanceNorm3d 新增GPU支持。 -- [STABLE] ops.log1p 新增functional接口。 -- [STABLE] ops.masked_fill 新增GPU、CPU支持。 -- [BETA] ops.matrix_diag_part 新增GPU支持。 -- [BETA] ops.matrix_diag 新增GPU支持。 -- [BETA] ops.matrix_set_diag 新增GPU支持。 -- [STABLE] ops.max_pool3d 新增GPU支持。 -- [STABLE] ops.nll_loss 新增functional接口。 -- [STABLE] ops.one_hot 新增functional接口。 -- 
[STABLE] ops.pad 新增functional接口。 -- [STABLE] ops.random_gamma 新增CPU支持。 -- [STABLE] ops.amax 新增functional接口。 -- [STABLE] ops.mean 新增functional接口。 -- [STABLE] ops.amin 新增functional接口。 -- [STABLE] ops.prod 新增functional接口。 -- [STABLE] ops.renorm 新增Ascend、GPU、CPU支持。 -- [BETA] ops.tensor_scatter_elements 新增Ascend、GPU、CPU支持。 -- [STABLE] ops.scatter_max 新增GPU支持。 -- [STABLE] ops.scatter_min 新增GPU支持。 -- [STABLE] ops.scatter_nd 新增functional接口。 -- [STABLE] ops.scatter_nd_max 新增GPU支持。 -- [STABLE] ops.scatter_update 新增functional接口。 -- [STABLE] ops.binary_cross_entropy_with_logits 新增CPU支持。 -- [STABLE] ops.smooth_l1_loss 新增functional接口。 -- [STABLE] ops.space_to_batch_nd 新增CPU支持。 -- [STABLE] ops.SparseApplyAdagrad 新增GPU、CPU支持。 -- [STABLE] ops.sparse_segment_mean 新增GPU、CPU支持。 -- [STABLE] ops.squeeze 新增functional接口。 -- [STABLE] ops.standard_laplace 新增CPU支持。 -- [BETA] nn.ReflectionPad1d 新增Ascend、GPU、CPU支持。 -- [BETA] nn.ReflectionPad2d 新增Ascend、GPU、CPU支持。 -- [STABLE] nn.SiLU 新增Ascend、GPU、CPU支持。 -- [STABLE] ops.transpose 新增functional接口。 -- [STABLE] ops.uniform_candidate_sampler 新增CPU支持。 -- [STABLE] ops.uniform 新增functional接口。 -- [STABLE] ops.unique_with_pad 新增GPU支持。 -- [STABLE] ops.unstack 新增functional接口。 -- [BETA] ops.interpolate 新增GPU、CPU支持。 -- [STABLE] ops.xdivy 新增CPU支持。 -- [STABLE] ops.xlogy 新增CPU支持。 - -## MindSpore 1.8.0 Release Notes +## MindSpore Lite 2.2.0 Release Notes ### 主要特性和增强 -#### FrontEnd - -- [BETA] 提供`mindspore.train.Model.fit` API,增加两种callback方法 `mindspore.train.callback.EarlyStopping` 和 `mindspore.train.callback.ReduceLROnPlateau`。 -- [BETA] 自定义算子支持Julia算子。 -- [BETA] 自定义算子支持Hybrid DSL算子。 -- [STABLE] export()接口支持自定义加密算法导出模型,load()接口支持自定义解密算法导入模型。 -- [BETA] [动静统一] [易用性] 图编译支持常量类型设置可变(1.8版本支持tuple/list/dict)。 -- [BETA] [动静统一] 常量场景下控制流内支持JIT Fallback功能。 -- [STABLE] [动静统一] 支持图模式常量场景下Python raise语句。 -- [STABLE] [动静统一] 支持图模式常量场景下Python assert语句。 -- [STABLE] [动静统一] 支持图模式常量场景下Python print语句。 -- [STABLE] [动静统一] 支持图模式str.format()方法。 -- [STABLE] [动静统一] 支持图模式用slice方法对list赋值。 -- 
[STABLE] [动静统一] 图模式支持创建和调用自定义类的实例。 -- [STABLE] [动静统一] 支持从Cell数组/自定义类数组中获取类的属性。 -- [STABLE] [动静统一] 图模式下isinstance支持场景扩展。 -- [STABLE] 自定义算子修饰符'ms_hybrid'重名为'ms_kernel'。 -- [BETA] 自定义算子Hybrid DSL支持CPU后端。 -- [BETA] 自定义算子昇腾后端新增自定义调度原语语法支持。 +#### 支持FlashAttention算子融合 -#### PyNative +- [STABLE] 在Ascend系列硬件上,支持LLAMA、stable diffusion系列模型的FlashAttention大算子融合。 -- [STABLE] 实现AdamWeightDecay算子,替代原有小算子组合方式。 -- [STABLE] 动态图下使用动静结合的方式执行优化器。 -- [STABLE] 优化PyNative反向图和ms_function的执行性能。 +## MindSpore Lite 2.1.1 Release Notes -#### Auto Parallel +### Major Features and Improvements -- [STABLE] 对接AllToAll单算子模式。在图编译等级为O0下,支持AllToAll算子调用。 -- [STABLE] 整图下沉支持MPI启动。整图下沉的模式下,支持使用MPI的方式启动。 -- [STABLE] 模型权重的Seed提供并行接口配置。在用户不通过mindspore.set_seed设置随机数种子时,每个参数初始化的随机数种子为当前分片索引决定。当配置随机数种子之后,相同shape以及相同切分策略的权重,其初始化的结果一致。 -- [STABLE] HCCL屏蔽内部全连接/非全连接。允许一次训练过程中同时有全连接AllToAllv和分级AllToAllv。 -- [BETA] CPU优化器融合。通过优化器跨参数融合,将多个优化器算子按数据类型融合成,带来性能提升。目前已在CPU AdamWeightDecay优化器上做过验证。用户可以通过网络cell类中的flatten_weights方法启用该功能。 +- [STABLE] MindSpore Lite Cloud Inference adds support for Python 3.8 and Python 3.9 -#### Executor +## MindSpore Lite 2.1.0 Release Notes -- [STABLE] 开放南向芯片对接接口。 -- [STABLE] 使用多Actor融合执行提升运行时的执行性能。 -- [STABLE] NopOp算子(eg. 
Reshape)执行消除。 -- [STABLE] Embedding Cache架构切换统一分布式运行时。 -- [STABLE] Parameter Server训练切换统一分布式运行时。 -- [STABLE] 支持CPU Parameter Server模式训练。 +### 主要特性和增强 -#### DataSet +#### MindSpore Lite云侧推理 -- [STABLE] 对于数据集对象使用map操作时,同时num_parallel_workers>1并且python_multiprocessing=True时,进行了多进程的机制优化,使得数据通道与子进程一一映射,避免了过多的文件句柄占用,同时close_pool这个接口也被删除。 -- [STABLE] 新增一批Vision、Text和Audio类数据增强操作。 -- [STABLE] 修复数据集类的flat_map方法未将结果展平的错误。 -- [STABLE] 统一数据集增强API的导入路径,提供更简单的使用方法,请参阅[最新的API用法](https://www.mindspore.cn/docs/zh-CN/r1.8/api_python/mindspore.dataset.vision.html)。 +- [STABLE] 支持Ascend硬件后端单卡大模型以及单机多卡分布式大模型高性能推理。 +- [STABLE] Python API Ascend后端支持多模型共享工作空间(Workspace)内存。 +- [STABLE] [通过ModelGroup新增支持多模型共享权重](https://mindspore.cn/lite/docs/zh-CN/r2.1/use/cloud_infer/runtime_cpp.html#%E5%A4%9A%E6%A8%A1%E5%9E%8B%E5%85%B1%E4%BA%AB%E6%9D%83%E9%87%8D),比如大模型场景下全量模型和增量模型共享权重。 -### API变更 +#### API -#### 算子 +新增ModelGroup [Python](https://www.mindspore.cn/lite/api/zh-CN/r2.1/mindspore_lite/mindspore_lite.ModelGroup.html#mindspore_lite.ModelGroup)和[C++](https://mindspore.cn/lite/api/zh-CN/r2.1/api_cpp/mindspore.html#modelgroup)接口,接口定义如下: -- [STABLE] ops.adaptive_avg_pool2d 新增GPU支持。 -- [BETA] ops.adaptive_max_pool2d 新增Ascend、GPU、CPU支持。 -- [BETA] ops.approximate_equal 新增CPU支持。 -- [STABLE] ops.argmin 新增CPU支持。 -- [BETA] ops.assign_sub 新增CPU支持。 -- [STABLE] ops.bernoulli 新增GPU支持。 -- [BETA] ops.bessel_i0 新增CPU支持。 -- [BETA] ops.bessel_i0e 新增CPU支持。 -- [BETA] ops.bessel_i1 新增CPU支持。 -- [BETA] ops.bessel_i1e 新增CPU支持。 -- [STABLE] ops.bessel_j0 新增CPU支持。 -- [STABLE] ops.bessel_j1 新增CPU支持。 -- [STABLE] ops.bessel_k0 新增CPU支持。 -- [STABLE] ops.bessel_k0e 新增CPU支持。 -- [BETA] ops.bessel_k1 新增CPU支持。 -- [BETA] ops.bessel_k1e 新增CPU支持。 -- [STABLE] ops.bessel_y0 新增CPU支持。 -- [STABLE] ops.bessel_y1 新增CPU支持。 -- [STABLE] ops.bitwise_and 新增CPU支持。 -- [STABLE] ops.bitwise_or 新增CPU支持。 -- [STABLE] ops.bitwise_xor 新增CPU支持。 -- [STABLE] ops.broadcast_to 新增functional接口。 -- [BETA] ops.ceil 新增GPU、CPU支持。 -- [BETA] ops.col2im 新增GPU支持。 -- 
[BETA] ops.concat 新增functional接口。 -- [STABLE] ops.cosh 新增GPU支持。 -- [STABLE] ops.ctc_greedy_decoder 新增Ascend、CPU支持。 -- [BETA] ops.DataFormatDimMap 新增GPU、CPU支持。 -- [BETA] ops.dropout2d 新增GPU、CPU支持。 -- [BETA] ops.dropout3d 新增CPU支持。 -- [BETA] ops.erf 新增CPU支持。 -- [BETA] ops.erfc 新增CPU支持。 -- [STABLE] ops.expand_dims 新增functional接口。 -- [STABLE] ops.fast_gelu 新增GPU、CPU支持。 -- [STABLE] ops.flatten Ascend动态shape支持。 -- [BETA] ops.ger 新增GPU、CPU支持。 -- [STABLE] ops.gumbel_softmax 新增Ascend、GPU、CPU支持。 -- [BETA] ops.hardshrink 新增GPU、CPU支持。 -- [BETA] ops.index_add 新增CPU支持。 -- [BETA] ops.inplace_add 新增CPU支持。 -- [BETA] ops.inplace_sub 新增CPU支持。 -- [STABLE] ops.intopk 新增CPU支持。 -- [STABLE] ops.inv 新增GPU、CPU支持。 -- [STABLE] ops.invert 新增GPU、CPU支持。 -- [BETA] ops.isclose 新增CPU支持。 -- [STABLE] ops.lerp 新增CPU支持。 -- [BETA] ops.linspace 新增CPU支持。 -- [BETA] ops.log_softmax 新增functional接口。 -- [BETA] ops.norm 新增Ascend、GPU、CPU支持。 -- [BETA] ops.lrn 新增CPU支持。 -- [BETA] ops.masked_select 新增GPU支持。 -- [BETA] ops.matrix_band_part 新增GPU、CPU支持。 -- [BETA] ops.matrix_solve 新增GPU、CPU支持。 -- [BETA] ops.meshgrid 新增CPU支持。 -- [STABLE] ops.mish 新增CPU支持。 -- [BETA] ops.nonzero 新增GPU支持。 -- [STABLE] ops.padding 新增GPU、CPU支持。 -- [BETA] ops.pow 新增Ascend动态shape支持。 -- [BETA] ops.range 新增functional接口。 -- [BETA] ops.round 新增Ascend动态shape支持。 -- [STABLE] ops.scatter_add 新增Ascend动态shape支持。 -- [STABLE] ops.scatter_div 新增Ascend动态shape支持。 -- [BETA] ops.scatter_max 新增GPU支持。 -- [BETA] ops.scatter_min 新增GPU支持。 -- [BETA] ops.scatter_nd_add 新增CPU支持。 -- [STABLE] ops.scatter_nd_div 新增GPU、CPU支持。 -- [STABLE] ops.scatter_nd_min 新增GPU、CPU支持。 -- [STABLE] ops.scatter_nd_mul 新增GPU、CPU支持。 -- [BETA] ops.scatter_nd_sub 新增CPU支持。 -- [STABLE] ops.scatter_update 新增Ascend动态shape支持。 -- [BETA] ops.select 新增Ascend动态shape支持。 -- [BETA] ops.selu 新增GPU、CPU支持。 -- [BETA] ops.soft_shrink 新增GPU、CPU支持。 -- [BETA] ops.softsign 新增CPU支持。 -- [STABLE] ops.tan 新增GPU支持。 -- [BETA] ops.tensor_scatter_add 新增Ascend、CPU支持。 -- [STABLE] ops.tensor_scatter_div 新增GPU、CPU支持。 -- [STABLE] 
ops.tensor_scatter_mul 新增GPU、CPU支持。 -- [BETA] ops.tensor_scatter_sub 新增Ascend、CPU支持。 -- [STABLE] nn.AdaptiveAvgPool1d 新增Ascend、GPU、CPU支持。 -- [STABLE] nn.AdaptiveMaxPool1d 新增Ascend、GPU、CPU支持。 -- [BETA] nn.BiDense 新增Ascend、GPU、CPU支持。 -- [STABLE] nn.ConstantPad1d 新增Ascend、GPU、CPU支持。 -- [STABLE] nn.ConstantPad2d 新增Ascend、GPU、CPU支持。 -- [STABLE] nn.ConstantPad3d 新增Ascend、GPU、CPU支持。 -- [STABLE] nn.Hardtanh 新增Ascend、GPU、CPU支持。 -- [STABLE] nn.HuberLoss 新增Ascend、GPU、CPU支持。 -- [STABLE] nn.RReLU 新增Ascend、GPU、CPU支持。 -- [STABLE] nn.Tanhshrink 新增Ascend、GPU、CPU支持。 -- [STABLE] nn.Threshold 新增Ascend、GPU、CPU支持。 -- [STABLE] nn.ZeroPad2d 新增Ascend、GPU、CPU支持。 -- [BETA] ops.unique_consecutive 新增GPU支持。 -- [STABLE] ops.unsorted_segment_max 新增CPU支持。 -- [STABLE] ops.unsorted_segment_min 新增CPU支持。 -- [STABLE] ops.unsorted_segment_prod 新增GPU支持。 +```python +class ModelGroup + def __init__(self, flags=ModelGroupFlag.SHARE_WORKSPACE) + def add_model(self, models) + def cal_max_size_of_workspace(self, model_type, context) +``` -#### 非兼容性变更 +```C++ +// class ModelGroup +ModelGroup(ModelGroupFlag flags = ModelGroupFlag::kShareWorkspace); +Status AddModel(const std::vector &model_path_list); +Status AddModel(const std::vector> &model_buff_list); +Status AddModel(const std::vector &model_list); +Status AddModel(const std::vector &model_list); +``` -##### Python API +## MindSpore Lite 2.0.0-rc1 Release Notes -- 不再支持DVPP模拟算法,删除 `mindspore.dataset.vision.c_transforms.SoftDvppDecodeRandomCropResizeJpeg` 和 `mindspore.dataset.vision.c_transforms.SoftDvppDecodeResizeJpeg` 接口。 -- LossMonitor中增加`on_train_epoch_end` 方法,实现在 `mindspore.train.Model.fit` 中使用时,打印epoch级别的metric信息。 -- TimeMonitor打印内容变更,打印内容加入"train"或"eval"用于区分训练和推理阶段。 -- load_checkpoint 接口的`filter_prefix`:不再支持空字符串(""),匹配规则由强匹配修改为模糊匹配。 +### 主要特性和增强 -#### import优化 +#### MindSpore Lite云侧推理 -mindspore.context、mindspore.parallel、mindspore.profiler、mindspore.train模块的接口可直接在mindspore模块使用。原有用法仍可以继续支持。 +原MindSpore 
Lite版本主要面向手机、车机等边缘设备,新增云侧推理版本支持云侧多后端硬件资源的场景,支持Ascend及Nvidia GPU推理专用卡,高效利用云侧多核资源。 -例如: +原通过MindSpore训练版本集成的推理方式可以变更为基于MindSpore Lite进行适配集成,具体可参考[云侧推理快速入门](https://mindspore.cn/lite/docs/zh-CN/r2.0/quick_start/one_hour_introduction_cloud.html),如果想要保持原始集成方式可以参考[MindSpore推理FAQ](https://mindspore.cn/docs/zh-CN/r2.0/faq/inference.html)。 -- `mindspore.context.set_context`可简化为`mindspore.set_context`。 -- `mindspore.parallel.set_algo_parameters`可简化为`mindspore.set_algo_parameters`。 -- `mindspore.profiler.Profiler`可简化为`mindspore.Profiler`。 -- `mindspore.train.callback.Callback`可简化为`mindspore.train.Callback`。 +- [STABLE] 支持MindIR模型文件。 +- [STABLE] 支持将第三方Onnx、Tensorflow、Caffe模型通过MindSpore Lite转换工具转换为MindIR模型文件。 +- [STABLE] 一个发布包支持多种硬件后端:Ascend、Nvidia GPU、CPU。 +- [STABLE] 支持`Model`接口和`ModelParallelRunner`并行推理接口。 +- [STABLE] 支持C++、Python和Java推理接口。 -API页面统一汇总至:。 +#### API -### 贡献者 +- 因原Python API配置参数较多、使用较复杂,因此在2.0版本针对Python API易用性进行优化,包括类构造方法、类属性的调整等,此外2.0及之后的Python API将整合到云侧推理场景,与旧版本不兼容。详细参见[Python API说明文档](https://www.mindspore.cn/lite/api/zh-CN/r2.0/mindspore_lite.html)。 -感谢以下人员做出的贡献: +## MindSpore Lite 1.10.0 Release Notes -AGroupofProbiotocs, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, dong-li001, fary86, fuzhiye, Gaoxiong, GAO_HYP_XYJ, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, heleiwang, hesham, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Jiabin Liu, jianghui58, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, liuyongqi, laiyongqiang, leonwanghui, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, 
linqingke, Lin Xh, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, Lixia, lixian, liyanliu, liyong, lizhenyu, luopengting, lvchangquan, lvliang, lz, maning202007, Margaret_wangrui, mengyuanli, Ming_blue, ms_yan, ougongchang, panfengfeng, panyifeng, Payne, Peilin, peixu_ren, Pengyongrong, qianlong, qianjiahong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, , Wan, wandongdong, wangdongxu, wangmin, wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wudenggang, wukesong, wuweikang, wuxuejian, Xiao Tianci, Xiaoda, xiefangqi, xinyunfan, xuanyue, xuyongfei, yanghaitao, yanghaitao1, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, yuximiao, zengzitao, Zhang, zhanghuiyao, zhanghui_china, zhangxinfeng3, zhangyihui, zhangz0911gm, zhanke, zhanyuan, zhaodezan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, zhiqwang, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, Zichun, Ziyan, zjun, ZPaC, wangfengwfwf, zymaa, gerayking, shu-kun-zhang. +### Bug fixes -欢迎以任何形式对项目提供贡献! 
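+- Fixed a possible computation accuracy issue in Arithmetic-type CPU operators in dynamic-shape scenarios.
+- Fixed an incorrect write address during requantization in the Deconv int8 quantized operator.

 ## MindSpore Lite 1.8.0 Release Notes
@@ -4370,82 +174,6 @@ AGroupofProbiotocs, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bing

 - [STABLE] Post-training quantization supports PerLayer quantization, with a built-in CLE algorithm to improve accuracy.

-## MindSpore 1.7.0 Release Notes
-
-### Major Features and Improvements
-
-#### OS
-
-- [STABLE] Supports Python 3.8 (Linux/Windows/Mac).
-- [STABLE] Simplified installation, with detailed installation guides and automated installation scripts.
-- [STABLE] The Windows version supports multi-threaded operators.
-- [STABLE] Compatible with GCC 7.3 through 9.x.
-
-#### FrontEnd
-
-- [STABLE] Optimizers support dynamic weight decay, that is, the weight decay value changes as the step count increases during training.
-- [STABLE] Added four methods for creating Tensors: `mindspore.numpy.rand()`, `mindspore.numpy.randn()`, `mindspore.numpy.randint()`, and `mindspore.ops.arange()`.
-- [STABLE] Added a callback method, `mindspore.train.callback.History`.
-- [BETA] Custom operators support Julia operators.
-- [STABLE] The `mindspore.ms_class` class decorator supports getting the attributes and methods of user-defined classes.
-- [STABLE] Supports training networks that contain both side-effect operators and control-flow statements.
-- [STABLE] Supports more complex control-flow syntax, such as a for statement inside the body of a while loop.
-- [STABLE] Improves the performance of networks with complex control-flow syntax by reducing the number of subgraphs.
-
-#### PyNative
-
-- [STABLE] Supports hook functions in PyNative mode, including the forward hook interfaces register_forward_pre_hook and register_forward_hook and the backward hook interface register_backward_hook.
-- [STABLE] Optimized PyNative mode execution performance by running the front-end Python and back-end C++ in parallel.
-
-#### Auto Parallel
-
-- [STABLE] Supports TopK routing, data parallelism, and optimizer sharding in MoE scenarios.
-- [STABLE] Supports AllGather/ReduceScatter communication operator fusion, and supports AllReduce compilation by data volume in DATA_PARALLEL mode.
-- [STABLE] Supports ops.clip_by_global_norm in parallel mode.
-- [STABLE] Supports the AdaSum optimizer in parallel mode.
-- [STABLE] Supports automatic optimizer sharding.
-- [STABLE] Supports configurable enabling of AlltoAll and automatic insertion of VirtualDatasetCell.
-- [STABLE] Supports automatic inference of trainable parameters in pipeline-parallel training.
-- [STABLE] Supports clusters whose device count is not a power of 2.
-- [STABLE] Supports strategy propagation in auto-parallel mode.
-- [STABLE] Supports heterogeneous training in the unified runtime.
-- [STABLE] Supports the Adafactor operator on CPU.
-- [STABLE] Supports H/W-axis sharding for Conv2d/Conv2D and the Transpose operator. Supports distributed operators such as ResizeBilinear, ROIAlign, CropAndResize, BoundingBoxEncode, IOU, and RandomChoiceWithMask.
-
-#### Executor
-
-- [BETA] [Disaster recovery for data-parallel training](https://www.mindspore.cn/tutorials/experts/zh-CN/r1.7/parallel/train_gpu.html#%E5%AE%B9%E7%81%BE%E6%81%A2%E5%A4%8D) supports disaster recovery for multi-device data-parallel training.
-- [BETA] Supports searching for the thread count on CPU to find the optimal number of threads for execution. The whole search takes 50 steps, and overall performance stabilizes after those 50 steps. When measuring performance, use the data collected after the first 50 steps as the baseline.
-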
-#### DataSet
-
-- [STABLE] Added documentation on data-processing API differences, comparing some TensorFlow.data and MindSpore.dataset operators; see the [comparison document](https://www.mindspore.cn/docs/zh-CN/r1.7/note/api_mapping/tensorflow_api_mapping.html#tf-data).
-- [STABLE] Optimized the Python multiprocessing logic to ensure a clean exit in different exception scenarios.
-- [STABLE] Supports [Dataset Autotune](https://www.mindspore.cn/tutorials/experts/zh-CN/r1.7/debug/dataset_autotune.html), which adaptively adjusts the execution speed of the data-processing pipeline.
-- [BETA] [Dataset Offload](https://www.mindspore.cn/docs/zh-CN/r1.7/design/dataset_offload.html) supports new data augmentation operations: RandomColorAdjust, RandomSharpness, and TypeCast.
-- When GeneratorDataset loads a custom dataset and the `__getitem__/__next__` method returns a single NumPy object, a single data column is output accordingly.
-- If too many processes/threads are used in data preprocessing, the error RuntimeError: can't start new thread may occur; it can be resolved by raising the number of threads/processes available to the current user with `ulimit -u 10240`.
-
-### API Change
-
-#### Backwards Incompatible Change
-
-##### Python API
-
-- Changed the gradient return type of the hook used by register_backward_hook; gradient return values are now uniformly of type tuple. ([!31876](https://gitee.com/mindspore/mindspore/pulls/31876))
-- Deprecated import usage: `import mindspore.dataset.engine.datasets as ds`, because its import path is too deep and depends too heavily on the Python directory structure. Use `import mindspore.dataset as ds` instead; see the [API documentation](https://www.mindspore.cn/docs/zh-CN/r1.7/api_python/mindspore.dataset.html) for more.
-- Added the `mindspore.ms_class` interface as a class decorator for user-defined classes, so that MindSpore can recognize user-defined classes and get their attributes and methods. ([!30855](https://gitee.com/mindspore/mindspore/pulls/30855))
-- The `mindspore.SparseTensor` interface is deprecated; the corresponding new interface is `mindspore.COOTensor`. ([!28505](https://gitee.com/mindspore/mindspore/pulls/28505))
-- Tensor adds a new argument, `internal`, for internal framework use.
-
-### Contributors
-
-Thanks to the following contributors:
-
-AGroupofProbiotocs, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, dong-li001, fary86, fuzhiye, Gaoxiong, GAO_HYP_XYJ, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, heleiwang, hesham, hexia, Hoai, HuangBingjian, huangdongrun, huanghui,
huangxinjing, huqi, huzhifeng, hwjiaorui, Jiabin Liu, jianghui58, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, liuyongqi, laiyongqiang, leonwanghui, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, Lin Xh, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, Lixia, lixian, liyanliu, liyong, lizhenyu, luopengting, lvchangquan, lvliang, lz, maning202007, Margaret_wangrui, mengyuanli, Ming_blue, ms_yan, ougongchang, panfengfeng, panyifeng, Payne, Peilin, peixu_ren, Pengyongrong, qianlong, qianjiahong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, , Wan, wandongdong, wangdongxu, wangmin, wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wudenggang, wukesong, wuweikang, wuxuejian, Xiao Tianci, Xiaoda, xiefangqi, xinyunfan, xuanyue, xuyongfei, yanghaitao, yanghaitao1, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, yuximiao, zengzitao, Zhang, zhanghuiyao, zhanghui_china, zhangxinfeng3, zhangyihui, zhangz0911gm, zhanke, zhanyuan, zhaodezan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, zhiqwang, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, Zichun, Ziyan, zjun, ZPaC, wangfengwfwf, zymaa, gerayking.
-
-Contributions to the project in any form are welcome!
-
 ## MindSpore Lite 1.7.0 Release Notes

 ### Major Features and Improvements
diff --git a/SECURITY.md b/SECURITY.md
index 2d653dbba68bf0629613c23496e47ffe5ba90f3c..ab77ce748f6a0fbb5b3eafa62935f5a62ec28d2b 100644
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -1,20 +1,3 @@
-# Security for MindSpore training
-
-## Security Risk Description
-
-1. 
When MindSpore is used for AI model training, if the user-defined computational graph structure (for example, Python code for generating the MindSpore computational graph) is provided by an untrusted third party, malicious code may exist and will be loaded and executed to attack the system.
-2. Model files are stored in binary mode. When MindSpore is used to optimize or infer AI models and the model files are loaded in deserialization mode, once malicious code is written into the model files, the code is loaded and executed, causing attacks on the system.
-3. MindSpore performs only model training and inference based on the data provided by users. Users need to protect data security to avoid privacy leakage.
-4. MindSpore is a distributed training platform. When MindSpore is used for distributed training, if an Ascend chip is used for training, a device provides a secure transmission protocol for gradient fusion. If GPUs or other clusters are used for training, identity authentication and secure transmission are not provided.
-
-## Security Usage Suggestions
-
-1. Run MindSpore in the sandbox.
-2. Run MindSpore as a non-root user.
-3. Ensure that the source of a computational graph structure is trustworthy. Do not write code irrelevant to model training in the network structure definition.
-4. Ensure that the source of a network model is trustworthy or enter secure network model parameters to prevent model parameters from being tampered with.
-5. Ensure that GPU distributed training is performed on an isolated cluster network.
-
 # Security for MindSpore Lite

 ## Security Risk Description
diff --git a/docs/MindSpore-Lite-architecture.png b/docs/MindSpore-Lite-architecture.png
index abf28796690f5649f8bc92382dfd4c2c83187620..78013a7c85c59e288bd0f7dc3766e9e57716cad3 100644
Binary files a/docs/MindSpore-Lite-architecture.png and b/docs/MindSpore-Lite-architecture.png differ
diff --git a/docs/README.md b/docs/README.md
index eef57183c40a94f7ae7d630f18dce1c0882b0dbc..0b380749c0514939746e692894636610fb7f804e 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -1,3 +1,3 @@
-# MindSpore Documentation
+# MindSpore Lite Documentation

-The MindSpore documentation is in the [MindSpore Docs](https://gitee.com/mindspore/docs) repository.
+The MindSpore Lite documentation is in the [MindSpore Lite Docs](https://gitee.com/mindspore/docs) repository.