From 3ab2920a8f22df8442ddf7439d9d17f32f3c352d Mon Sep 17 00:00:00 2001 From: zyw_hw Date: Wed, 24 Sep 2025 19:41:24 +0800 Subject: [PATCH] fix api parameters statement --- .../docs/source_en/feature/ckpt.md | 28 +++++++++++++++---- .../docs/source_en/guide/deployment.md | 17 +++++++++++ .../docs/source_en/guide/evaluation.md | 1 + .../docs/source_zh_cn/feature/ckpt.md | 28 +++++++++++++++---- .../docs/source_zh_cn/guide/deployment.md | 19 ++++++++++++- .../docs/source_zh_cn/guide/evaluation.md | 1 + 6 files changed, 83 insertions(+), 11 deletions(-) diff --git a/docs/mindformers/docs/source_en/feature/ckpt.md b/docs/mindformers/docs/source_en/feature/ckpt.md index fcc182e618..18c0a5e3ae 100644 --- a/docs/mindformers/docs/source_en/feature/ckpt.md +++ b/docs/mindformers/docs/source_en/feature/ckpt.md @@ -22,7 +22,7 @@ MindSpore Transformers provides a unified weight conversion tool that allows mod To perform weight conversion, clone the complete HuggingFace repository of the model to be converted locally, and execute the `mindformers/convert_weight.py` script. This script automatically converts the HuggingFace model weight file into a weight file applicable to MindSpore Transformers. If you want to convert a MindSpore Transformers weight to a HuggingFace one, set `reversed` to `True`. ```shell -python convert_weight.py [-h] --model MODEL [--reversed] --input_path INPUT_PATH --output_path OUTPUT_PATH [--dtype DTYPE] [--n_head N_HEAD] [--hidden_size HIDDEN_SIZE] [--layers LAYERS] [--is_pretrain IS_PRETRAIN] [--telechat_type TELECHAT_TYPE] +python convert_weight.py [-h] --model MODEL [--reversed] --input_path INPUT_PATH --output_path OUTPUT_PATH [--dtype DTYPE] [--telechat_type TELECHAT_TYPE] ``` #### Parameters @@ -32,10 +32,6 @@ python convert_weight.py [-h] --model MODEL [--reversed] --input_path INPUT_PATH - input_path: path of the HuggingFace weight folder, which points to the downloaded weight file. - output_path: path for storing the MindSpore Transformers weight file after conversion. - dtype: weight data type after conversion. -- n_head: takes effect only for the BLOOM model. Set this parameter to `16` when `bloom_560m` is used and to `32` when `bloom_7.1b` is used. -- hidden_size: takes effect only for the BLOOM model. Set this parameter to `1024` when `bloom_560m` is used and to `4096` when `bloom_7.1b` is used. -- layers: number of layers to be converted. This parameter takes effect only for the GPT2 and WizardCoder models. -- is_pretrain: converts the pre-trained weight. This parameter takes effect only for the Swin model. - telechat_type: version of the TeleChat model. This parameter takes effect only for the TeleChat model. ### Conversion Example @@ -395,6 +391,26 @@ If there is no shared disk between servers, you need to use the offline weight c 16 2 ``` +**Parameters** + +- Parameters for transform_checkpoint.py conversion + + | Parameter | Description | + |-----------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| + | src_checkpoint | Absolute path or folder path of the source weight.
- For **a complete set of weights**, set this parameter to an **absolute path**.
- For **distributed weights**, set this parameter to the **folder path**. The distributed weights must be stored in the `model_dir/rank_x/xxx.ckpt` format; in this case, the folder path to provide is `model_dir`.<br>
**If there are multiple CKPT files in the `rank_x` folder, the CKPT file that sorts last by file name is used for conversion by default.** |
  | src_strategy          | Path of the distributed strategy file corresponding to the source weight.<br>
- For a complete set of weights, leave it **blank**.
- For distributed weights, if pipeline parallelism is used, set this parameter to the **merged strategy file path** or **distributed strategy folder path**.
- For distributed weights, if pipeline parallelism is not used, set this parameter to any **ckpt_strategy_rank_x.ckpt** path. | + | dst_checkpoint_dir | Path of the folder that stores the target weight. | + | dst_strategy | Path of the distributed strategy file corresponding to the target weight.
- For a complete set of weights, leave it **blank**.
- For distributed weights, if pipeline parallelism is used, set this parameter to the **merged strategy file path** or **distributed strategy folder path**.
- For distributed weights, if pipeline parallelism is not used, set this parameter to any **ckpt_strategy_rank_x.ckpt** path. |
  | prefix                | Prefix name of the saved target weight. The weight is saved as {prefix}rank_x.ckpt. The default value is checkpoint_. |
  | rank_id               | Rank ID of the current conversion process. Not required for single-process conversion. |
  | world_size            | Total number of slices of the target weight. Generally, the value is dp \* mp \* pp. Not required for single-process conversion. |
  | transform_process_num | Number of processes used for offline weight conversion. The default value is 1.<br>
- If process_num is set to 1, **a single process is used for conversion**.
- If process_num is larger than 1, **multi-process conversion** is used. For example, if the target weight for conversion is the distributed weight of eight GPUs and process_num is set to 2, two processes are started to convert the weights of slices rank_0, rank_1, rank_2, and rank_3 and slices rank_4, rank_5, rank_6, and rank_7, respectively. | + | transform_by_rank | Whether the mindspore.transform_checkpoint_by_rank is used for checkpoint transform. It will automatically be set to True when transform_process_num > 1. | + +- Additional parameters used for transform_checkpoint.sh conversion + + For parameter descriptions, please refer to the parameters used in the transformation of transform_checkpoint.py. The order of parameters is src_checkpoint, src_strategy, dst_checkpoint_dir, dst_strategy, world_size, transform_process_num, prefix. + - **Copy the weights to other nodes.** Copy the distributed weights that have been converted to respective nodes. Node 0 requires only the weights of slices from `rank_0` to `rank_7`, and node 1 requires only the weights of slices from `rank_8` to `rank_15`. @@ -446,9 +462,11 @@ python mindformers/tools/transform_ckpt_lora.py \ **Note**: If a `merged_ckpt_strategy.ckpt` already exists in the strategy folder and is still transferred to the folder path, the script deletes the old `merged_ckpt_strategy.ckpt` and then merges files into a new `merged_ckpt_strategy.ckpt` for weight conversion. Therefore, ensure that the folder has enough write permission. Otherwise, an error will be reported. - **src_ckpt_path_or_dir**: specifies the path of the source weight. For distributed weights, set the parameter to the path of the folder where the source weights are located. The source weights must be stored in the `model_dir/rank_x/xxx.ckpt` format, and the folder path must be set to `model_dir`. If the source is a complete set of weights, set the parameter to an absolute path. +- **dst_ckpt_strategy**: The distributed policy file path corresponding to the target weight. - **dst_ckpt_dir**: specifies the path for storing the target weight, which must be a user-defined path of an empty folder. The target weight is saved in the `model_dir/rank_x/xxx.ckpt` format. - **prefix**: name prefix of the target weight file. The default value is "checkpoint_", indicating that the target weight is saved in the `model_dir/rank_x/checkpoint_x.ckpt` format. - **lora_scaling**: combination coefficient of the LoRA weight. The default value is `lora_alpha/lora_rank`. The two parameters are used for LoRA model configuration and need to be calculated. +- **save_format**: The format for saving target weights. The default value is `ckpt`. #### Examples diff --git a/docs/mindformers/docs/source_en/guide/deployment.md b/docs/mindformers/docs/source_en/guide/deployment.md index 06fe794d99..dfd9a2ba1a 100644 --- a/docs/mindformers/docs/source_en/guide/deployment.md +++ b/docs/mindformers/docs/source_en/guide/deployment.md @@ -189,6 +189,23 @@ bash run_mindie.sh --model-name xxx --model-path /path/to/model --model-name: Mandatory, set MindIE backend name --model-path: Mandatory, set model folder path, such as /path/to/mf_model/qwen1_5_72b --help : Instructions for using the script +--max-seq-len: Maximum sequence length. Default value: 2560. +--max-iter-times: Global maximum output length of the model. Default value: 512. +--max-input-token-len: Maximum length of input token IDs. Default value: 2048. +--truncation: Whether to perform parameter rationality check and interception. 
false: check, true: no check. Default value: false. +--world-size: Number of cards used for inference. In multi-node inference scenarios, this value is invalid, and worldSize is calculated based on the ranktable. Default value: 4. +--template-type: Inference type. Standard: PD mixed deployment scenario, Prefill requests and Decode requests are batched separately. Mix: Splitfuse feature-related parameter, Prefill requests and Decode requests can be batched together. This field configuration does not take effect in PD separation scenarios. Default value: "Standard". +--max-preempt-count: The upper limit of the maximum preemptible requests per batch, i.e., limits the number of requests that can be preempted in one round of scheduling. The maximum limit is maxBatchSize. A value greater than 0 indicates that the preemptible function is enabled. Default value: 0. +--support-select-batch: Batch selection strategy. This field does not take effect in PD separation scenarios. false: indicates that during each round of scheduling, Prefill stage requests are prioritized for scheduling and execution. true: indicates that during each round of scheduling, the scheduling and execution order of Prefill and Decode stage requests is adaptively adjusted based on the current number of Prefill and Decode requests. Default value: false. +--npu-mem-size: The upper limit of the size that can be used to apply for KV Cache in a single NPU. Default value: -1. +--max-prefill-batch-size: Maximum prefill batch size. Default value: 50. +--ip: IP address bound to the business RESTful interface provided by EndPoint. Default value: "127.0.0.1". +--port: Port number bound to the business RESTful interface provided by EndPoint. Default value: 1025. +--management-ip: IP address bound to the management RESTful interface provided by EndPoint. Default value: "127.0.0.2". +--management-port: Port number bound to the management interface (see Table 1 for management interface) provided by EndPoint. Default value: 1026. +--metrics-port: Port number of the service management metrics interface (Prometheus format). Default value: 1027. +--ms-sched-host: Scheduler node IP address. Default value: 127.0.0.1. +--ms-sched-port: Scheduler node service port. Default value: 8090. ``` View logs: diff --git a/docs/mindformers/docs/source_en/guide/evaluation.md b/docs/mindformers/docs/source_en/guide/evaluation.md index 6f5992a069..76ec631926 100644 --- a/docs/mindformers/docs/source_en/guide/evaluation.md +++ b/docs/mindformers/docs/source_en/guide/evaluation.md @@ -357,6 +357,7 @@ The following table lists the parameters of the script of `run_harness.sh`: | `--model_args` | str | Model and evaluation parameters. For details, see MindSpore Transformers model parameters. | Yes | | `--tasks` | str | Dataset name. Multiple datasets can be specified and separated by commas (,). | Yes | | `--batch_size` | int | Number of batch processing samples. | No | +| `--help` | | Display help information and exit. 
| No | The following table lists the parameters of `model_args`: diff --git a/docs/mindformers/docs/source_zh_cn/feature/ckpt.md b/docs/mindformers/docs/source_zh_cn/feature/ckpt.md index 1af66348e8..4eb92474c7 100644 --- a/docs/mindformers/docs/source_zh_cn/feature/ckpt.md +++ b/docs/mindformers/docs/source_zh_cn/feature/ckpt.md @@ -22,7 +22,7 @@ MindSpore Transformers提供了统一的权重转换工具,能够将模型权 要进行权重转换,首先请将待转换模型的HuggingFace仓库完整克隆到本地,然后执行`mindformers/convert_weight.py`脚本。该脚本能够自动将HuggingFace的模型权重文件转换为适用于MindSpore Transformers的权重文件。如若希望将MindSpore Transformers权重转为HuggingFace权重,请将`reversed`设置为`True`。 ```shell -python convert_weight.py [-h] --model MODEL [--reversed] --input_path INPUT_PATH --output_path OUTPUT_PATH [--dtype DTYPE] [--n_head N_HEAD] [--hidden_size HIDDEN_SIZE] [--layers LAYERS] [--is_pretrain IS_PRETRAIN] [--telechat_type TELECHAT_TYPE] +python convert_weight.py [-h] --model MODEL [--reversed] --input_path INPUT_PATH --output_path OUTPUT_PATH [--dtype DTYPE] [--telechat_type TELECHAT_TYPE] ``` #### 参数说明 @@ -32,10 +32,6 @@ python convert_weight.py [-h] --model MODEL [--reversed] --input_path INPUT_PATH - input_path:HuggingFace权重文件夹的路径,指向已下载的权重文件。 - output_path:转换后MindSpore Transformers权重文件的保存路径。 - dtype:转换后的权重数据类型。 -- n_head:只对BLOOM模型生效,使用`bloom_560m`时请设为`16`,使用`bloom_7.1b`时请设为`32`。 -- hidden_size:只对BLOOM模型生效,使用`bloom_560m`时请设为`1024`,使用`bloom_7.1b`时请设为`4096`。 -- layers:只对GPT2和WizardCoder模型生效,指定模型被转换的层数。 -- is_pretrain:只对Swin模型生效,转换预训练权重。 - telechat_type:只对TeleChat模型生效,TeleChat模型的版本。 ### 转换示例 @@ -395,6 +391,26 @@ bash transform_checkpoint.sh \ 16 2 ``` +**参数说明** + +- transform_checkpoint.py转换使用参数 + + | 参数名称 | 说明 | + |-----------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| + | src_checkpoint | 源权重的绝对路径或文件夹路径。
- 如果是**完整权重**,则填写**绝对路径**;
- 如果是**分布式权重**,则填写**文件夹路径**,分布式权重须按照`model_dir/rank_x/xxx.ckpt`格式存放,文件夹路径填写为`model_dir`。
**如果rank_x文件夹下存在多个ckpt,将默认使用按文件名排序最后的ckpt文件进行转换。** |
  | src_strategy          | 源权重对应的分布式策略文件路径。<br>
- 如果是完整权重,则**不填写**;
- 如果是分布式权重,且使用了流水线并行,则填写**合并的策略文件路径**或**分布式策略文件夹路径**;
- 如果是分布式权重,且未使用流水线并行,则填写任一**ckpt_strategy_rank_x.ckpt**路径; | + | dst_checkpoint_dir | 保存目标权重的文件夹路径。 | + | dst_strategy | 目标权重对应的分布式策略文件路径。
- 如果是完整权重,则**不填写**;
- 如果是分布式权重,且使用了流水线并行,则填写**合并的策略文件路径**或**分布式策略文件夹路径**;
- 如果是分布式权重,且未使用流水线并行,则填写任一**ckpt_strategy_rank_x.ckpt**路径; | + | prefix | 目标权重保存的前缀名,权重保存为”{prefix}rank_x.ckpt”,默认”checkpoint_”。 | + | rank_id | 当前转换进程的rank_id。单进程无需使用。 | + | world_size | 目标权重的切片总数,一般等于dp \* mp \* pp。单进程无需使用。 | + | transform_process_num | 离线权重转换使用的进程数,默认为1。
- 如果process_num = 1,使用**单进程转换**;
- 如果process_num > 1,使用**多进程转换**,比如转换的目标权重为8卡分布式权重,process_num=2时,会启动两个进程分别负责rank_0/1/2/3和rank_4/5/6/7切片权重的转换; | + | transform_by_rank | 转换时是否启动mindspore.transform_checkpoint_by_rank。当transform_process_num>1时,它将自动设置为True。 | + +- transform_checkpoint.sh转换使用参数 + + 参数说明参考transform_checkpoint.py转换使用参数。参数顺序为src_checkpoint、src_strategy、dst_checkpoint_dir、dst_strategy、world_size、transform_process_num、prefix。 + - **复制权重到其他节点** 将转换得到的分布式权重分别复制到各自节点。0节点只需要 `rank_0` 到 `rank_7` 的切片权重,1节点只需要 `rank_8` 到 `rank_15` 的切片权重。 @@ -446,9 +462,11 @@ python mindformers/tools/transform_ckpt_lora.py \ **注意**:如果策略文件夹下已存在 `merged_ckpt_strategy.ckpt` 且仍传入文件夹路径,脚本会首先删除旧的 `merged_ckpt_strategy.ckpt`,再合并生成新的 `merged_ckpt_strategy.ckpt` 以用于权重转换。因此,请确保该文件夹具有足够的写入权限,否则操作将报错。 - **src_ckpt_path_or_dir**:源权重的路径。如果为分布式权重,请填写源权重所在文件夹的路径,源权重应按 `model_dir/rank_x/xxx.ckpt` 格式存放,并将文件夹路径填写为 `model_dir`。若源权重为完整权重,则填写完整权重的绝对路径。 +- **dst_ckpt_strategy**:目标权重对应的分布式策略文件路径。 - **dst_ckpt_dir**:目标权重的保存路径,需为自定义的空文件夹路径。目标权重将按 `model_dir/rank_x/xxx.ckpt` 格式保存。 - **prefix**:目标权重文件的命名前缀,默认值为 "checkpoint_",即目标权重将按照 `model_dir/rank_x/checkpoint_x.ckpt` 格式保存。 - **lora_scaling**:LoRA 权重的合并系数,默认为 `lora_alpha/lora_rank`,这两个参数即为 LoRA 模型配置时的参数,需自行计算。 +- **save_format**:目标权重的保存格式。默认为 `ckpt`。 #### 示例 diff --git a/docs/mindformers/docs/source_zh_cn/guide/deployment.md b/docs/mindformers/docs/source_zh_cn/guide/deployment.md index 43a7f61a70..aa4c5c5582 100644 --- a/docs/mindformers/docs/source_zh_cn/guide/deployment.md +++ b/docs/mindformers/docs/source_zh_cn/guide/deployment.md @@ -186,8 +186,25 @@ bash run_mindie.sh --model-name xxx --model-path /path/to/model # 参数说明 --model-name: 必传,设置MindIE后端名称 ---model-path:必传,设置模型文件夹路径,如/path/to/mf_model/qwen1_5_72b +--model-path: 必传,设置模型文件夹路径,如/path/to/mf_model/qwen1_5_72b --help : 脚本使用说明 +--max-seq-len: 最大序列长度。默认值:2560。 +--max-iter-times: 模型全局最大输出长度。默认值:512。 +--max-input-token-len: 输入token id最大长度。默认值:2048。 +--truncation: 是否进行参数合理化校验拦截。false:校验,true:不校验。默认值:false。 +--world-size: 启用几张卡推理。多机推理场景下该值无效,worldSize根据ranktable计算获得。默认值:4。 +--template-type: 推理类型。Standard:PD混部场景,Prefill请求和Decode请求各自组batch。Mix:Splitfuse特性相关参数,Prefill请求和Decode请求可以一起组batch。PD分离场景下该字段配置不生效。默认值:"Standard"。 +--max-preempt-count: 每一批次最大可抢占请求的上限,即限制一轮调度最多抢占请求的数量,最大上限为maxBatchSize,取值大于0则表示开启可抢占功能。默认值:0。 +--support-select-batch: batch选择策略。PD分离场景下该字段不生效。false:表示每一轮调度时,优先调度和执行Prefill阶段的请求。true:表示每一轮调度时,根据当前Prefill与Decode请求的数量,自适应调整Prefill和Decode阶段请求调度和执行的先后顺序。默认值:false。 +--npu-mem-size: 单个NPU中可以用来申请KV Cache的size上限。默认值:-1。 +--max-prefill-batch-size: 最大prefill batch size。默认值:50。 +--ip: EndPoint提供的业务面RESTful接口绑定的IP地址。默认值:"127.0.0.1"。 +--port: EndPoint提供的业务面RESTful接口绑定的端口号。默认值:1025。 +--management-ip: EndPoint提供的管理面RESTful接口绑定的IP地址。默认值:"127.0.0.2"。 +--management-port: EndPoint提供的管理面(管理面接口请参见表1)接口绑定的端口号。默认值:1026。 +--metrics-port: 服务管控指标接口(普罗格式)端口号。默认值:1027。 +--ms-sched-host: scheduler节点ip地址。默认值:127.0.0.1。 +--ms-sched-port: scheduler节点服务端口。默认值:8090。 ``` 查看日志: diff --git a/docs/mindformers/docs/source_zh_cn/guide/evaluation.md b/docs/mindformers/docs/source_zh_cn/guide/evaluation.md index a4af37d770..4e10825c22 100644 --- a/docs/mindformers/docs/source_zh_cn/guide/evaluation.md +++ b/docs/mindformers/docs/source_zh_cn/guide/evaluation.md @@ -361,6 +361,7 @@ run_harness.sh脚本参数配置如下表: | `--model_args` | str | 模型及评估相关参数,见下方模型参数介绍 | 是 | | `--tasks` | str | 数据集名称。可传入多个数据集,使用逗号(,)分隔 | 是 | | `--batch_size` | int | 批处理样本数 | 否 | +| `--help` | | 显示帮助信息并退出 | 否 | 其中,model_args参数配置如下表: -- Gitee
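Below is an illustrative invocation assembled from the `transform_checkpoint.py` parameters documented above: converting a complete weight into an 8-slice distributed weight with two conversion processes. This is a minimal sketch; the script location, the paths, and the 8-slice layout are hypothetical placeholders, and the exact flag spellings should be checked against the script's `--help` output.

```shell
# Minimal sketch (paths and slice layout are hypothetical; confirm the flag
# names with the script's --help). Converts a complete weight into an 8-slice
# distributed weight using two conversion processes.
#
#   src_checkpoint        -> complete weight, so an absolute file path
#   src_strategy          -> omitted because the source is a complete weight
#   dst_strategy          -> merged strategy file describing the target layout
#   world_size            -> dp * mp * pp of the target layout (8 here)
#   transform_process_num -> 2 processes, each converting 4 of the 8 slices
python transform_checkpoint.py \
  --src_checkpoint /path/to/full_model.ckpt \
  --dst_checkpoint_dir /path/to/target_checkpoint_dir \
  --dst_strategy /path/to/merged_ckpt_strategy.ckpt \
  --prefix "checkpoint_" \
  --world_size 8 \
  --transform_process_num 2
```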