
占书生/ms-swift

train_emb.sh 972 Bytes
Jintao committed 3 months ago: fix gc_kwargs (#4591)
nproc_per_node=2
# Hardware: 2 GPUs, about 12 GB of memory each (2*12G)
# Loss functions: see plugin/loss.py
# Dataset format: see docs/source_en/Customization/Custom-dataset.md
# --use_chat_template must be false to use the generation template
# --dataloader_drop_last must be true; otherwise the gather step during evaluation raises an error
# --model also supports iic/gte-modernbert-base and iic/gte_Qwen2-7B-instruct
CUDA_VISIBLE_DEVICES=0,1 \
NPROC_PER_NODE=$nproc_per_node \
swift sft \
    --model Qwen/Qwen3-Embedding-0.6B \
    --task_type embedding \
    --model_type qwen3_emb \
    --train_type full \
    --dataset sentence-transformers/stsb:positive \
    --split_dataset_ratio 0.05 \
    --eval_strategy steps \
    --output_dir output \
    --save_steps 50 \
    --eval_steps 50 \
    --num_train_epochs 5 \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --learning_rate 6e-6 \
    --loss_type infonce \
    --label_names labels \
    --dataloader_drop_last true \
    --deepspeed zero2
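
The settings above interact: the number of IDs in CUDA_VISIBLE_DEVICES should match nproc_per_node, and the number of samples contributing to each optimizer step is the product of the process count, the per-device batch size, and the gradient accumulation steps. The following is a minimal sketch of that arithmetic, assuming plain data parallelism with one process per GPU. It is a standalone illustration and not part of the ms-swift tooling. The variable names num_devices and global_batch_size are hypothetical, and the values are copied by hand from the script.

```shell
#!/bin/sh
# Values copied by hand from train_emb.sh; adjust them if the script changes.
CUDA_VISIBLE_DEVICES=0,1
nproc_per_node=2
per_device_train_batch_size=4
gradient_accumulation_steps=4

# Count the comma-separated device IDs (POSIX sh: split on ',' into "$@").
old_ifs=$IFS
IFS=','
set -- $CUDA_VISIBLE_DEVICES
IFS=$old_ifs
num_devices=$#

# Warn when the process count and the visible GPU count disagree.
if [ "$num_devices" -ne "$nproc_per_node" ]; then
    echo "warning: $num_devices visible devices but nproc_per_node=$nproc_per_node" >&2
fi

# Samples per optimizer step, assuming one data-parallel process per GPU.
global_batch_size=$((nproc_per_node * per_device_train_batch_size * gradient_accumulation_steps))

echo "visible devices: $num_devices"
echo "samples per optimizer step: $global_batch_size"
```

With the values in this script, the sketch reports 2 visible devices and 32 samples per optimizer step. Changing nproc_per_node without updating CUDA_VISIBLE_DEVICES triggers the warning.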