MindSpore
/
mindformers
qwen2-7b inference accuracy issue
DONE
#IAK8II
Question
liyongwen
Created on
2024-08-15 14:27
## Text generated by MF and HF inference cannot be aligned

### MF inference

```
# Prompt (Chinese): "Help me put together a travel guide for a trip to Shanghai"
# `tokenizer` and `model` are the MindFormers objects; their setup is not shown in the post.
input = "帮助我制定一份去上海的旅游攻略"
input_id = tokenizer(input, return_tensors="ms")
outputs = model.generate(**input_id, do_sample=False, max_length=8192, max_new_tokens=None, num_beams=1, temperature=1.0, top_k=1, top_p=0.8, repetition_penalty=1)
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(result)
```

### HF inference

```
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("qwen2-7b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("qwen2-7b", device_map="cuda:0", trust_remote_code=True).eval()
# Same prompt as in the MF run
inputs = tokenizer('帮助我制定一份去上海的旅游攻略', return_tensors='pt')
print(inputs)
inputs = inputs.to(model.device)
pred = model.generate(**inputs, do_sample=False, max_length=8192, max_new_tokens=None, num_beams=1, temperature=1.0, top_k=1, top_p=0.8, repetition_penalty=1)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
```

### Inference results are inconsistent

## Inference generates garbled text after disabling the use_past parameter

### Inference script

```
import importlib.util
import os
from pathlib import Path

from mindformers import AutoModel, build_context, MindFormerConfig


def load_class_from_file(module_path, class_name):
    """Load the class `class_name` from the Python source file at `module_path`."""
    module_name = os.path.splitext(os.path.basename(module_path))[0]
    spec = importlib.util.spec_from_file_location(module_name, module_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, class_name)


config = MindFormerConfig('qwen2-7b-mf/predict_qwen2_7b_instruct.yaml')
config.model.model_config.seq_length = config.processor.tokenizer.model_max_length
config.model.model_config.batch_size = 1
config.model.model_config.use_past = False  # the setting this section is about
build_context(config)
model = AutoModel.from_config(config)

# Build the tokenizer from the single *tokenizer*.py file shipped with the model directory.
tokenizer_kwargs = dict(config.processor.tokenizer)
tokenizer_type = tokenizer_kwargs.pop('type')
tokenizer_py_path = [str(file.resolve()) for file in Path('qwen2-7b-mf/').glob('*tokenizer*.py')]
assert len(tokenizer_py_path) == 1
tokenizer_class = load_class_from_file(tokenizer_py_path[0], tokenizer_type)
tokenizer = tokenizer_class(**tokenizer_kwargs)

# Few-shot prompt: five worked examples followed by the question to answer.
inputs = (
    "Question: Peggy is moving and is looking to get rid of her record collection. Sammy says that he will buy all of them for 4 dollars each. Bryan is only interested in half of the records but will offer 6 dollars each for the half that he is interested in and 1 dollar each for the remaining half that he is not interested in with the hopes that he can resell them in bulk later. If Peggy has 200 records, what is the difference in profit between Sammy versus Bryan's deal?\nAnswer: Sammy is offering to take the whole collection of 200 records and pay Peggy 4 dollars each for them which would net Peggy 200 * 4=<<200*4=800>>800 dollars for her entire record collection.\nBryan is willing to buy Peggy's entire record collection but at two different price points, half at one point and half at another. Half of Peggy's record collection is 200/2=<<200/2=100>>100, which means that 100 records will sell for one price and 100 records will sell for another price.\nBryan is willing to pay more for the half of the record collection that he is interested in so Peggy would net 100 * 6=<<100*6=600>>600 dollars for the first half of her record collection.\nFor the half of the collection that Bryan is just planning on reselling at a later date, he is willing to offer Peggy 100 *1=<<100*1=100>>100 dollars to take off of her hands.\nIn total Bryan is willing to offer Peggy 600+100=<<600+100=700>>700 dollars for her entire record collection.\nIf Sammy is offering 800 dollars to buy Peggy's entire record collection and Bryan is offering 700 dollars for Peggy's entire record collection, then Peggy's net profit would be 800-700=<<800-700=100>>100 dollars more by taking Sammy's deal instead of Bryan's deal.\n#### 100\n\n"
    "Question: Randy just turned 12 and started playing the piano. His friend Sheila told him about the 10,000-hour rule which says, after 10,000 hours of practice, you become an expert or master in your field. If Randy wants to become a piano expert before he is 20, how many hours a day will he need to practice if he practices every day, Monday – Friday, and takes two weeks off for vacation each year?\nAnswer: Randy has 20 – 12 = <<20-12=8>>8 years until he is 20.\nHe must practice 10,000 hours / 8 years = <<10000/8=1250>>1,250 hours a year to become an expert.\nThere are 52 weeks in a year – 2 weeks of vacation Randy plans to take = <<52-2=50>>50 weeks of practice for Randy.\nRandy will practice Monday – Friday, which is 5 days a week, so 50 weeks x 5 days = <<50*5=250>>250 days of practice each year.\nRandy will need to practice 1250 hours / 250 days = <<1250/250=5>>5 hours each day.\n#### 5\n\n"
    "Question: Ben will receive a bonus of $1496. He chooses to allocate this amount as follows: 1/22 for the kitchen, 1/4 for holidays and 1/8 for Christmas gifts for his 3 children. How much money will he still have left after these expenses?\nAnswer: Ben's spending for the kitchen is $1496 x 1/22 = $<<1496*1/22=68>>68.\nBen's spending for holidays is $1496 x 1/4 = $<<1496*1/4=374>>374.\nBen's spending for children's gifts is $1496 x 1/8 = $<<1496*1/8=187>>187.\nThe total amount spent is $68 + $374 + $187 = $<<68+374+187=629>>629.\nSo, Ben still has $1496 - $629 = $<<1496-629=867>>867 left.\n#### 867\n\n"
    "Question: Natalie bought some food for a party she is organizing. She bought two cheesecakes, an apple pie, and a six-pack of muffins. The six-pack of muffins was two times more expensive than the cheesecake, and one cheesecake was only 25% cheaper than the apple pie. If the apple pie cost $12, how much did Natalie pay for all her shopping?\nAnswer: One cheesecake was 30% cheaper than the apple pie, which means, it was 25/100 * 12 = $3 cheaper.\nSo for one cheesecake, Natalie needed to pay 12 - 3 = $<<12-3=9>>9.\nThe six-pack of muffins was two times more expensive than the cheesecake, which means its price was 9 * 2 = $<<9*2=18>>18.\nSo for two cheesecakes, Natalie paid also 9 * 2 = $<<9*2=18>>18.\nSo for all her shopping she paid 18 + 18 + 12 = $<<18+18+12=48>>48.\n#### 48\n\n"
    "Question: Andy is running late. School starts at 8:00 AM and it normally takes him 30 minutes to get there, but today he had to stop for 3 minutes each at 4 red lights and wait 10 minutes to get past construction. If he left his house at 7:15, how many minutes late will he be?\nAnswer: First find how many minute Andy had to get to school when he left his house: 8:00 AM - 7:15 AM = 45 minutes\nThen find the total time he spent waiting at red lights: 3 minutes/light * 4 lights = <<3*4=12>>12 minutes\nNow add the normal travel time, red light time, and construction wait time to find Andy's total travel time: 30 minutes + 12 minutes + 10 minutes = <<30+12+10=52>>52 minutes\nNow subtract the amount of time Andy had when he left his house from that number to find how many minute late he is: 52 minutes - 45 minutes = <<52-45=7>>7 minutes\n#### 7\n\n"
    "Question: A football team has 105 members. There are twice as many players on the offense as there is on the defense. There is half the number of players on the special teams as there is on the defense. How many players are on the defense?\nAnswer:"
)
inputs_ids = tokenizer(inputs, truncation=False, padding="longest", return_tensors="ms")["input_ids"]

# Debugger breakpoint placed just before generation
import pdb
pdb.set_trace()

outputs = model.generate(inputs_ids.tolist(), max_length=1868, do_sample=False, pad_token_id=tokenizer.pad_token_id, use_cache=True)
print(tokenizer.decode(outputs))
```

### Inference result
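The post does not include the generated outputs themselves. When checking whether two runs such as the MF and HF ones above really diverge, it can help to compare the generated token IDs rather than the decoded text, since decoding can hide or exaggerate small differences. The following is only a minimal sketch: `first_divergence` is a hypothetical helper, and the ID lists are made-up stand-ins, not values produced by either framework.

```python
from typing import Optional, Sequence


def first_divergence(a: Sequence[int], b: Sequence[int]) -> Optional[int]:
    """Return the index of the first position where two token-ID sequences differ.

    Returns None when the sequences are identical. If one sequence is a strict
    prefix of the other, the length of the shorter sequence is returned.
    """
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


# Hypothetical token IDs standing in for the MF and HF generation outputs.
mf_ids = [101, 7, 42, 13, 99]
hf_ids = [101, 7, 42, 15, 99, 3]

print(first_divergence(mf_ids, hf_ids))  # 3
```

If the returned index falls inside the prompt, the two tokenizers already disagree on the input; if it falls in the generated part, the difference comes from the model forward pass or the decoding settings.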