1 Star 0 Fork 0

Hugging Face 模型镜像 / falcon-7b-sft-top1-696

Create your Gitee Account
Explore and code with more than 12 million developers,Free private repositories !:)
Sign up
This repository doesn't specify license. Please pay attention to the specific project description and its upstream code dependency when using it.
Clone or Download
contribute
Sync branch
Cancel
Notice: Creating folder will generate an empty file .keep, because not support in Git
Loading...
README
license language tags pipeline_tag widget datasets library_name
apache-2.0
en
de
es
fr
sft
text-generation
text
What is a meme, and what's the history behind this word?
text
What's the Earth total population
text
Write a story about future of AI development
OpenAssistant/oasst1
transformers

Open-Assistant Falcon 7B SFT OASST-TOP1 Model

This model is a fine-tuning of TII's Falcon 7B LLM. It was trained with 11,123 top-1 (high-quality) demonstrations of the OASST data set (exported on June 2, 2023) with a batch size of 128 for 8 epochs with LIMA style dropout (p=0.2) and a context-length of 2048 tokens.

Model Details

Prompting

Two special tokens are used to mark the beginning of user and assistant turns: <|prompter|> and <|assistant|>. Each turn ends with a <|endoftext|> token.

Input prompt example:

<|prompter|>What is a meme, and what's the history behind this word?<|endoftext|><|assistant|>

The input ends with the <|assistant|> token to signal that the model should start generating the assistant reply.

Sample Code

from transformers import AutoTokenizer
import transformers
import torch

model = "OpenAssistant/falcon-7b-sft-top1-696"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)

input_text="<|prompter|>What is a meme, and what's the history behind this word?<|endoftext|><|assistant|>"

sequences = pipeline(
    input_text,
    max_length=500,
    do_sample=True,
    return_full_text=False,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")

Configuration Details

Model:

falcon-7b:
  dtype: bf16
  log_dir: "falcon_log_7b"
  learning_rate: 1e-5
  model_name: "tiiuae/falcon-7b"
  deepspeed_config: configs/zero_config.json
  output_dir: falcon
  weight_decay: 0.0
  max_length: 2048
  save_strategy: steps
  eval_steps: 80
  save_steps: 80
  warmup_steps: 20
  gradient_checkpointing: true
  gradient_accumulation_steps: 4
  per_device_train_batch_size: 4
  per_device_eval_batch_size: 8
  num_train_epochs: 8
  save_total_limit: 4
  residual_dropout: 0.2
  residual_dropout_lima: true

Dataset:

oasst-top1:
  # oasst_export: 11123 (100.00%)
  datasets:
    - oasst_export:
        lang: "bg,ca,cs,da,de,en,es,fr,hr,hu,it,nl,pl,pt,ro,ru,sl,sr,sv,uk" # sft-8.0
        input_file_path: 2023-06-02_oasst_all_labels.jsonl.gz
        val_split: 0.05
        top_k: 1

Train command:

deepspeed trainer_sft.py --configs defaults falcon-7b oasst-top1 --cache_dir <data_cache_dir> --output_dir <output_path> --deepspeed

Export command:

python export_model.py --dtype bf16 --hf_repo_name OpenAssistant/falcon-7b-sft-top1 --trust_remote_code --auth_token <auth_token> <output_path> --max_shard_size 2GB

Empty file

About

falcon-7b-sft-top1-696是一个用于生成式文本任务的预训练模型,适用于对话系统、情感分析和文本生成等任务。 expand collapse
Cancel

Releases

No release

Contributors

All

Activities

Load More
can not load any more
1
https://gitee.com/hf-models/falcon-7b-sft-top1-696.git
git@gitee.com:hf-models/falcon-7b-sft-top1-696.git
hf-models
falcon-7b-sft-top1-696
falcon-7b-sft-top1-696
main

Search

53164aa7 5694891 3bd8fe86 5694891