🤗🦙 Welcome! This repository contains minimal recipes to get started quickly with Llama 3.1.
This repository is a work in progress, so you may see considerable changes in the coming days.
Would you like to run inference with the Llama 3.1 models locally? So do we! The memory required depends on the model size and the precision of the weights. Here's a table showing the approximate memory needed for different configurations:
| Model Size | FP16   | FP8    | INT4 (AWQ/GPTQ/bnb) |
| ---------- | ------ | ------ | ------------------- |
| 8B         | 16 GB  | 8 GB   | 4 GB                |
| 70B        | 140 GB | 70 GB  | 35 GB               |
| 405B       | 810 GB | 405 GB | 203 GB              |
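The table values follow from a simple rule of thumb: weight memory ≈ parameter count × bytes per parameter (2 bytes for FP16, 1 for FP8, 0.5 for INT4). A minimal sketch of that estimate, assuming 1 GB = 10^9 bytes (the function names here are illustrative, not part of this repo):

```python
# Rough estimate of the memory needed to hold model weights only.
# It excludes activations, the KV cache, and any optimizer state,
# so real usage at inference time will be somewhat higher.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "int4": 0.5}

def weight_memory_gb(num_params_billions: float, precision: str) -> float:
    """Approximate GB needed for the weights at a given precision."""
    return num_params_billions * BYTES_PER_PARAM[precision]

for size in (8, 70, 405):
    row = {p: round(weight_memory_gb(size, p), 1) for p in BYTES_PER_PARAM}
    print(f"{size}B -> {row}")
```

Note that 405B at INT4 comes out to 202.5 GB, which the table rounds to 203 GB.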
Note: These are estimated values and may vary based on specific implementation details and optimizations.
Here are some notebooks to help you get started:
Are these models too large for you to run at home? Would you like to experiment with Llama 405B? Try out the following examples!
In addition to the generative models, Meta released two new models: Llama Guard 3 and Prompt Guard. Prompt Guard is a small classifier that detects prompt injections and jailbreaks. Llama Guard 3 is a safeguard model that can classify LLM inputs and generations. Learn how to use them in the following notebooks:
- Fine-tuning with trl and QLoRA
- Synthetic data generation with distilabel