A simple local text generator for ComfyUI using ExLlamaV2.
Clone the repository to custom_nodes and install the requirements:
git clone https://github.com/Zuellni/ComfyUI-ExLlama-Nodes custom_nodes/ComfyUI-ExLlamaV2-Nodes
pip install -r custom_nodes/ComfyUI-ExLlamaV2-Nodes/requirements.txt
On Windows, install ExLlamaV2 and FlashAttention from prebuilt wheels:
pip install exllamav2-X.X.X+cuXXX.torch2.X.X-cp3XX-cp3XX-win_amd64.whl
pip install flash_attn-X.X.X+cuXXX.torch2.X.X-cp3XX-cp3XX-win_amd64.whl
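The X placeholders in the wheel names above stand for the package version, CUDA version, PyTorch version and Python version. As a rough aid for picking matching wheels, the sketch below prints the values for the current environment. The helper names `python_tag` and `describe_environment` are made up for this example and are not part of the nodes; the PyTorch lookup is skipped if torch is not installed.

```python
import sys


def python_tag(version_info=sys.version_info):
    """Build the CPython tag used in wheel names, e.g. cp311 for Python 3.11."""
    return f"cp{version_info[0]}{version_info[1]}"


def describe_environment():
    """Collect the values that the wheel filename placeholders refer to."""
    info = {"python_tag": python_tag(), "torch": None, "cuda": None}
    try:
        import torch  # optional: only needed to report torch and CUDA versions

        info["torch"] = torch.__version__
        info["cuda"] = torch.version.cuda  # None on CPU-only builds
    except ImportError:
        pass
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

The reported Python tag should match both cp3XX fields in the filename, and the reported torch and CUDA versions should match the torch2.X.X and cuXXX fields.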
Only EXL2, 4-bit GPTQ and unquantized models are supported. You can find them on Hugging Face.
To use a model with the nodes, clone its repository with git or manually download all the files and place them in models/llm.
For example, to download the 6-bit Llama-3-8B-Instruct, use the following commands:
git lfs install
git clone https://huggingface.co/turboderp/Llama-3-8B-Instruct-exl2 -b 6.0bpw models/llm/Llama-3-8B-Instruct-exl2-6.0bpw
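If Git LFS was not set up before cloning, the weight files may be small text pointers rather than the real data. The following is a minimal sketch of a sanity check for a downloaded model folder. It assumes a usable model directory contains a config.json and at least one .safetensors file, which holds for typical EXL2 repositories but is not something the nodes themselves document; `check_model_dir` is a hypothetical helper name.

```python
from pathlib import Path

# Git LFS pointer files are short text files that begin with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def check_model_dir(model_dir):
    """Return a list of problems found in model_dir; an empty list means it looks usable."""
    path = Path(model_dir)
    if not path.is_dir():
        return [f"{path} is not a directory"]

    problems = []
    if not (path / "config.json").is_file():
        problems.append("config.json is missing")

    weights = sorted(path.glob("*.safetensors"))
    if not weights:
        problems.append("no .safetensors files found")

    for weight in weights:
        with open(weight, "rb") as handle:
            head = handle.read(len(LFS_POINTER_PREFIX))
        if head == LFS_POINTER_PREFIX:
            problems.append(f"{weight.name} is a Git LFS pointer, not real weights")
    return problems


if __name__ == "__main__":
    for problem in check_model_dir("models/llm/Llama-3-8B-Instruct-exl2-6.0bpw"):
        print(problem)
```

If pointer files are reported, running git lfs install followed by git lfs pull inside the cloned folder should fetch the actual weights.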
[!TIP] You can add your own llm path to the extra_model_paths.yaml file and put the models there instead.
| Node | Parameter | Description |
| --- | --- | --- |
| Loader | | Loads models from the llm directory. |
| | cache_bits | A lower value reduces VRAM usage, but also affects generation speed and quality. |
| | fast_tensors | Enabling reduces RAM usage and speeds up model loading. |
| | flash_attention | Enabling reduces VRAM usage. Not supported on cards with compute capability below 8.0. |
| | max_seq_len | Maximum context length. Higher values use more VRAM. 0 defaults to the model config. |
| Generator | | Generates text based on the given prompt. Refer to SillyTavern for sampler parameters. |
| | unload | Unloads the model after each generation to reduce VRAM usage. |
| | stop_conditions | List of strings to stop generation on, e.g. ["\n"] to stop on newline. Leave empty to stop only on the EOS token. |
| | max_tokens | Maximum number of new tokens. 0 uses the available context. |
| Previewer | | Displays generated text in the UI. |
| Replacer | | Replaces variable names in brackets, e.g. [a], with their values. |
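To illustrate the kind of substitution the Replacer performs, here is a minimal sketch of bracket-variable replacement. It is not the node's actual code: the function name `replace_variables`, the restriction to word characters inside the brackets, and the choice to leave unknown names untouched are all assumptions made for this example.

```python
import re


def replace_variables(text, values):
    """Replace [name] placeholders in text with values[name].

    Placeholders whose name is not in values are left unchanged
    (an assumption for this sketch, not documented node behavior).
    """

    def substitute(match):
        name = match.group(1)
        if name in values:
            return str(values[name])
        return match.group(0)

    return re.sub(r"\[(\w+)\]", substitute, text)


if __name__ == "__main__":
    template = "Describe [a] in the style of [b]."
    print(replace_variables(template, {"a": "a red fox", "b": "a haiku"}))
```

In a workflow, the resulting string would typically be passed on as the prompt for the Generator.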
An example workflow is embedded in the image below and can be opened in ComfyUI.