ComfyUI ExLlamaV2 Nodes

A simple local text generator for ComfyUI using ExLlamaV2.

Installation

Clone the repository to custom_nodes and install the requirements:

git clone https://github.com/Zuellni/ComfyUI-ExLlama-Nodes custom_nodes/ComfyUI-ExLlamaV2-Nodes
pip install -r custom_nodes/ComfyUI-ExLlamaV2-Nodes/requirements.txt

Use wheels for ExLlamaV2 and FlashAttention on Windows:

pip install exllamav2-X.X.X+cuXXX.torch2.X.X-cp3XX-cp3XX-win_amd64.whl
pip install flash_attn-X.X.X+cuXXX.torch2.X.X-cp3XX-cp3XX-win_amd64.whl

Usage

Only EXL2, 4-bit GPTQ and unquantized models are supported. You can find them on Hugging Face.

To use a model with the nodes, you should clone its repository with git or manually download all the files and place them in models/llm. For example, if you want to download the 6-bit Llama-3-8B-Instruct, use the following command:

git lfs install
git clone https://huggingface.co/turboderp/Llama-3-8B-Instruct-exl2 -b 6.0bpw models/llm/Llama-3-8B-Instruct-exl2-6.0bpw

> [!TIP]
> You can add your own llm path to the extra_model_paths.yaml file and put the models there instead.
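As a sketch of that tip, an entry in ComfyUI's extra_model_paths.yaml could look like the fragment below. The config name and base_path are placeholders for your own setup; adjust them to wherever you keep your models.

```yaml
# Hypothetical extra_model_paths.yaml entry; paths are placeholders.
my_models:
    base_path: /path/to/storage
    llm: llm
```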

Nodes

Loader
Loads models from the llm directory.

| Parameter | Description |
| --- | --- |
| cache_bits | A lower value reduces VRAM usage but also affects generation speed and quality. |
| fast_tensors | Enabling reduces RAM usage and speeds up model loading. |
| flash_attention | Enabling reduces VRAM usage. Not supported on cards with compute capability below 8.0. |
| max_seq_len | Max context; a higher value means higher VRAM usage. 0 defaults to the model config. |

Generator
Generates text based on the given prompt. Refer to SillyTavern for sampler parameters.

| Parameter | Description |
| --- | --- |
| unload | Unloads the model after each generation to reduce VRAM usage. |
| stop_conditions | A list of strings to stop generation on, e.g. ["\n"] to stop on a newline. Leave empty to stop only on the eos token. |
| max_tokens | Max new tokens to generate. 0 uses all available context. |

Previewer
Displays generated text in the UI.

Replacer
Replaces variable names in brackets, e.g. [a], with their values.
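The bracket substitution the Replacer performs can be illustrated with a minimal Python sketch. This is only an illustration of the described behavior, not the node's actual implementation; unknown variable names are left untouched here, which is an assumption.

```python
import re

def replace_vars(text, values):
    # Replace [name]-style placeholders with their values.
    # Names missing from `values` are left as-is (assumed behavior).
    return re.sub(
        r"\[(\w+)\]",
        lambda m: str(values.get(m.group(1), m.group(0))),
        text,
    )

print(replace_vars("A [animal] in the [place].", {"animal": "fox", "place": "forest"}))
# A fox in the forest.
```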

Workflow

An example workflow is embedded in the image below and can be opened in ComfyUI.

[workflow image]

License

MIT License

Copyright (c) 2023 Zuellni

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
