# nano-vllm-cpu

**Repository Path**: linzm1007/nano-vllm-cpu

## Basic Information

- **Project Name**: nano-vllm-cpu
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-01-03
- **Last Updated**: 2026-05-05

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Nano-vLLM-cpu

## Key Features

Nano-vLLM cpu版本

## Installation

```bash
pip install -r requirements.txt
```

## Model Download

To download the model weights manually, use the following command:

```bash
huggingface-cli download --resume-download Qwen/Qwen3-0.6B \
  --local-dir ~/huggingface/Qwen3-0.6B/ \
  --local-dir-use-symlinks False
```

## Quick Start

See `example.py` for usage. The API mirrors vLLM's interface with minor differences in the `LLM.generate` method:
```python
from nanovllm import LLM, SamplingParams

llm = LLM("/YOUR/MODEL/PATH", enforce_eager=True, tensor_parallel_size=1)
sampling_params = SamplingParams(temperature=0.6, max_tokens=256)
prompts = ["Hello, Nano-vLLM."]
outputs = llm.generate(prompts, sampling_params)
outputs[0]["text"]

```
![img.png](img.png)